TY - GEN
T1 - Context-Specific Feature Augmentation for Improving Social Determinants of Health Extraction
AU - Gong, Lei
AU - Shor, Andrey
AU - Zhang, Aidong
AU - Jha, Kishlay
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Social determinants of health (SDoH) factors such as poverty, social environment, and unemployment are known to profoundly impact health outcomes. However, extracting SDoH from electronic health records (EHR) is a challenge due to the unstructured nature of the clinical narratives that encode them. To address this, several approaches ranging from rule-based natural language processing to large language models have been proposed in the literature. Despite significant advances, the existing SDoH extraction approaches are not robust to the noise present in clinical notes or discharge summaries and thus yield unsatisfactory performance. In other words, the noisy information in clinical notes leads to the generation of low-quality feature representations of medical concepts that severely impact the performance of SDoH extraction. In this paper, we propose a novel approach that augments EHR discharge summaries with context-specific semantic knowledge from biomedical literature to generate robust feature representations needed for accurate SDoH extraction. Specifically, our approach identifies key contextual information (e.g., symptoms, diseases, and medications) from EHR discharge summaries and retrieves relevant scientific articles to generate additional semantic context for the SDoH classifier. Moreover, to effectively fuse complementary information from both EHR discharge summaries and biomedical literature, we propose a new feature infusion strategy that adaptively fuses feature representations based on their contextual relevance. Experimental results on the benchmark MIMIC-SDoH dataset demonstrate that the proposed approach significantly outperforms baseline algorithms and highlight the role of context-specific feature augmentation in enhancing the accuracy of SDoH extraction.
KW - electronic health records
KW - feature augmentation
KW - social determinants of health
UR - https://www.scopus.com/pages/publications/85218046752
U2 - 10.1109/BigData62323.2024.10825225
DO - 10.1109/BigData62323.2024.10825225
M3 - Conference contribution
AN - SCOPUS:85218046752
T3 - Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
SP - 1736
EP - 1745
BT - Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
A2 - Ding, Wei
A2 - Lu, Chang-Tien
A2 - Wang, Fusheng
A2 - Di, Liping
A2 - Wu, Kesheng
A2 - Huan, Jun
A2 - Nambiar, Raghu
A2 - Li, Jundong
A2 - Ilievski, Filip
A2 - Baeza-Yates, Ricardo
A2 - Hu, Xiaohua
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Big Data, BigData 2024
Y2 - 15 December 2024 through 18 December 2024
ER -