TY - GEN
T1 - Topic discovery for biomedical corpus using MeSH embeddings
AU - Xun, Guangxu
AU - Jha, Kishlay
AU - Yuan, Ye
AU - Zhang, Aidong
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
AB - Discovering latent topics from biomedical documents has become a pivotal task in many biomedical text mining applications. Medical Subject Headings (MeSH) terms, which are curated by human experts, provide highly precise keyword representations for biomedical documents. However, the performance of conventional topic models on MeSH documents is usually unsatisfactory due to the limited length of individual MeSH documents. In this paper, we propose a novel topic model for MeSH documents using MeSH embeddings. The proposed topic model is able to overcome the lack of context information in MeSH documents by 1) exploiting the rich term-level co-occurrence patterns instead of the sparse document-level co-occurrence patterns, and 2) incorporating additional MeSH semantics in MeSH embeddings learned from a large external biomedical knowledge base. Experimental results on a real-world biomedical dataset show the efficacy of the proposed model in discovering coherent topics from MeSH documents.
KW - Biomedical topic discovery
KW - MeSH embeddings
UR - https://www.scopus.com/pages/publications/85073010927
U2 - 10.1109/BHI.2019.8834559
DO - 10.1109/BHI.2019.8834559
M3 - Conference contribution
AN - SCOPUS:85073010927
T3 - 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Proceedings
BT - 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019
Y2 - 19 May 2019 through 22 May 2019
ER -