TY - GEN
T1 - Concepts-bridges
T2 - 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2018
AU - Jha, Kishlay
AU - Xun, Guangxu
AU - Wang, Yaqing
AU - Gopalakrishnan, Vishrawas
AU - Zhang, Aidong
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/7/19
Y1 - 2018/7/19
N2 - Given two topics of interest (A and C) that are otherwise disconnected - for instance two concepts: a disease ("Migraine") and a therapeutic substance ("Magnesium") - this paper attempts to find the conceptual bridges (e.g., serotonin (B)) that connect them in a meaningful way. This problem of mining implicit linkage is known as hypotheses generation, and its potential to accelerate scientific progress is widely recognized. Almost all prior studies that tackle this problem ignore the temporal dynamics of concepts. This is limiting because the semantic meaning of a concept is known to evolve over time. To overcome this issue, we define the problem as mining time-aware Top-k conceptual bridges, and in doing so provide a systematic approach to formalizing it. Specifically, the proposed model first extracts relevant entities from the corpus, represents them in time-specific latent spaces, and then reasons upon them to generate novel and experimentally testable hypotheses. The key challenge in this approach is to learn a mapping function that encodes the temporal characteristics of concepts and aligns the latent spaces across time. To solve this, we propose an effective algorithm that learns a precise mapping sensitive to both the global and local semantics of the input query. Both qualitative and quantitative evaluations performed on the largest available biomedical corpus substantiate the importance of leveraging temporal dynamics and suggest that the generated hypotheses are novel and worthy of clinical trials.
AB - Given two topics of interest (A and C) that are otherwise disconnected - for instance two concepts: a disease ("Migraine") and a therapeutic substance ("Magnesium") - this paper attempts to find the conceptual bridges (e.g., serotonin (B)) that connect them in a meaningful way. This problem of mining implicit linkage is known as hypotheses generation, and its potential to accelerate scientific progress is widely recognized. Almost all prior studies that tackle this problem ignore the temporal dynamics of concepts. This is limiting because the semantic meaning of a concept is known to evolve over time. To overcome this issue, we define the problem as mining time-aware Top-k conceptual bridges, and in doing so provide a systematic approach to formalizing it. Specifically, the proposed model first extracts relevant entities from the corpus, represents them in time-specific latent spaces, and then reasons upon them to generate novel and experimentally testable hypotheses. The key challenge in this approach is to learn a mapping function that encodes the temporal characteristics of concepts and aligns the latent spaces across time. To solve this, we propose an effective algorithm that learns a precise mapping sensitive to both the global and local semantics of the input query. Both qualitative and quantitative evaluations performed on the largest available biomedical corpus substantiate the importance of leveraging temporal dynamics and suggest that the generated hypotheses are novel and worthy of clinical trials.
KW - Hypotheses generation
KW - Temporal dynamics
KW - Word embeddings
UR - https://www.scopus.com/pages/publications/85051460825
U2 - 10.1145/3219819.3220071
DO - 10.1145/3219819.3220071
M3 - Conference contribution
AN - SCOPUS:85051460825
SN - 9781450355520
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1599
EP - 1607
BT - KDD 2018 - Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 19 August 2018 through 23 August 2018
ER -