TY - GEN
T1 - Automatic identification of ROI in figure images toward improving hybrid (text and image) biomedical document retrieval
AU - You, Daekeun
AU - Antani, Sameer
AU - Demner-Fushman, Dina
AU - Rahman, Md Mahmudur
AU - Govindaraju, Venu
AU - Thoma, George R.
PY - 2011
Y1 - 2011
N2 - Biomedical images are often referenced for clinical decision support (CDS), educational purposes, and research. They appear in specialized databases or in biomedical publications and are not meaningfully retrievable using primarily textbased retrieval systems. The task of automatically finding the images in an article that are most useful for the purpose of determining relevance to a clinical situation is quite challenging. An approach is to automatically annotate images extracted from scientific publications with respect to their usefulness for CDS. As an important step toward achieving the goal, we proposed figure image analysis for localizing pointers (arrows, symbols) to extract regions of interest (ROI) that can then be used to obtain meaningful local image content. Content-based image retrieval (CBIR) techniques can then associate local image ROIs with identified biomedical concepts in figure captions for improved hybrid (text and image) retrieval of biomedical articles. In this work we present methods that make robust our previous Markov random field (MRF)-based approach for pointer recognition and ROI extraction. These include use of Active Shape Models (ASM) to overcome problems in recognizing distorted pointer shapes and a region segmentation method for ROI extraction. We measure the performance of our methods on two criteria: (i) effectiveness in recognizing pointers in images, and (ii) improved document retrieval through use of extracted ROIs. Evaluation on three test sets shows 87% accuracy in the first criterion. Further, the quality of document retrieval using local visual features and text is shown to be better than using visual features alone.
AB - Biomedical images are often referenced for clinical decision support (CDS), educational purposes, and research. They appear in specialized databases or in biomedical publications and are not meaningfully retrievable using primarily textbased retrieval systems. The task of automatically finding the images in an article that are most useful for the purpose of determining relevance to a clinical situation is quite challenging. An approach is to automatically annotate images extracted from scientific publications with respect to their usefulness for CDS. As an important step toward achieving the goal, we proposed figure image analysis for localizing pointers (arrows, symbols) to extract regions of interest (ROI) that can then be used to obtain meaningful local image content. Content-based image retrieval (CBIR) techniques can then associate local image ROIs with identified biomedical concepts in figure captions for improved hybrid (text and image) retrieval of biomedical articles. In this work we present methods that make robust our previous Markov random field (MRF)-based approach for pointer recognition and ROI extraction. These include use of Active Shape Models (ASM) to overcome problems in recognizing distorted pointer shapes and a region segmentation method for ROI extraction. We measure the performance of our methods on two criteria: (i) effectiveness in recognizing pointers in images, and (ii) improved document retrieval through use of extracted ROIs. Evaluation on three test sets shows 87% accuracy in the first criterion. Further, the quality of document retrieval using local visual features and text is shown to be better than using visual features alone.
KW - Active Shape Model
KW - biomedical article retrieval
KW - Biomedical image analysis
KW - content-based image retrieval
KW - image overlay extraction
KW - pointer symbol extraction
UR - https://www.scopus.com/pages/publications/79955722131
U2 - 10.1117/12.873434
DO - 10.1117/12.873434
M3 - Conference contribution
AN - SCOPUS:79955722131
SN - 9780819484116
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Proceedings of SPIE-IS and T Electronic Imaging - Document Recognition and Retrieval XVIII
T2 - Document Recognition and Retrieval XVIII
Y2 - 26 January 2011 through 27 January 2011
ER -