TY - GEN
T1 - Learning visual shape lexicon for document image content recognition
AU - Zhu, Guangyu
AU - Yu, Xiaodong
AU - Li, Yi
AU - Doermann, David
PY - 2008
Y1 - 2008
N2 - Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content categorization using a lexicon of shape features. Each lexical word corresponds to a scale- and rotation-invariant shape feature that is generic enough to be detected repeatably and without segmentation. We learn a concise, structurally indexed shape lexicon from training data by clustering and partitioning feature types through graph cuts. We demonstrate our approach on two challenging document image content recognition problems: 1) classification of 4,500 Web images crawled from Google Image Search into three content categories (pure image, image with text, and document image), and 2) language identification for eight languages (Arabic, Chinese, English, Hindi, Japanese, Korean, Russian, and Thai) on a database of 1,512 complex document images composed of mixed machine-printed text and handwriting. Our approach can handle high intra-class variability and shows results that exceed other state-of-the-art approaches, allowing it to be used as a content recognizer in image indexing and retrieval systems.
AB - Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content categorization using a lexicon of shape features. Each lexical word corresponds to a scale- and rotation-invariant shape feature that is generic enough to be detected repeatably and without segmentation. We learn a concise, structurally indexed shape lexicon from training data by clustering and partitioning feature types through graph cuts. We demonstrate our approach on two challenging document image content recognition problems: 1) classification of 4,500 Web images crawled from Google Image Search into three content categories (pure image, image with text, and document image), and 2) language identification for eight languages (Arabic, Chinese, English, Hindi, Japanese, Korean, Russian, and Thai) on a database of 1,512 complex document images composed of mixed machine-printed text and handwriting. Our approach can handle high intra-class variability and shows results that exceed other state-of-the-art approaches, allowing it to be used as a content recognizer in image indexing and retrieval systems.
UR - https://www.scopus.com/pages/publications/56749096300
U2 - 10.1007/978-3-540-88688-4_55
DO - 10.1007/978-3-540-88688-4_55
M3 - Conference contribution
AN - SCOPUS:56749096300
SN - 3540886850
SN - 9783540886853
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 745
EP - 758
BT - Computer Vision - ECCV 2008 - 10th European Conference on Computer Vision, Proceedings
PB - Springer Verlag
T2 - 10th European Conference on Computer Vision, ECCV 2008
Y2 - 12 October 2008 through 18 October 2008
ER -