Multilingual word spotting in offline handwritten documents

  • SUNY Buffalo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

In this work, we propose a novel multilingual word spotting framework based on Hidden Markov Models (HMMs) that works on a corpus of multilingual handwritten documents and on documents that contain more than one handwritten script. The system handles large multilingual vocabularies without the need for word or character segmentation. A keyword is represented by concatenating its character models. We propose and compare two systems: a script identifier based (IDB) system and a script identifier free (IDF) system. IDB applies an HMM-based script identifier before spotting a keyword, whereas IDF performs the spotting without script identification. The system is evaluated on a mixed corpus of public datasets from several scripts (IAM for English, AMA for Arabic, and LAW for Devanagari) and on a synthetic dataset generated by concatenating words and lines from different scripts in a document image.
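
The abstract's statement that a keyword is represented by concatenating its character models can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the left-to-right topology without skips, the three states per character, the self-loop probability of 0.6, and the helper names `char_model`, `keyword_model`, and `transition_matrix` are all hypothetical. Emission models and decoding are omitted.

```python
from typing import Dict, List, Tuple

# A state is (label, self-loop probability). Hypothetical representation,
# chosen only to keep the sketch small; emissions are left out entirely.
State = Tuple[str, float]


def char_model(char: str, n_states: int = 3, stay: float = 0.6) -> List[State]:
    """Build a toy left-to-right character model (assumed topology)."""
    return [(f"{char}_{i}", stay) for i in range(n_states)]


def keyword_model(word: str, models: Dict[str, List[State]]) -> List[State]:
    """Concatenate the character models of `word` into one state chain."""
    states: List[State] = []
    for ch in word:
        states.extend(models[ch])
    return states


def transition_matrix(states: List[State]) -> List[List[float]]:
    """Return an n x (n + 1) matrix; the extra last column is the exit state.

    Each state either stays where it is or moves to the next state, so the
    last state of one character links directly to the first state of the next.
    """
    n = len(states)
    matrix = [[0.0] * (n + 1) for _ in range(n)]
    for i, (_, stay) in enumerate(states):
        matrix[i][i] = stay
        matrix[i][i + 1] = 1.0 - stay
    return matrix


# Usage: one toy model per character, then the keyword "cat".
models = {ch: char_model(ch) for ch in "cat"}
keyword = keyword_model("cat", models)
A = transition_matrix(keyword)
```

Because the keyword model is assembled from character models, a new keyword needs no word-level training data in this sketch, only models for the characters it contains.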

Original language: English
Title of host publication: ICPR 2012 - 21st International Conference on Pattern Recognition
Pages: 310-313
Number of pages: 4
State: Published - 2012
Event: 21st International Conference on Pattern Recognition, ICPR 2012 - Tsukuba, Japan
Duration: Nov 11 2012 → Nov 15 2012

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (Print): 1051-4651

Conference

Conference: 21st International Conference on Pattern Recognition, ICPR 2012
Country/Territory: Japan
City: Tsukuba
Period: 11/11/12 → 11/15/12
