Identifying script on word-level with informational confidence

  • University of Maryland, College Park

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

54 Scopus citations

Abstract

In this paper, we present a multiple classifier system for script identification. Applying a Gabor filter analysis of textures on word-level, our system identifies Latin and non-Latin words in bilingual printed documents. The classifier system comprises four different architectures based on nearest neighbors, weighted Euclidean distances, Gaussian mixture models, and support vector machines. We report results for Arabic, Chinese, Hindi, and Korean script. Moreover, we show that combining informational confidence values using sum-rule can consistently outperform the best single recognition rate.
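The sum-rule combination mentioned in the abstract can be illustrated with a minimal sketch. It assumes each classifier emits one confidence value per script label for a given word image; the function name `sum_rule`, the labels, and the numbers are hypothetical and not taken from the paper, and the sketch does not show how the paper derives informational confidence values from raw classifier outputs.

```python
from typing import Dict, List


def sum_rule(confidences: List[Dict[str, float]]) -> str:
    """Return the label with the largest summed confidence across classifiers."""
    totals: Dict[str, float] = {}
    for scores in confidences:
        for label, value in scores.items():
            # Accumulate each classifier's confidence for this label.
            totals[label] = totals.get(label, 0.0) + value
    return max(totals, key=totals.get)


# Hypothetical confidence values from four classifiers for one word image
# (illustrative numbers only, not results from the paper).
example = [
    {"Latin": 0.60, "non-Latin": 0.40},  # e.g. nearest neighbors
    {"Latin": 0.30, "non-Latin": 0.70},  # e.g. weighted Euclidean distance
    {"Latin": 0.80, "non-Latin": 0.20},  # e.g. Gaussian mixture model
    {"Latin": 0.55, "non-Latin": 0.45},  # e.g. support vector machine
]

decision = sum_rule(example)
print(decision)
```

In this toy input one classifier prefers "non-Latin", but the summed confidences favor "Latin", which is the kind of disagreement a combination rule is meant to resolve.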

Original language: English
Title of host publication: Proceedings of the Eighth International Conference on Document Analysis and Recognition
Pages: 416-420
Number of pages: 5
State: Published - 2005
Event: 8th International Conference on Document Analysis and Recognition - Seoul, Korea, Republic of
Duration: Aug 31 2005 to Sep 1 2005

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Volume: 2005
ISSN (Print): 1520-5363

Conference

Conference: 8th International Conference on Document Analysis and Recognition
Country/Territory: Korea, Republic of
City: Seoul
Period: 08/31/05 to 09/1/05
