TY - GEN
T1 - Identifying script on word-level with informational confidence
AU - Jaeger, Stefan
AU - Ma, Huanfeng
AU - Doermann, David
PY - 2005
Y1 - 2005
N2 - In this paper, we present a multiple classifier system for script identification. Applying a Gabor filter analysis of textures on word-level, our system identifies Latin and non-Latin words in bilingual printed documents. The classifier system comprises four different architectures based on nearest neighbors, weighted Euclidean distances, Gaussian mixture models, and support vector machines. We report results for Arabic, Chinese, Hindi, and Korean script. Moreover, we show that combining informational confidence values using sum-rule can consistently outperform the best single recognition rate.
UR - https://www.scopus.com/pages/publications/33947363973
U2 - 10.1109/ICDAR.2005.134
DO - 10.1109/ICDAR.2005.134
M3 - Conference contribution
AN - SCOPUS:33947363973
SN - 0769524206
SN - 9780769524207
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 416
EP - 420
BT - Proceedings of the Eighth International Conference on Document Analysis and Recognition
T2 - 8th International Conference on Document Analysis and Recognition
Y2 - 31 August 2005 through 1 September 2005
ER -