
Historical document image segmentation using background light intensity normalization

  • SUNY Buffalo

Research output: Contribution to journal › Conference article › peer-review

17 Scopus citations

Abstract

This paper presents a new document binarization algorithm for camera images of historical documents, which are found especially in the Library of Congress of the United States. The algorithm applies background light intensity normalization to enhance an image before a local adaptive binarization algorithm is applied. The normalization step uses an adaptive linear or non-linear function to approximate the uneven background of the image, which results from the uneven surface of the document paper, aged color, or the uneven light source of the cameras used for image lifting. Our algorithm adaptively captures the background of a document image with a "best fit" approximation. The document image is then normalized with respect to the approximation before a thresholding algorithm is applied. The technique works for both grayscale and color historical handwritten document images, with significant improvement in readability for both human readers and OCR.
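
The pipeline described in the abstract (approximate the background, normalize against it, then threshold locally) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a grayscale input, uses only the linear case (a least-squares plane) as the "best fit" background, and substitutes a simple local-mean threshold for the unspecified local adaptive binarization step. The function names and the parameters `n_iter`, `window`, and `offset` are hypothetical choices made for illustration.

```python
import numpy as np


def fit_background_plane(gray, mask):
    """Least-squares plane a*x + b*y + c fitted to the pixels selected by mask."""
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([xs[mask], ys[mask], np.ones(int(mask.sum()))])
    coeffs, *_ = np.linalg.lstsq(A, gray[mask], rcond=None)
    a, b, c = coeffs
    return a * xs + b * ys + c


def normalize_background(gray, n_iter=3):
    """Divide the image by a plane fitted to its likely background pixels.

    Assumption: ink is darker than paper, so pixels far below the current
    plane are treated as foreground and excluded from the next fit.
    """
    gray = np.asarray(gray, dtype=float)
    mask = np.ones(gray.shape, dtype=bool)
    for _ in range(n_iter):
        plane = fit_background_plane(gray, mask)
        resid = gray - plane
        margin = max(resid[mask].std(), 1e-9)
        mask = resid >= -margin
    plane = fit_background_plane(gray, mask)
    # Background pixels end up near 1.0; ink falls well below 1.0.
    return gray / np.maximum(plane, 1e-6)


def local_mean_threshold(img, window=15, offset=0.05):
    """Mark a pixel as ink (True) if it is darker than its local mean minus offset.

    window must be odd. Local sums are computed with an integral image.
    """
    pad = window // 2
    padded = np.pad(img, pad, mode="edge")
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant")
    integral = integral.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    total = (integral[window:window + h, window:window + w]
             - integral[:h, window:window + w]
             - integral[window:window + h, :w]
             + integral[:h, :w])
    local_mean = total / (window * window)
    return img < local_mean - offset


def binarize(gray):
    """Normalize the background, then apply the local threshold."""
    return local_mean_threshold(normalize_background(gray))
```

Calling `binarize(gray)` on a 2-D array of intensities returns a boolean mask in which `True` marks likely ink. A color image would first need to be reduced to one channel, or processed per channel, and a strongly curved page would call for a non-linear background model in place of the plane.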

Original language: English
Article number: 19
Pages (from-to): 167-174
Number of pages: 8
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 5676
DOIs
State: Published - 2005
Event: Proceedings of SPIE-IS and T Electronic Imaging - Document Recognition and Retrieval XII - San Jose, CA, United States
Duration: Jan 19 2005 → Jan 20 2005

Keywords

  • Character recognition
  • Document analysis
  • Historical handwritten document image
  • Image segmentation
