Abstract
This paper presents a new document binarization algorithm for camera images of historical documents, such as those held in the Library of Congress of the United States. The algorithm first enhances the image with a background light intensity normalization step and then applies a local adaptive binarization algorithm. The normalization step uses an adaptive linear or non-linear function to approximate the uneven background of the image, which results from the uneven surface of the document paper, aged color, or the uneven light source of the cameras used for image lifting. Our algorithm adaptively captures the background of a document image with a "best fit" approximation. The document image is then normalized with respect to this approximation before a thresholding algorithm is applied. The technique works for both grayscale and color images of historical handwritten documents and significantly improves readability for both human readers and OCR.
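The pipeline described in the abstract (approximate the background, normalize against it, then threshold locally) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the simplest linear case, a least-squares plane fitted to every pixel of a grayscale image, and it swaps in a plain local-mean threshold for the paper's local adaptive binarization. The function names (`fit_plane`, `normalize`, `local_threshold`, `binarize`) and the parameters `target`, `radius`, and `k` are hypothetical and chosen only for illustration.

```python
def fit_plane(img):
    """Least-squares fit of z = a*x + b*y + c to a 2-D list of gray levels.

    Assumption: the background is roughly planar. Fitting to every pixel
    (ink included) is a simplification, not the paper's "best fit" procedure.
    """
    h, w = len(img), len(img[0])
    mx, my = (w - 1) / 2, (h - 1) / 2          # centre of the pixel grid
    mz = sum(map(sum, img)) / (h * w)          # mean gray level
    # On a full rectangular grid the centred x and y coordinates are
    # orthogonal, so the two slopes can be solved independently.
    sxx = h * sum((x - mx) ** 2 for x in range(w))
    syy = w * sum((y - my) ** 2 for y in range(h))
    sxz = sum((x - mx) * img[y][x] for y in range(h) for x in range(w))
    syz = sum((y - my) * img[y][x] for y in range(h) for x in range(w))
    a = sxz / sxx if sxx else 0.0
    b = syz / syy if syy else 0.0
    return a, b, mz - a * mx - b * my


def normalize(img, target=255.0):
    """Divide each pixel by the fitted background and rescale to `target`."""
    a, b, c = fit_plane(img)
    return [
        [min(target, target * v / max(a * x + b * y + c, 1.0))
         for x, v in enumerate(row)]
        for y, row in enumerate(img)
    ]


def local_threshold(img, radius=1, k=0.9):
    """Mark a pixel as ink (1) if it is darker than k times its local mean."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(1 if img[y][x] < k * sum(window) / len(window) else 0)
        out.append(row)
    return out


def binarize(img):
    """Normalize the background, then apply the local threshold."""
    return local_threshold(normalize(img))
```

Calling `binarize` on a grayscale image stored as a list of rows returns a same-sized grid of 0 (background) and 1 (ink). A non-linear background, which the abstract also mentions, would need a higher-order surface in place of the plane; that case is not covered by this sketch.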
| Original language | English |
|---|---|
| Article number | 19 |
| Pages (from-to) | 167-174 |
| Number of pages | 8 |
| Journal | Proceedings of SPIE - The International Society for Optical Engineering |
| Volume | 5676 |
| State | Published - 2005 |
| Event | Proceedings of SPIE-IS&T Electronic Imaging - Document Recognition and Retrieval XII - San Jose, CA, United States; Duration: Jan 19 2005 → Jan 20 2005 |
Keywords
- Character recognition
- Document analysis
- Historical handwritten document image
- Image segmentation
Title
Historical document image segmentation using background light intensity normalization