TY - GEN
T1 - Image enhancement for degraded binary document images
AU - Shi, Zhixin
AU - Setlur, Srirangaraj
AU - Govindaraju, Venu
PY - 2011
Y1 - 2011
N2 - This paper presents a novel set of image enhancement algorithms for binary images of poorly scanned real-world page documents. The targeted problems include large blobs or clutter noise, salt-and-pepper noise, and non-text objects such as form lines or rule-lines that must be detected and removed. The algorithms are shown to be very effective in removing clutter noise and pepper noise as well as form lines and rule-lines. A region growing algorithm is also described that enhances the quality of the text and repairs the damage caused by salt noise, which leaves holes in the text and creates broken strokes. The methods were tested on 204 images from the challenge set of the DARPA MADCAT Arabic handwritten document image data. The results indicate that the methods are robust and capable of significantly improving image quality for downstream OCR systems.
UR - https://www.scopus.com/pages/publications/82355186257
U2 - 10.1109/ICDAR.2011.305
DO - 10.1109/ICDAR.2011.305
M3 - Conference contribution
AN - SCOPUS:82355186257
SN - 9780769545202
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 895
EP - 899
BT - Proceedings - 11th International Conference on Document Analysis and Recognition, ICDAR 2011
T2 - 11th International Conference on Document Analysis and Recognition, ICDAR 2011
Y2 - 18 September 2011 through 21 September 2011
ER -