Clutter noise removal in binary document images

  • University of Maryland, College Park

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

The paper presents a clutter detection and removal algorithm for complex document images. This distance-transform-based technique aims to remove irregular, independent, unwanted clutter while preserving the text content. The novelty of the approach lies in how it approximates the clutter-content boundary when the clutter is attached to the content in irregular ways. As an intermediate step, a residual image is created, which forms the basis for clutter detection and removal. Clutter detection and removal are independent of the clutter's position, size, shape, and connectivity with text. The method is tested on a collection of highly degraded and noisy machine-printed and handwritten Arabic and English documents, and the results show pixel-level accuracies of 99.18% for clutter detection and 98.67% for clutter removal. The approach is also extended to documents containing a mix of clutter and salt-and-pepper noise.
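The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea behind distance-transform-based clutter removal, not the authors' method. It assumes that clutter blobs are much thicker than text strokes: pixels deep inside the foreground are taken as a clutter "core", and the core is then grown back to approximate the clutter-content boundary. The function names, the `min_half_width` threshold, the chessboard metric, and the accuracy definition are all illustrative assumptions, and the paper's residual image is not reproduced here.

```python
def chessboard_distance(mask):
    """Two-pass chessboard distance from every 1-pixel to the nearest 0-pixel.

    Pixels outside the image are not treated as 0, so a fully set mask
    keeps the cap value h + w everywhere.
    """
    h, w = len(mask), len(mask[0])
    cap = h + w  # larger than any chessboard distance inside the image
    d = [[cap if mask[y][x] else 0 for x in range(w)] for y in range(h)]

    def relax(y, x, offsets):
        for dy, dx in offsets:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                d[y][x] = min(d[y][x], d[ny][nx] + 1)

    # Forward pass: propagate distances from the top-left.
    for y in range(h):
        for x in range(w):
            relax(y, x, ((-1, -1), (-1, 0), (-1, 1), (0, -1)))
    # Backward pass: propagate distances from the bottom-right.
    for y in reversed(range(h)):
        for x in reversed(range(w)):
            relax(y, x, ((1, 1), (1, 0), (1, -1), (0, 1)))
    return d


def remove_clutter(img, min_half_width):
    """Split a binary image (1 = ink) into a clutter mask and a cleaned image.

    min_half_width is a hypothetical threshold: foreground pixels at least
    this far from the background are assumed to belong to clutter.
    """
    h, w = len(img), len(img[0])
    depth = chessboard_distance(img)
    # Mark core pixels as 0 so the second transform measures distance to the core.
    not_core = [[0 if v >= min_half_width else 1 for v in row] for row in depth]
    to_core = chessboard_distance(not_core)
    # Grow the core back by (min_half_width - 1) pixels, restricted to ink.
    clutter = [[1 if img[y][x] and to_core[y][x] < min_half_width else 0
                for x in range(w)] for y in range(h)]
    cleaned = [[img[y][x] - clutter[y][x] for x in range(w)] for y in range(h)]
    return clutter, cleaned


def pixel_accuracy(predicted, truth):
    """Fraction of pixels on which two equally sized binary images agree
    (an assumed definition; the abstract does not define its metric)."""
    total = len(truth) * len(truth[0])
    same = sum(p == t for pr, tr in zip(predicted, truth) for p, t in zip(pr, tr))
    return same / total


# Toy page: a 7x7 clutter blob with a one-pixel-thick stroke attached to it.
img = [[0] * 16 for _ in range(9)]
for y in range(1, 8):
    for x in range(1, 8):
        img[y][x] = 1
for x in range(8, 15):
    img[4][x] = 1

clutter, cleaned = remove_clutter(img, min_half_width=3)
```

On this toy input the thin stroke survives while the blob it touches is removed. Real documents would need a threshold chosen from the stroke width, and a square-friendly metric such as the chessboard distance will approximate irregular clutter boundaries only roughly.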

Original language: English
Pages (from-to): 351-369
Number of pages: 19
Journal: International Journal on Document Analysis and Recognition
Volume: 16
Issue number: 4
State: Published - Dec 2013

Keywords

  • Clutter removal
  • Image enhancement
  • Margin removal
  • Noise border removal
  • Pixel-based noise removal
