Abstract
Document images belong to a unique class of images where the information is embedded in the language represented by a series of symbols on the page rather than in the visual objects themselves. Since these symbols tend to appear repeatedly, a domain-specific image coding strategy can be designed to facilitate enhanced compression and retrieval. In this paper we describe a coding methodology that not only exploits component-level redundancy to reduce code length but also supports efficient data access. The approach identifies and organizes symbol patterns which appear repeatedly. Similar components are represented by a single prototype stored in a library and the location of each component instance is coded along with the residual between it and its prototype. A representation is built which provides a natural information index allowing access to individual components. Compression results are competitive and compressed-domain access is superior to competing methods. Applications to network-related problems have been considered, and show promising results.
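The prototype-plus-residual representation described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the greedy matching rule, the Hamming-distance threshold `max_diff`, the XOR residual, and every name below (`encode`, `decode_instance`, `find_instances`) are assumptions chosen for illustration. Components are taken to be small binary bitmaps, given as tuples of rows, that have already been segmented from the page.

```python
from dataclasses import dataclass, field


def same_shape(a, b):
    """True when two bitmaps have identical height and width."""
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def hamming(a, b):
    """Count the pixels that differ between two same-shaped bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def xor(a, b):
    """Pixel-wise XOR; used both to form a residual and to undo it."""
    return tuple(tuple(pa ^ pb for pa, pb in zip(ra, rb)) for ra, rb in zip(a, b))


@dataclass
class Encoded:
    library: list = field(default_factory=list)    # one prototype bitmap per class
    instances: list = field(default_factory=list)  # (x, y, prototype_id, residual)


def encode(components, max_diff=2):
    """Greedy matching (an assumed rule): reuse the first prototype that is
    within max_diff pixels, otherwise add the component as a new prototype."""
    enc = Encoded()
    for x, y, bitmap in components:
        match = None
        for pid, proto in enumerate(enc.library):
            if same_shape(proto, bitmap) and hamming(proto, bitmap) <= max_diff:
                match = pid
                break
        if match is None:
            enc.library.append(bitmap)
            match = len(enc.library) - 1
        residual = xor(enc.library[match], bitmap)
        enc.instances.append((x, y, match, residual))
    return enc


def decode_instance(enc, index):
    """Rebuild one component exactly without touching any other instance."""
    x, y, pid, residual = enc.instances[index]
    return x, y, xor(enc.library[pid], residual)


def find_instances(enc, pid):
    """Index-style access: every page location where a prototype occurs."""
    return [(x, y) for x, y, p, _ in enc.instances if p == pid]
```

Because each instance carries a prototype identifier, a query such as `find_instances(enc, 0)` can be answered from the coded form alone, which is the kind of component-level access the abstract refers to. A real coder would also entropy-code the locations and residuals; that step is omitted here.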
| Original language | English |
|---|---|
| Pages (from-to) | 121-135 |
| Number of pages | 15 |
| Journal | Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology |
| Volume | 20 |
| Issue number | 1-2 |
| State | Published - 1998 |