Abstract
In this paper we present IBM-UB-1, a new dual-mode, twin-folio structured English handwriting dataset. IBM-UB-1 is our first major release from a large multilingual handwriting corpus. Containing over 6000 pages of handwritten matter, the dataset can be used for unconstrained handwriting recognition; more importantly, its unique twin-folio structure is a natural fit for research on writer identification, keyword spotting, indexing, and various forms of handwritten document search and retrieval. We first describe two central characteristics of the dataset, the twin-folio structure and dual modality (online/offline), and their relevance to current research problems. Second, we describe the dataset, its collection and construction, and provide key descriptive statistics. Finally, we evaluate the dataset in two research domains, handwriting recognition and writer identification, and present the related experimental results.
| Original language | English |
|---|---|
| Article number | 6628577 |
| Pages (from-to) | 13-17 |
| Number of pages | 5 |
| Journal | Proceedings of the International Conference on Document Analysis and Recognition, ICDAR |
| State | Published - 2013 |
| Event | 12th International Conference on Document Analysis and Recognition, ICDAR 2013, Washington, DC, United States. Duration: Aug 25 2013 → Aug 28 2013 |
Keywords
- Dataset
- Dual mode
- English online handwriting dataset
- Handwriting recognition
- Offline-online
- offline/online
- Online handwriting dataset
- twin-folio
- Unconstrained handwriting dataset
- Writer Identification
Fingerprint
Dive into the research topics of 'IBM-UB-1: A dual mode unconstrained english handwriting dataset'. Together they form a unique fingerprint.