
IBM-UB-1: A dual mode unconstrained English handwriting dataset

  • SUNY Buffalo

Research output: Contribution to journal › Conference article › peer-review

29 Scopus citations

Abstract

In this paper we present IBM-UB-1, a new dual mode, twin-folio structured English handwriting dataset. IBM-UB-1 is our first major release from a large multilingual handwriting corpus. Containing over 6000 pages of handwritten matter, the dataset can be used for unconstrained handwriting recognition; more importantly, its unique twin-folio structure is a natural fit for research on writer identification, keyword spotting, indexing, and various forms of handwritten document search and retrieval. First, we describe two central characteristics of the dataset, the twin-folio structure and dual modality (online/offline), and their relevance to current research problems. Second, we describe the dataset, its collection and construction, and provide key descriptive statistics. Finally, we evaluate the dataset in two research domains, handwriting recognition and writer identification, and present the corresponding experimental results.

Original language: English
Article number: 6628577
Pages (from-to): 13-17
Number of pages: 5
Journal: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
State: Published - 2013
Event: 12th International Conference on Document Analysis and Recognition, ICDAR 2013 - Washington, DC, United States
Duration: Aug 25 2013 to Aug 28 2013

Keywords

  • Dataset
  • Dual mode
  • English online handwriting dataset
  • Handwriting recognition
  • Offline-online
  • offline/online
  • Online handwriting dataset
  • twin-folio
  • Unconstrained handwriting dataset
  • Writer Identification
