Publication | Closed Access

CASIA Online and Offline Chinese Handwriting Databases

Citations: 507

References: 12

Year: 2011

TLDR

The paper introduces paired online and offline Chinese handwriting databases comprising isolated characters and handwritten texts. The samples were collected from 1,020 writers using an Anoto pen, and each database (online and offline) is divided into six datasets (three for isolated characters and three for handwritten texts) that are segmented, annotated at the character level, and split into standard training and test sets. For each modality, the datasets provide roughly 3.9 million isolated‑character samples across 7,356 classes and about 1.35 million character samples from about 5,090 pages of handwritten text, supporting research on a range of handwritten document analysis tasks.

Abstract

This paper introduces a pair of online and offline Chinese handwriting databases containing samples of isolated characters and handwritten texts. The samples were produced by 1,020 writers using an Anoto pen on paper, which yields both online trajectory data and offline images. The online and offline samples are each divided into six datasets: three for isolated characters (DB1.0–1.2) and three for handwritten texts (DB2.0–2.2). For either modality, the isolated-character datasets contain about 3.9 million samples of 7,356 classes (7,185 Chinese characters and 171 symbols), and the handwritten-text datasets contain about 5,090 pages and 1.35 million character samples. Each dataset is segmented and annotated at the character level and is partitioned into standard training and test subsets. The online and offline databases can be used for research on various handwritten document analysis tasks.
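
As a rough sanity check on the figures quoted in the abstract, the short Python sketch below recomputes the class total and two naive averages. The variable names are illustrative, the inputs are the rounded values the abstract gives ("about" 3.9 million, 5,090, and 1.35 million), and the averages assume nothing about how samples are actually distributed across classes or pages.

```python
# Figures as quoted in the abstract; values marked "about" there are rounded.
CHINESE_CHARACTERS = 7_185
SYMBOLS = 171
ISOLATED_SAMPLES = 3_900_000   # "about 3.9 million" isolated-character samples
TEXT_PAGES = 5_090             # "about 5,090 pages" of handwritten text
TEXT_CHARACTERS = 1_350_000    # "about 1.35 million" character samples in texts

# The class count should equal Chinese characters plus symbols.
total_classes = CHINESE_CHARACTERS + SYMBOLS

# Naive averages; the real per-class and per-page counts will vary.
samples_per_class = ISOLATED_SAMPLES / total_classes
characters_per_page = TEXT_CHARACTERS / TEXT_PAGES

print(f"classes: {total_classes}")
print(f"average samples per class: {samples_per_class:.0f}")
print(f"average characters per page: {characters_per_page:.0f}")
```

The class total matches the 7,356 stated in the abstract, and the averages come out to roughly 530 samples per class and roughly 265 characters per page.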