Publication | Open Access
A Survey of Handwritten Character Recognition with MNIST and EMNIST
Citations: 263 | References: 58 | Year: 2019
Convolutional Neural Network, Engineering, Machine Learning, Biometrics, Representation Learning, Handwritten Digit Recognition, Image Classification, Image Analysis, Data Science, Pattern Recognition, MNIST Dataset, Text Recognition, Character Recognition, Data Augmentation, Machine Vision, Optical Character Recognition, Computer Science, Statistical Pattern Recognition, Deep Learning, Computer Vision, Deep Neural Networks, Handwritten Character Recognition, Limited Data Learning, Pattern Recognition Application
MNIST has long been a benchmark for computer‑vision methods, especially convolutional neural networks, and in 2017 the related EMNIST dataset was released to include both digits and letters with more data. This paper provides the first exhaustive, up‑to‑date survey of state‑of‑the‑art results on MNIST and explains EMNIST while surveying its results. The review categorizes studies by whether they use data augmentation or the raw dataset, and further separates CNN‑based work from other approaches, providing separate summaries for each category. Recent studies achieve test error rates below 1% on MNIST, indicating the problem has become largely non‑challenging.
This paper summarizes the top state-of-the-art contributions reported on the MNIST dataset for handwritten digit recognition. This dataset has been extensively used to validate novel techniques in computer vision, and in recent years, many authors have explored the performance of convolutional neural networks (CNNs) and other deep learning techniques on it. To the best of our knowledge, this paper is the first exhaustive and up-to-date review of this dataset; there are some online rankings, but they are outdated, and most published papers survey only closely related works, omitting most of the literature. This paper distinguishes between works that use some kind of data augmentation and works that use the original dataset out of the box. Works using CNNs are also reported separately, as they are becoming the state-of-the-art approach for solving this problem. Nowadays, a significant number of works have attained a test error rate smaller than 1% on this dataset, which is therefore becoming non-challenging. In mid-2017, a new dataset was introduced: EMNIST, which involves both digits and letters, with a larger amount of data acquired from a database different from MNIST’s. In this paper, EMNIST is explained and some results are surveyed.