Concepedia

Publication | Open Access

The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions

Citations: 2.9K

References: 25

Year: 2018

Unknown Author(s)
Scientific Data

TLDR

Neural networks for automated diagnosis of pigmented skin lesions are hard to train because available dermatoscopic image datasets are small and lack diversity. To address this, the authors release HAM10000, a set of 10,015 dermatoscopic images collected from different populations and acquired and stored by different modalities. Handling that diversity required varied acquisition and cleaning methods and semi-automatic workflows built on specifically trained neural networks. The dataset is publicly available through the ISIC archive as a training set for academic machine learning and covers all important diagnostic categories of pigmented lesions. More than 50% of lesions are confirmed by pathology; the rest rely on follow-up, expert consensus, or in-vivo confocal microscopy. The dataset is intended as a benchmark for machine learning and for comparisons with human experts.

Abstract

Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We tackle this problem by releasing the HAM10000 ("Human Against Machine with 10000 training images") dataset. We collected dermatoscopic images from different populations acquired and stored by different modalities. Given this diversity we had to apply different acquisition and cleaning methods and developed semi-automatic workflows utilizing specifically trained neural networks. The final dataset consists of 10015 dermatoscopic images which are released as a training set for academic machine learning purposes and are publicly available through the ISIC archive. This benchmark dataset can be used for machine learning and for comparisons with human experts. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions. More than 50% of lesions have been confirmed by pathology, while the ground truth for the rest of the cases was either follow-up, expert consensus, or confirmation by in-vivo confocal microscopy.
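The abstract reports ground truth per lesion rather than per image, and a lesion may appear in more than one image. A minimal sketch of how such a per-lesion breakdown could be tallied from a metadata table follows. The column names (`lesion_id`, `image_id`, `dx_type`), the label values, and the sample rows are all assumptions made up for illustration; they are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; not real dataset records.
# Column names and label values are assumed, not confirmed by the source.
SAMPLE_CSV = """lesion_id,image_id,dx_type
L1,IMG1,histo
L1,IMG2,histo
L2,IMG3,follow_up
L3,IMG4,consensus
L4,IMG5,histo
L5,IMG6,confocal
L6,IMG7,histo
"""


def ground_truth_shares(csv_text):
    """Return the fraction of lesions per ground-truth type."""
    per_lesion = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep the first label seen for each lesion, so several
        # images of the same lesion are counted only once.
        per_lesion.setdefault(row["lesion_id"], row["dx_type"])
    counts = Counter(per_lesion.values())
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


shares = ground_truth_shares(SAMPLE_CSV)
```

Counting per lesion instead of per image matters when checking a claim such as "more than 50% of lesions have been confirmed by pathology", because repeated images of one lesion would otherwise inflate that lesion's ground-truth type.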
