Structured crowdsourcing enables convolutional segmentation of histology images

TLDR

While deep-learning algorithms excel at semantic image segmentation, they require large annotated datasets to achieve high accuracy. The authors recruited 25 participants, ranging from senior pathologists to medical students, to delineate tissue regions in 151 breast cancer slides using the Digital Slide Archive. Inter-participant discordance was low for tumor and stroma but higher for more subjectively defined or rare tissue classes. Feedback from senior participants enabled the generation and curation of more than 20 000 annotated tissue regions. Fully convolutional networks trained on these annotations reached a mean AUC of 0.945, and the scale of the annotation data notably improved image classification accuracy. The dataset is publicly available.

Abstract

While deep-learning algorithms have demonstrated outstanding performance in semantic image segmentation tasks, large annotation datasets are needed to create accurate models. Annotation of histology images is challenging due to the effort and experience required to carefully delineate tissue structures, and difficulties related to sharing and markup of whole-slide images. We recruited 25 participants, ranging in experience from senior pathologists to medical students, to delineate tissue regions in 151 breast cancer slides using the Digital Slide Archive. Inter-participant discordance was systematically evaluated, revealing low discordance for tumor and stroma, and higher discordance for more subjectively defined or rare tissue classes. Feedback provided by senior participants enabled the generation and curation of 20 000+ annotated tissue regions. Fully convolutional networks trained using these annotations were highly accurate (mean AUC=0.945), and the scale of annotation data provided notable improvements in image classification accuracy.

The dataset is freely available at: https://goo.gl/cNM4EL. Supplementary data are available at Bioinformatics online.
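
The abstract reports a mean AUC but does not say how it was averaged. As a rough illustration only, the sketch below assumes an unweighted mean of per-class, one-vs-rest AUCs computed over labeled samples (for example, pixels). The function names, class names, and scores are hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_one_vs_rest_auc(
    true_classes: Sequence[str], class_scores: Dict[str, List[float]]
) -> float:
    """Unweighted mean of one-vs-rest AUCs (an assumed averaging scheme)."""
    aucs = []
    for cls, scores in class_scores.items():
        # Treat the current class as positive and every other class as negative.
        labels = [1 if t == cls else 0 for t in true_classes]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Toy example with made-up labels and predicted class probabilities.
truth = ["tumor", "stroma", "tumor", "stroma"]
probabilities = {
    "tumor": [0.9, 0.1, 0.8, 0.2],
    "stroma": [0.1, 0.9, 0.2, 0.8],
}
print(mean_one_vs_rest_auc(truth, probabilities))
```

The pairwise comparison is quadratic in the number of samples, so it is suitable only for small illustrations; a rank-based computation would be the usual choice at the scale of whole-slide images.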
