The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes

TLDR

Vision‑based semantic segmentation of urban scenes is essential for autonomous driving, yet it requires large, pixel‑level annotated datasets that are costly to produce. This study proposes generating realistic synthetic images with automatic pixel‑level annotations from a virtual world to assess their utility for training deep convolutional neural networks. The authors created the SYNTHIA dataset, a diverse collection of synthetic urban images with automatically generated annotations, and combined it with publicly available real‑world annotated images for training. Experiments demonstrate that incorporating SYNTHIA into the training set markedly improves DCNN performance on semantic segmentation tasks.

Abstract

Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. Recent revolutionary results of deep convolutional neural networks (DCNNs) foreshadow the advent of reliable classifiers to perform such visual tasks. However, DCNNs require learning of many parameters from raw images, thus, having a sufficient amount of diverse images with class annotations is needed. These annotations are obtained via cumbersome, human labour which is particularly challenging for semantic segmentation since pixel-level annotations are required. In this paper, we propose to use a virtual world to automatically generate realistic synthetic images with pixel-level annotations. Then, we address the question of how useful such data can be for semantic segmentation - in particular, when using a DCNN paradigm. In order to answer this question we have generated a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations. We use SYNTHIA in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments with DCNNs that show how the inclusion of SYNTHIA in the training stage significantly improves performance on the semantic segmentation task.

References

Page 1

	Year	Citations

Page 1