Transfer Beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation

Abstract

Autonomous vehicles clearly benefit from the expanded Field of View (FoV) of 360° sensors, but modern semantic segmentation approaches rely heavily on annotated training data which is rarely available for <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">panoramic</i> images. We look at this problem from the perspective of domain adaptation and bring <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">panoramic</i> semantic segmentation to a setting, where labelled training data originates from a different distribution of conventional <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">pinhole</i> camera images. To achieve this, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation and collect DensePass - a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, specifically built to study the Pinhole <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\rightarrow$ </tex-math></inline-formula> PANORAMIC domain shift and accompanied with pinhole camera training examples obtained from Cityscapes. DensePass covers both, labelled- and unlabelled 360° images, with the labelled data comprising 19 classes which explicitly fit the categories available in the source ( <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">i.e.</i> pinhole) domain. Since data-driven models are especially susceptible to changes in data distribution, we introduce P2PDA - a generic framework for Pinhole <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\rightarrow$ </tex-math></inline-formula> Panoramic semantic segmentation which addresses the challenge of domain divergence with different variants of attention-augmented domain adaptation modules, enabling the transfer in output-, feature-, and feature confidence spaces. P2PDA intertwines uncertainty-aware adaptation using confidence values regulated on-the-fly through attention heads with discrepant predictions. Our framework facilitates context exchange when learning domain correspondences and dramatically improves the adaptation performance of accuracy- and efficiency-focused models. Comprehensive experiments verify that our framework clearly surpasses unsupervised domain adaptation- and specialized panoramic segmentation approaches as well as state-of-the-art semantic segmentation methods.

References

Page 1

	Year	Citations

Page 1