Behind Every Domain There is a Shift: Adapting Distortion-Aware Vision Transformers for Panoramic Semantic Segmentation

Abstract

In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$360^\circ$</tex-math></inline-formula> imagery. To tackle these problems, first, we propose the upgraded Transformer for Panoramic Semantic Segmentation, ie, Trans4PASS+, equipped with <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Deformable Patch Embedding (DPE) and <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Deformable MLP (DMLPv2) modules for handling object deformations and image distortions whenever (before or after adaptation) and wherever (shallow or deep levels). Second, we enhance the <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Mutual Prototypical Adaptation (MPA) strategy via pseudo-label rectification for unsupervised domain adaptive panoramic segmentation. Third, aside from Pinhole-to-Panoramic ( <sc xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Pin2Pan ) adaptation, we create a new dataset (SynPASS) with 9,080 panoramic images, facilitating Synthetic-to-Real ( <sc xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Syn2Real ) adaptation scheme in <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$360^\circ$</tex-math></inline-formula> imagery. Extensive experiments are conducted, which cover indoor and outdoor scenarios, and each of them is investigated with <sc xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Pin2Pan and <sc xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Syn2Real regimens. Trans4PASS+ achieves state-of-the-art performances on four domain adaptive panoramic semantic segmentation benchmarks. Code is available at <uri xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">https://github.com/jamycheung/Trans4PASS</uri> .

References

Page 1

	Year	Citations

Page 1