Concepedia

TLDR

Domain adaptation remains a difficult challenge, yet correctly aligning data representations can make models robust across different observation systems, especially when domain‑invariant representations are used. This work introduces a regularized unsupervised optimal transport framework to align source and target representations. The method learns a transport plan that matches the source and target probability density functions while keeping same‑class source samples close, thereby leveraging both labeled source data and the overall distributions. Experiments on synthetic and real visual adaptation tasks demonstrate that the approach consistently outperforms state‑of‑the‑art methods, improves performance on domain‑invariant deep‑learning features, and can be extended to semi‑supervised settings with few target labels.

Abstract

Domain adaptation is one of the most challenging tasks of modern data analytics. If the adaptation is done correctly, models built on a specific data representation become more robust when confronted with data depicting the same classes but described by another observation system. Among the many strategies proposed, finding domain-invariant representations has shown excellent properties, in particular because it allows training a single classifier that is effective in all domains. In this paper, we propose a regularized unsupervised optimal transportation model to perform the alignment of the representations in the source and target domains. We learn a transportation plan that matches the probability density functions (PDFs) of both domains and constrains labeled samples of the same class in the source domain to remain close during transport. This way, we exploit at the same time the labeled samples in the source and the distributions observed in both domains. Experiments on toy and challenging real visual adaptation examples show the interest of the method, which consistently outperforms state-of-the-art approaches. In addition, numerical experiments show that our approach leads to better performance on domain-invariant deep learning features and can be easily adapted to the semi-supervised case where few labeled samples are available in the target domain.
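To make the idea of a regularized transportation plan more concrete, the following is a minimal sketch and not the method of the paper. It assumes uniform weights on the samples, a squared Euclidean cost, and plain entropic regularization solved with Sinkhorn iterations; the class-based regularizer described in the abstract is not implemented. The function names (`sinkhorn_plan`, `barycentric_map`) and the parameters `reg` and `n_iter` are hypothetical and chosen for illustration only.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def sinkhorn_plan(xs, xt, reg=0.1, n_iter=500):
    """Entropy-regularized transport plan between two point clouds.

    Assumptions (not taken from the abstract): uniform sample weights,
    squared Euclidean cost, cost rescaled by its maximum for stability.
    """
    ns, nt = len(xs), len(xt)
    a = [1.0 / ns] * ns  # source marginal
    b = [1.0 / nt] * nt  # target marginal
    cost = [[sq_dist(x, y) for y in xt] for x in xs]
    cmax = max(max(row) for row in cost) or 1.0
    # Gibbs kernel of the rescaled cost matrix
    K = [[math.exp(-c / (reg * cmax)) for c in row] for row in cost]
    u = [1.0] * ns
    v = [1.0] * nt
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals
        u = [a[i] / sum(K[i][j] * v[j] for j in range(nt)) for i in range(ns)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(ns)) for j in range(nt)]
    return [[u[i] * K[i][j] * v[j] for j in range(nt)] for i in range(ns)]


def barycentric_map(plan, xt):
    """Move each source sample to the plan-weighted mean of the target samples."""
    dim = len(xt[0])
    mapped = []
    for row in plan:
        mass = sum(row)
        mapped.append(
            [sum(row[j] * xt[j][d] for j in range(len(xt))) / mass
             for d in range(dim)]
        )
    return mapped


if __name__ == "__main__":
    # Toy data: two source clusters and a shifted copy as the target domain
    source = [[0.0, 0.0], [0.2, 0.0], [0.0, 2.0], [0.2, 2.0]]
    target = [[3.0, 0.5], [3.2, 0.5], [3.0, 2.5], [3.2, 2.5]]
    plan = sinkhorn_plan(source, target)
    for point in barycentric_map(plan, target):
        print([round(c, 3) for c in point])
```

In this sketch, a classifier trained on the mapped source samples (with their original labels) could then be applied to the target samples. A class-aware regularizer of the kind the abstract describes would add a penalty on plans that send source samples of different classes to the same target sample; that term is left out here.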
