
TLDR

The Wasserstein distance is a valuable tool for data analysis, but statistical inference is hampered by the lack of known distributional limits. The study derives the asymptotic distribution of empirical Wasserstein distances for finitely supported probability measures, formulating it as the optimal value of a random linear program. This derivation relies on directional Hadamard differentiability to establish the distributional limits. The findings provide confidence intervals for empirical Wasserstein distances, expose bootstrap failures, and are illustrated on two datasets.

Abstract

The Wasserstein distance is an attractive tool for data analysis but statistical inference is hindered by the lack of distributional limits. To overcome this obstacle, for probability measures supported on finitely many points, we derive the asymptotic distribution of empirical Wasserstein distances as the optimal value of a linear programme with random objective function. This facilitates statistical inference (e.g. confidence intervals for sample-based Wasserstein distances) in large generality. Our proof is based on directional Hadamard differentiability. Failure of the classical bootstrap and alternatives are discussed. The utility of the distributional results is illustrated on two data sets.
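
For orientation, the Wasserstein distance between two measures on a common finite support is itself the optimal value of a linear programme over transport plans. The following is a minimal sketch of that standard formulation, not code from the paper: the function name `wasserstein_lp`, its arguments, and the choice of SciPy's `linprog` solver are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def wasserstein_lp(r, s, cost, p=1):
    """p-Wasserstein distance between weight vectors r and s on a shared
    finite support, where cost[i, j] is the ground distance d(x_i, x_j).

    Hypothetical helper for illustration; solves the transport linear
    programme directly.
    """
    r = np.asarray(r, dtype=float)
    s = np.asarray(s, dtype=float)
    n = r.size

    # Objective: sum_ij d(x_i, x_j)^p * w[i, j], with w flattened row-major.
    c = (np.asarray(cost, dtype=float) ** p).ravel()

    # Marginal constraints: rows of w sum to r, columns of w sum to s.
    a_rows = np.kron(np.eye(n), np.ones((1, n)))
    a_cols = np.kron(np.ones((1, n)), np.eye(n))
    a_eq = np.vstack([a_rows, a_cols])
    b_eq = np.concatenate([r, s])

    res = linprog(c, A_eq=a_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    # Guard against tiny negative values from solver round-off.
    return max(res.fun, 0.0) ** (1.0 / p)


# Three points on a line with ground distance |i - j|.
points = np.arange(3)
cost = np.abs(points[:, None] - points[None, :])

# Moving all mass from the first point to the last costs distance 2.
w1 = wasserstein_lp([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], cost, p=1)
```

An empirical Wasserstein distance in this setting is obtained by passing relative frequencies of a sample as `r` (or `s`). The dense constraint matrix grows with the fourth power of the support size in entries, so this sketch is only practical for small supports.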
