Publication | Open Access
Fast Computation of Wasserstein Barycenters
462
Citations
27
References
2013
Year
Empirical Probability MeasuresEngineeringMachine LearningWasserstein BarycentersOptimal TransportUnsupervised Machine LearningSubgradient MethodImage AnalysisData SciencePattern RecognitionMathematical ModellingPublic HealthRegularization (Mathematics)Approximation TheoryManifold LearningInverse ProblemsComputer ScienceDeep LearningMedical Image ComputingReproducing Kernel MethodConvex OptimizationKernel MethodWasserstein Distance
The Wasserstein barycenter is the measure minimizing the sum of its Wasserstein distances to each element in a set, but a direct implementation of algorithms to compute it is too costly due to repeated resolution of large optimal transport problems. The authors aim to develop efficient algorithms for computing Wasserstein barycenters of empirical probability measures. They propose two subgradient‑based algorithms that smooth the Wasserstein distance with an entropic regularizer, yielding a strictly convex objective whose gradients are efficiently computed via matrix scaling, thereby avoiding repeated resolution of large optimal transport problems. The algorithms successfully visualize a large family of images and solve a constrained clustering problem.
We present new algorithms to compute the mean of a set of empirical probability measures under the optimal transport metric. This mean, known as the Wasserstein barycenter, is the measure that minimizes the sum of its Wasserstein distances to each element in that set. We propose two original algorithms to compute Wasserstein barycenters that build upon the subgradient method. A direct implementation of these algorithms is, however, too costly because it would require the repeated resolution of large primal and dual optimal transport problems to compute subgradients. Extending the work of Cuturi (2013), we propose to smooth the Wasserstein distance used in the definition of Wasserstein barycenters with an entropic regularizer and recover in doing so a strictly convex objective whose gradients can be computed for a considerably cheaper computational cost using matrix scaling algorithms. We use these algorithms to visualize a large family of images and to solve a constrained clustering problem.
| Year | Citations | |
|---|---|---|
Page 1
Page 1