review-article

Computational Optimal Transport: With Applications to Data Science

Published: 12 February 2019

Abstract

Optimal transport (OT) theory can be informally described using the words of the French mathematician Gaspard Monge (1746–1818): A worker with a shovel in hand has to move a large pile of sand lying on a construction site. The goal of the worker is to erect with all that sand a target pile with a prescribed shape (for example, that of a giant sand castle). Naturally, the worker wishes to minimize her total effort, quantified for instance as the total distance or time spent carrying shovelfuls of sand. Mathematicians interested in OT cast that problem as that of comparing two probability distributions—two different piles of sand of the same volume. They consider all of the many possible ways to morph, transport or reshape the first pile into the second, and associate a “global” cost to every such transport, using the “local” consideration of how much it costs to move a grain of sand from one place to another. Mathematicians are interested in the properties of that least costly transport, as well as in its efficient computation. That smallest cost not only defines a distance between distributions, but it also entails a rich geometric structure on the space of probability distributions. That structure is canonical in the sense that it borrows key geometric properties of the underlying “ground” space on which these distributions are defined. For instance, when the underlying space is Euclidean, key concepts such as interpolation, barycenters, convexity or gradients of functions extend naturally to the space of distributions endowed with an OT geometry. OT has been (re)discovered in many settings and under different forms, giving it a rich history. While Monge’s seminal work was motivated by an engineering problem, Tolstoi in the 1920s and Hitchcock, Kantorovich and Koopmans in the 1940s established its significance to logistics and economics. Dantzig solved it numerically in 1949 within the framework of linear programming, giving OT a firm footing in optimization. 
OT was later revisited by analysts in the 1990s, notably Brenier, while also gaining fame in computer vision under the name of earth mover’s distances. Recent years have witnessed yet another revolution in the spread of OT, thanks to the emergence of approximate solvers that can scale to large problem dimensions. As a consequence, OT is being increasingly used to unlock various problems in imaging sciences (such as color or texture processing), graphics (for shape manipulation) or machine learning (for regression, classification and generative modeling). This paper reviews OT with a bias toward numerical methods, and covers the theoretical properties of OT that can guide the design of new algorithms. We focus in particular on the recent wave of efficient algorithms that have helped OT find relevance in data sciences. We give a prominent place to the many generalizations of OT that have been proposed in but a few years, and connect them with related approaches originating from statistical inference, kernel methods and information theory. All of the figures can be reproduced using code made available on a companion website. The website hosts the book project Computational Optimal Transport, together with slides and computational resources.
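
To make the sand-pile picture concrete, the following is a minimal sketch (not taken from the paper) of the simplest discrete case: two piles supported on a line, where moving a unit of mass from x to y is assumed to cost |x - y|. For this cost in one dimension, matching mass in sorted order is known to give a least costly transport, so no general solver is needed. The function name `transport_1d` and the sample piles are hypothetical and chosen only for illustration.

```python
def transport_1d(source, target, tol=1e-12):
    """Sorted-order coupling of two discrete 1-D distributions.

    source, target: non-empty lists of (position, mass) pairs whose total
    masses agree. Returns (plan, cost), where plan is a list of
    (from_position, to_position, moved_mass) triples and cost is the sum of
    moved_mass * |from_position - to_position|.
    """
    if not source or not target:
        raise ValueError("both piles must be non-empty")
    src = sorted(source)
    tgt = sorted(target)
    if abs(sum(m for _, m in src) - sum(m for _, m in tgt)) > 1e-9:
        raise ValueError("both piles must have the same total mass")

    plan, cost = [], 0.0
    i = j = 0
    left_in_src, left_in_tgt = src[0][1], tgt[0][1]
    while i < len(src) and j < len(tgt):
        # Move as much as the current source point still holds
        # and the current target point can still receive.
        moved = min(left_in_src, left_in_tgt)
        if moved > 0:
            x, y = src[i][0], tgt[j][0]
            plan.append((x, y, moved))
            cost += moved * abs(x - y)
        left_in_src -= moved
        left_in_tgt -= moved
        # Advance whichever side has been exhausted (possibly both).
        if left_in_src <= tol:
            i += 1
            left_in_src = src[i][1] if i < len(src) else 0.0
        if left_in_tgt <= tol:
            j += 1
            left_in_tgt = tgt[j][1] if j < len(tgt) else 0.0
    return plan, cost


# Hypothetical piles: half the sand at 0 and half at 1 must be reshaped
# into a quarter at 2 and three quarters at 3.
plan, cost = transport_1d([(0.0, 0.5), (1.0, 0.5)], [(2.0, 0.25), (3.0, 0.75)])
print(plan)
print(cost)
```

In higher dimensions, or with other costs, the sorted-order shortcut no longer applies; the general discrete problem is a linear program of the kind the abstract attributes to Dantzig, and large instances are where the approximate solvers it mentions come in.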

Cited By

  • (2024) GeONet. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, 10.5555/3702676.3702744 (1453-1478). Online publication date: 15-Jul-2024
  • (2024) Reflected Schrödinger bridge for constrained generative modeling. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, 10.5555/3702676.3702725 (1055-1082). Online publication date: 15-Jul-2024
  • (2024) Switched flow matching. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3694655 (62443-62475). Online publication date: 21-Jul-2024

Information

Published In

Foundations and Trends® in Machine Learning  Volume 11, Issue 5-6
Feb 2019
257 pages
ISSN:1935-8237
EISSN:1935-8245

Publisher

Now Publishers Inc.

Hanover, MA, United States

Qualifiers

  • Review-article

Citations

Cited By

  • (2024) Reducing balancing error for causal inference via optimal transport. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3694375 (55913-55927). Online publication date: 21-Jul-2024
  • (2024) On a neural implementation of Brenier's polar factorization. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3694090 (49434-49454). Online publication date: 21-Jul-2024
  • (2024) Statistically optimal generative modeling with maximum deviation from the empirical distribution. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3694082 (49203-49225). Online publication date: 21-Jul-2024
  • (2024) Sample complexity bounds for estimating probability divergences under invariances. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3694001 (47396-47417). Online publication date: 21-Jul-2024
  • (2024) OT-CLIP. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3693897 (44865-44886). Online publication date: 21-Jul-2024
  • (2024) DySLIM. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3693848 (43649-43684). Online publication date: 21-Jul-2024
  • (2024) A fixed-point approach for causal generative modeling. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3693843 (43504-43541). Online publication date: 21-Jul-2024