
OT-net: a reusable neural optimal transport solver


Abstract

With the widespread application of optimal transport (OT), computing OT maps has become essential, and a variety of algorithms have emerged. However, existing methods either have low efficiency or cannot represent discontinuous maps. We therefore present OT-Net, a novel reusable neural OT solver that first learns Brenier's height representation via a neural network to obtain the potential, and then recovers the OT map as the gradient of that potential. The algorithm has two merits: (1) when new target samples are added, the OT map can be computed directly, which greatly improves the efficiency and reusability of the map; and (2) it can easily represent discontinuous maps, which allows it to match any target distribution with discontinuous supports and achieve sharp boundaries, thus eliminating mode collapse. Moreover, we analyze the error of the proposed algorithm and demonstrate its empirical success in image generation, color transfer, and domain adaptation.
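
To make the two-step construction concrete, the following is a minimal, illustrative sketch (PyTorch-style; height_net and all names are our assumptions, not the authors' released code) of a semi-discrete Brenier potential \(u(x)=\max_i(\langle x, y_i\rangle + h_i)\), whose gradient gives the transport map:

    import torch

    def ot_map(x, targets, height_net):
        # x: (n, d) source points; targets: (m, d) target samples.
        # The network assigns a height h_i to each target sample y_i, so the
        # potential is u(x) = max_i (<x, y_i> + h_i) and its gradient at x is
        # the target sample whose hyperplane attains the maximum.
        h = height_net(targets).squeeze(-1)   # (m,) learned heights
        scores = x @ targets.T + h            # (n, m) values <x, y_i> + h_i
        idx = scores.argmax(dim=1)            # supporting hyperplane per source point
        return targets[idx]                   # T(x) = grad u(x), possibly discontinuous

Under this view, adding new target samples only requires evaluating the height network on them, which is what makes the solver reusable; and since the gradient of a piecewise-linear potential is piecewise constant, discontinuous maps and sharp support boundaries arise naturally.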


Availability of data and material

The data that support the findings of this study are publicly available online at http://yann.lecun.com/exdb/mnist/ (MNIST), https://github.com/zalandoresearch/fashion-mnist (Fashion-MNIST), http://www.cs.toronto.edu/kriz/cifar.html (CIFAR-10), and http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html (CelebA).

Code availability

The code can be obtained by contacting Shenghao Li and Zezeng Li.

Notes

  1. https://pythonot.github.io/#
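
The footnote links to POT, the Python Optimal Transport toolbox (Flamary et al., 2021), a standard discrete OT solver. A minimal usage sketch (our example, assuming two small point clouds with uniform weights, not code from the paper):

    import numpy as np
    import ot  # POT: https://pythonot.github.io/

    xs = np.random.randn(100, 2)        # source samples
    xt = np.random.randn(100, 2) + 3.0  # target samples
    a = np.full(100, 1.0 / 100)         # uniform source weights
    b = np.full(100, 1.0 / 100)         # uniform target weights

    M = ot.dist(xs, xt)                 # pairwise squared Euclidean costs
    plan = ot.emd(a, b, M)              # exact discrete OT plan (linear program)

Unlike such a discrete plan, which must be re-solved whenever the samples change, the learned potential of OT-Net can be reused on new samples directly.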

References

  • Abbasnejad, M. E., Shi, Q., Abbasnejad, I., Hengel, A. V. D., & Dick, A. (2017). Bayesian conditional generative adverserial networks. arXiv:1706.05477.

  • Alexandrov, A. D. (2005). Convex polyhedra (Vol. 109). Springer.

  • Alvarez-Melis, D., Jaakkola, T., & Jegelka, S. (2018). Structured optimal transport. In International Conference on Artificial Intelligence and Statistics (pp. 1771–1780).

  • An, D., et al. (2019). AE-OT: A new generative model based on extended semi-discrete optimal transport. In ICLR 2020.

  • An, D., Lei, N., Xu, X., & Gu, X. (2022). Efficient optimal transport algorithm by accelerated gradient descent. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 10119–10128.

  • Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein GAN. arXiv:1701.07875.

  • Asadulaev, A., Korotin, A., Egiazarian, V., & Burnaev, E. (2022). Neural optimal transport with general cost functionals. arXiv:2205.15403.

  • Benamou, J.-D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37, A1111–A1138.

  • Berthelot, D., Schumm, T., & Metz, L. (2017). BEGAN: Boundary equilibrium generative adversarial networks. arXiv:1703.10717.

  • Bojanowski, P., Joulin, A., Lopez-Paz, D., & Szlam, A. (2017). Optimizing the latent space of generative networks. arXiv:1707.05776.

  • Bonneel, N., Van De Panne, M., Paris, S., & Heidrich, W. (2011). Displacement interpolation using Lagrangian mass transport. In Proceedings of the 2011 SIGGRAPH Asia Conference (pp. 1–12).

  • Bonneel, N., & Coeurjolly, D. (2019). SPOT: Sliced partial optimal transport. ACM Transactions on Graphics (TOG), 38, 1–13.

  • Bonneel, N., Peyré, G., & Cuturi, M. (2016). Wasserstein barycentric coordinates: Histogram regression using optimal transport. ACM Transactions on Graphics, 35, Article 71.

  • Brenier, Y. (1991). Polar factorization and monotone rearrangement of vector-valued functions. Communications on Pure and Applied Mathematics, 44, 375–417.

  • Chang, W., Shi, Y., Tuan, H., & Wang, J. (2022). Unified optimal transport framework for universal domain adaptation. Advances in Neural Information Processing Systems, 35, 29512–29524.

  • Chen, Y., et al. (2019). A gradual, semi-discrete approach to generative network training via explicit Wasserstein minimization. In International Conference on Machine Learning (pp. 1071–1080).

  • Chen, S., & Figalli, A. (2017). Partial \(W^{2,p}\) regularity for optimal transport maps. Journal of Functional Analysis, 272, 4588–4605.

  • Chuang, C.-Y., Jegelka, S., & Alvarez-Melis, D. (2023). InfoOT: Information maximizing optimal transport. In International Conference on Machine Learning (pp. 6228–6242).

  • Courty, N., Flamary, R., Habrard, A., & Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. In Advances in Neural Information Processing Systems (Vol. 30).

  • Cuturi, M. (2013). Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems (Vol. 26).

  • Damodaran, B. B., Kellenberger, B., Flamary, R., Tuia, D., & Courty, N. (2018). DeepJDOT: Deep joint distribution optimal transport for unsupervised domain adaptation. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 447–463).

  • Daniels, M., Maunu, T., & Hand, P. (2021). Score-based generative neural networks for large-scale optimal transport. Advances in Neural Information Processing Systems, 34, 12955–12965.

  • Dumoulin, V., et al. (2016). Adversarially learned inference. arXiv:1606.00704.

  • Dvurechensky, P., Gasnikov, A., & Kroshnin, A. (2018). Computational optimal transport: Complexity by accelerated gradient descent is better than by Sinkhorn's algorithm. In International Conference on Machine Learning (pp. 1367–1376).

  • Fan, J., Liu, S., Ma, S., Chen, Y., & Zhou, H. (2021). Scalable computation of Monge maps with general costs. arXiv:2106.03812.

  • Fedus, W., et al. (2017). Many paths to equilibrium: GANs do not need to decrease a divergence at every step. arXiv:1710.08446.

  • Ferradans, S., Papadakis, N., Peyré, G., & Aujol, J.-F. (2014). Regularized discrete optimal transport. SIAM Journal on Imaging Sciences, 7, 1853–1882.

  • Flamary, R., Courty, N., Tuia, D., & Rakotomamonjy, A. (2016). Optimal transport for domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence.

  • Flamary, R., et al. (2021). POT: Python optimal transport. The Journal of Machine Learning Research, 22, 3571–3578.

  • Gazdieva, M., Rout, L., Korotin, A., Filippov, A., & Burnaev, E. (2022). Unpaired image super-resolution with optimal transport maps. arXiv:2202.01116.

  • Golla, T., Kneiphof, T., Kuhlmann, H., Weinmann, M., & Klein, R. (2020). Temporal upsampling of point cloud sequences by optimal transport for plant growth visualization. Computer Graphics Forum, 39, 167–179.

  • Gu, X., Luo, F., Sun, J., & Yau, S.-T. (2016). Variational principles for Minkowski type problems, discrete optimal transport, and discrete Monge–Ampère equations. Asian Journal of Mathematics, 20, 383–398.

  • Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. C. (2017). Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems (Vol. 30).

  • Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., & Hochreiter, S. (2017). GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems (Vol. 30).

  • Hoshen, Y., Li, K., & Malik, J. (2019). Non-adversarial image synthesis with generative latent nearest neighbors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5811–5819).

  • Kantorovich, L. (1942). On the transfer of masses. Doklady Akademii Nauk, 37, 227 (in Russian).

  • Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.

  • Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. arXiv:1312.6114.

  • Korotin, A., Selikhanovych, D., & Burnaev, E. (2022). Neural optimal transport. arXiv:2201.12220.

  • Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Master's thesis, University of Toronto.

  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278–2324.

  • Lei, N., et al. (2020). A geometric understanding of deep learning. Engineering, 6, 361–374.

  • Li, Z., Wang, W., Lei, N., & Wang, R. (2022). Weakly supervised point cloud upsampling via optimal transport. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2564–2568).

  • Li, Z., Lei, N., Shi, J., & Xue, H. (2022). Real-world super-resolution under the guidance of optimal transport. Machine Vision and Applications, 33, 48.

  • Lin, Z., Khetan, A., Fanti, G., & Oh, S. (2018). PacGAN: The power of two samples in generative adversarial networks. In Advances in Neural Information Processing Systems (Vol. 31).

  • Liu, H., Gu, X., & Samaras, D. (2019). Wasserstein GAN with quadratic transport cost. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 4832–4841).

  • Lucic, M., Kurach, K., Michalski, M., Gelly, S., & Bousquet, O. (2018). Are GANs created equal? A large-scale study. In Advances in Neural Information Processing Systems (Vol. 31).

  • Makkuva, A., Taghvaei, A., Oh, S., & Lee, J. (2020). Optimal transport mapping via input convex neural networks. In International Conference on Machine Learning (pp. 6672–6681).

  • Mao, X., et al. (2017). Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2794–2802).

  • Metropolis, N., & Ulam, S. (1949). The Monte Carlo method. Journal of the American Statistical Association, 44, 335–341.

  • Metz, L., Poole, B., Pfau, D., & Sohl-Dickstein, J. (2016). Unrolled generative adversarial networks. arXiv:1611.02163.

  • Monge, G. (1781). Mémoire sur la théorie des déblais et des remblais. Histoire de l'Académie Royale des Sciences de Paris.

  • Petzka, H., Fischer, A., & Lukovnicov, D. (2017). On the regularization of Wasserstein GANs. arXiv:1709.08894.

  • Rakotomamonjy, A., Flamary, R., Gasso, G., El Alaya, M., Berar, M., & Courty, N. (2022). Optimal transport for conditional domain matching and label shift. Machine Learning, 111, 1651–1670.

  • Rosca, M., Lakshminarayanan, B., Warde-Farley, D., & Mohamed, S. (2017). Variational approaches for auto-encoding generative adversarial networks. arXiv:1706.04987.

  • Rout, L., Korotin, A., & Burnaev, E. (2021). Generative modeling with optimal transport maps. arXiv:2110.02999.

  • Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., & Chen, X. (2016). Improved techniques for training GANs. In Advances in Neural Information Processing Systems (Vol. 29).

  • Sanjabi, M., Ba, J., Razaviyayn, M., & Lee, J. D. (2018). On the convergence and robustness of training GANs with regularized optimal transport. In Advances in Neural Information Processing Systems (Vol. 31).

  • Seguy, V., et al. (2018). Large-scale optimal transport and mapping estimation. In ICLR 2018 - International Conference on Learning Representations (pp. 1–15).

  • Srivastava, A., Valkov, L., Russell, C., Gutmann, M. U., & Sutton, C. (2017). VEEGAN: Reducing mode collapse in GANs using implicit variational learning. In Advances in Neural Information Processing Systems (Vol. 30).

  • Strössner, C., & Kressner, D. (2023). Low-rank tensor approximations for solving multimarginal optimal transport problems. SIAM Journal on Imaging Sciences, 16, 169–191.

  • Tran, Q. H., et al. (2023). Unbalanced co-optimal transport. Proceedings of the AAAI Conference on Artificial Intelligence, 37, 10006–10016.

  • Wang, W., Xu, H., Wang, G., Wang, W., & Carin, L. (2021). Zero-shot recognition via optimal transport. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 3471–3481).

  • Xiao, H., Rasul, K., & Vollgraf, R. (2017). Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv:1708.07747.

  • Xie, Y., Wang, X., Wang, R., & Zha, H. (2020). A fast proximal point method for computing exact Wasserstein distance. In Uncertainty in Artificial Intelligence (pp. 433–453). PMLR.

  • Zhai, S., Cheng, Y., Feris, R., & Zhang, Z. (2016). Generative adversarial networks as variational training of energy based models. arXiv:1611.01799.

  • Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2018). From facial expression recognition to interpersonal relation prediction. International Journal of Computer Vision, 126, 550–569.


Funding

This research was supported by the National Key R&D Program of China (2021YFA1003003) and the National Natural Science Foundation of China under Grants 61936002 and T2225012.

Author information

Contributions

ZL provided the original ideas and the code implementation of the proposed algorithm. SL was responsible for most of the experimental validation and the writing of the manuscript. LJ, NL, and ZL provided constructive ideas for the theoretical derivation and the experimental setup. All authors participated in writing the manuscript and read and approved the final version.

Corresponding author

Correspondence to Na Lei.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Consent for publication

The authors of this manuscript consent to its publication.

Ethics approval and consent to participate

The authors declare that this research did not require ethics approval or consent to participate, as it does not involve human participants or human or animal data.

Additional information

Editor: Hendrik Blockeel.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

This section introduces the Encoder–Decoder architecture applied to the generative model and the height representation network architecture of our algorithm. The autoencoder network structures are presented in Tables 7 and 8, and the height representation network structure can be found in Table 9, where \(P_{data}\) and \(P_{latent}\) denote the data distribution and the latent coding distribution, respectively. The Encoder–Decoder architecture was trained with the Adam algorithm using mini-batches of size 512 and learning rates of 2e-4, 1e-4, and 2e-5 on FASHION-MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky et al., 2009), and CelebA (Zhang et al., 2018), respectively. The height representation network was also trained with Adam using mini-batches of size 512 and learning rates of 0.004, 0.005, and 0.005 on FASHION-MNIST, CIFAR-10, and CelebA, respectively; an illustrative sketch of this optimizer setup follows the tables below.

Table 7 The Encoder architecture for CelebA, FASHION-MNIST, and CIFAR-10
Table 8 The Decoder architecture for CelebA, FASHION-MNIST, and CIFAR-10
Table 9 The height representation architecture for OT-Net
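
For concreteness, the following is a minimal sketch of the reported optimization setup (Adam, mini-batches of size 512, dataset-specific learning rates). The names encoder, decoder, and height_net are placeholders for the architectures of Tables 7, 8, and 9; this is illustrative PyTorch-style code, not the authors' released implementation.

    import torch

    BATCH_SIZE = 512
    AE_LR = {"fashion-mnist": 2e-4, "cifar-10": 1e-4, "celeba": 2e-5}
    HEIGHT_LR = {"fashion-mnist": 4e-3, "cifar-10": 5e-3, "celeba": 5e-3}

    def make_optimizers(dataset, encoder, decoder, height_net):
        # One Adam optimizer for the Encoder-Decoder pair, one for the height network.
        ae_opt = torch.optim.Adam(
            list(encoder.parameters()) + list(decoder.parameters()),
            lr=AE_LR[dataset],
        )
        height_opt = torch.optim.Adam(height_net.parameters(), lr=HEIGHT_LR[dataset])
        return ae_opt, height_opt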

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, Z., Li, S., Jin, L. et al. OT-net: a reusable neural optimal transport solver. Mach Learn 113, 1243–1268 (2024). https://doi.org/10.1007/s10994-023-06493-9
