Abstract
With the widespread application of optimal transport (OT), computing OT maps efficiently has become essential, and a variety of algorithms have emerged. However, existing methods are either inefficient or unable to represent discontinuous maps. We therefore present OT-Net, a novel reusable neural OT solver that first learns Brenier’s height representation via a neural network to obtain the potential, and then recovers the OT map as the gradient of that potential. The algorithm has two merits: (1) when new target samples are added, the OT map can be computed directly, which greatly improves the efficiency and reusability of the map; (2) it can easily represent discontinuous maps, which allows it to match any target distribution with discontinuous support and achieve sharp boundaries, thereby eliminating mode collapse. Moreover, we analyze the error of the proposed algorithm and demonstrate the empirical success of our approach on image generation, color transfer, and domain adaptation.
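To make the construction concrete, the following is a minimal PyTorch sketch of the core idea, not the authors' released implementation: the network Potential, its layer sizes, and the helper ot_map are illustrative assumptions. A scalar network stands in for the learned Brenier potential, and the transport map is recovered by differentiating it with autograd.

import torch
import torch.nn as nn

# Minimal sketch (an illustration, not the authors' released code): a scalar
# network u approximates the Brenier potential, and the OT map is recovered
# as T(x) = grad u(x) via automatic differentiation.
class Potential(nn.Module):
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def ot_map(u: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Evaluate the transport map as the gradient of the potential."""
    x = x.clone().requires_grad_(True)
    (grad_x,) = torch.autograd.grad(u(x).sum(), x)
    return grad_x

# Usage: push 64 source samples in R^2 through the (here untrained) potential.
u = Potential(dim=2)
y = ot_map(u, torch.randn(64, 2))

This sketch does not enforce convexity of the potential; in practice the potential must be convex (here it is parameterized by the height representation) for its gradient to be a valid Brenier map. It does illustrate the reusability claim: once the potential is trained, the map can be evaluated on any new batch of samples by a single gradient computation.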
Availability of data and material
The data that support the findings of this study are publicly available online at http://yann.lecun.com/exdb/mnist/, https://github.com/zalandoresearch/fashion-mnist, http://www.cs.toronto.edu/kriz/cifar.html, and http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html.
Code availability
The code can be obtained by contacting Shenghao Li and Zezeng Li.
References
Abbasnejad, M. E., Shi, Q., Abbasnejad, I., Hengel, A. V. D., & Dick, A. (2017). Bayesian conditional generative adverserial networks. arXiv:1706.05477.
Alexandrov, A. D. (2005). Convex polyhedra (Vol. 109). Springer.
Alvarez-Melis, D., Jaakkola, T., & Jegelka, S. (2018). Structured optimal transport. In International Conference on Artificial Intelligence and Statistics, pp. 1771–1780
An, D., et al. (2019). Ae-ot: A new generative model based on extended semi-discrete optimal transport. In ICLR 2020.
An, D., Lei, N., Xu, X., & Gu, X. (2022). Efficient optimal transport algorithm by accelerated gradient descent. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 10119–10128.
Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein gan. arXiv:1701.07875.
Asadulaev, A., Korotin, A., Egiazarian, V., & Burnaev, E. (2022). Neural optimal transport with general cost functionals. arXiv:2205.15403.
Benamou, J.-D., Carlier, G., Cuturi, M., Nenna, L., & Peyré, G. (2015). Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37, A1111–A1138.
Berthelot, D., Schumm, T., & Metz, L. (2017). Began: Boundary equilibrium generative adversarial networks. arXiv:1703.10717.
Bojanowski, P., Joulin, A., Lopez-Paz, D., & Szlam, A. (2017). Optimizing the latent space of generative networks. arXiv:1707.05776.
Bonneel, N., Van De Panne, M., Paris, S., & Heidrich, W. (2011). Displacement interpolation using Lagrangian mass transport. In Proceedings of the 2011 SIGGRAPH Asia Conference, pp. 1–12
Bonneel, N., & Coeurjolly, D. (2019). Spot: Sliced partial optimal transport. ACM Transactions on Graphics (TOG), 38, 1–13.
Bonneel, N., Peyré, G., & Cuturi, M. (2016). Wasserstein barycentric coordinates: Histogram regression using optimal transport. ACM Transactions on Graphics, 35, 71–1.
Brenier, Y. (1991). Polar factorization and monotone rearrangement of vector-valued functions. Communications on Pure and Applied Mathematics, 44, 375–417.
Chang, W., Shi, Y., Tuan, H., & Wang, J. (2022). Unified optimal transport framework for universal domain adaptation. Advances in Neural Information Processing Systems, 35, 29512–29524.
Chen, Y., et al. (2019). A gradual, semi-discrete approach to generative network training via explicit wasserstein minimization. In International Conference on Machine Learning, pp. 1071–1080.
Chen, S., & Figalli, A. (2017). Partial w2, p regularity for optimal transport maps. Journal of Functional Analysis, 272, 4588–4605.
Chuang, C.-Y., Jegelka, S., & Alvarez-Melis, D. (2023). Infoot: Information maximizing optimal transport. In International Conference on Machine Learning, pp. 6228–6242.
Courty, N., Flamary, R., Habrard, A., & Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. In Advances in neural information processing systems (Vol. 30). MIT Press.
Cuturi, M. (2013). Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in neural information processing systems (Vol. 26). MIT Press.
Damodaran, B. B., Kellenberger, B., Flamary, R., Tuia, D. & Courty, N. (2018). Deepjdot: Deep joint distribution optimal transport for unsupervised domain adaptation. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 447–463.
Daniels, M., Maunu, T., & Hand, P. (2021). Score-based generative neural networks for large-scale optimal transport. Advances in Neural Information Processing Systems, 34, 12955–12965.
Dumoulin, V., et al. (2016). Adversarially learned inference. arXiv:1606.00704
Dvurechensky, P., Gasnikov, A., & Kroshnin, A. (2018). Computational optimal transport: Complexity by accelerated gradient descent is better than by Sinkhorn’s algorithm. In International Conference on Machine Learning, pp. 1367–1376.
Fan, J., Liu, S., Ma, S., Chen, Y., & Zhou, H. (2021). Scalable computation of monge maps with general costs. arXiv:2106.03812.
Fedus, W., et al. (2017). Many paths to equilibrium: Gans do not need to decrease a divergence at every step. arXiv:1710.08446.
Ferradans, S., Papadakis, N., Peyré, G., & Aujol, J.-F. (2014). Regularized discrete optimal transport. SIAM Journal on Imaging Sciences, 7, 1853–1882.
Flamary, R., Courty, N., Tuia, D., & Rakotomamonjy, A. (2016). Optimal transport for domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Flamary, R., et al. (2021). Pot: Python optimal transport. The Journal of Machine Learning Research, 22, 3571–3578.
Gazdieva, M., Rout, L., Korotin, A., Filippov, A., & Burnaev, E. (2022). Unpaired image super-resolution with optimal transport maps. arXiv:2202.01116.
Golla, T., Kneiphof, T., Kuhlmann, H., Weinmann, M., & Klein, R. (2020). Temporal upsampling of point cloud sequences by optimal transport for plant growth visualization. Computer Graphics Forum, 39, 167–179.
Gu, X., Luo, F., Sun, J., & Yau, S.-T. (2016). Variational principles for minkowski type problems, discrete optimal transport, and discrete monge-ampère equations. Asian Journal of Mathematics, 20, 383–398.
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. C. (2017). Improved training of wasserstein gans. In Advances in neural information processing systems (Vol. 30). MIT Press.
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., & Hochreiter, S. (2017). Gans trained by a two time-scale update rule converge to a local nash equilibrium. In Advances in neural information processing systems (Vol. 30). MIT Press.
Hoshen, Y., Li, K., & Malik, J. (2019). Non-adversarial image synthesis with generative latent nearest neighbors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5811–5819
Kantorovich, L. (1942). On the transfer of masses. Doklady Akademii Nauk, 37, 227 (in Russian).
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.
Kingma, D. P., & Welling, M. (2013). Auto-encoding variational bayes. arXiv:1312.6114.
Korotin, A., Selikhanovych, D., & Burnaev, E. (2022). Neural optimal transport. arXiv:2201.12220
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, 2278–2324.
Lei, N., et al. (2020). A geometric understanding of deep learning. Engineering, 6, 361–374.
Li, Z., Wang, W., Lei, N., & Wang, R. (2022). Weakly supervised point cloud upsampling via optimal transport. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2564–2568.
Li, Z., Lei, N., Shi, J., & Xue, H. (2022). Real-world super-resolution under the guidance of optimal transport. Machine Vision and Applications, 33, 48.
Lin, Z., Khetan, A., Fanti, G., & Oh, S. (2018). Pacgan: The power of two samples in generative adversarial networks. In Advances in neural information processing systems (Vol. 31). MIT Press.
Liu, H., Gu, X. & Samaras, D. (2019). Wasserstein gan with quadratic transport cost. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4832–4841
Lucic, M., Kurach, K., Michalski, M., Gelly, S., & Bousquet, O. (2018). Are gans created equal? A large-scale study. In Advances in neural information processing systems (Vol. 31). MIT Press.
Makkuva, A., Taghvaei, A., Oh, S., & Lee, J. (2020). Optimal transport mapping via input convex neural networks. In International Conference on Machine Learning, pp. 6672–6681
Mao, X., et al. (2017). Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision pp. 2794–2802
Metropolis, N., & Ulam, S. (1949). The Monte Carlo method. Journal of the American Statistical Association, 44, 335–341.
Metz, L., Poole, B., Pfau, D., & Sohl-Dickstein, J. (2016). Unrolled generative adversarial networks. arXiv:1611.02163.
Monge, G. (1781). Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences de Paris.
Petzka, H., Fischer, A., & Lukovnicov, D. (2017). On the regularization of wasserstein gans. arXiv:1709.08894.
Rakotomamonjy, A., Flamary, R., Gasso, G., El Alaya, M., Berar, M., & Courty, N. (2022). Optimal transport for conditional domain matching and label shift. Machine Learning, 111, 1651–1670.
Rosca, M., Lakshminarayanan, B., Warde-Farley, D., & Mohamed, S. (2017). Variational approaches for auto-encoding generative adversarial networks. arXiv:1706.04987.
Rout, L., Korotin, A., & Burnaev, E. (2021). Generative modeling with optimal transport maps. arXiv:2110.02999.
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., & Chen, X. (2016). Improved techniques for training gans. In Advances in neural information processing systems (Vol. 29). MIT Press.
Sanjabi, M., Ba, J., Razaviyayn, M., & Lee, J. D. (2018). On the convergence and robustness of training gans with regularized optimal transport. In Advances in neural information processing systems (Vol. 31). MIT Press.
Seguy, V., et al. (2018). Large-scale optimal transport and mapping estimation. In ICLR 2018 - International Conference on Learning Representations, pp. 1–15.
Srivastava, A., Valkov, L., Russell, C., Gutmann, M. U., & Sutton, C. (2017). Veegan: Reducing mode collapse in gans using implicit variational learning. In Advances in neural information processing systems (Vol. 30). MIT Press.
Strössner, C., & Kressner, D. (2023). Low-rank tensor approximations for solving multimarginal optimal transport problems. SIAM Journal on Imaging Sciences, 16, 169–191.
Tran, Q. H., et al. (2023). Unbalanced co-optimal transport. Proceedings of the AAAI Conference on Artificial Intelligence, 37, 10006–10016.
Wang, W., Xu, H., Wang, G., Wang, W., & Carin, L. (2021). Zero-shot recognition via optimal transport. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3471–3481.
Xiao, H., Rasul, K., & Vollgraf, R. (2017). Fashion-mnist: A novel image dataset for benchmarking machine learning algorithms. arXiv:1708.07747
Xie, Y., Wang, X., Wang, R., & Zha, H. (2020). A fast proximal point method for computing exact wasserstein distance. In Uncertainty in artificial intelligence (pp. 433–453). PMLR.
Zhai, S., Cheng, Y., Feris, R., & Zhang, Z. (2016). Generative adversarial networks as variational training of energy based models. arXiv:1611.01799
Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2018). From facial expression recognition to interpersonal relation prediction. International Journal of Computer Vision, 126, 550–569.
Funding
This research was supported by the National Key R&D Program of China (2021YFA1003003), and the National Natural Science Foundation of China under Grant (61936002, T2225012).
Author information
Contributions
ZL provided the original ideas and the code implementation of the proposed algorithm. SL was responsible for most of the experimental validation and for writing the manuscript. LJ, NL, and ZL provided constructive ideas for the theoretical derivation and the experimental setup. All authors participated in writing, and read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent for publication
The authors of this manuscript consent to its publication.
Ethics approval and Consent to participate
The authors declare that this research did not require ethics approval or consent to participate, since it does not involve human participants or human or animal data.
Additional information
Editor: Hendrik Blockeel.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A
This section introduces the Encoder–Decoder architecture applied to the generative model and the height representation network architecture of our algorithm. The autoencoder network structures are presented in Tables 7 and 8, and the height representation network structure can be found in Table 9, where \(P_{data}\) and \(P_{latent}\) denote the data distribution and the latent coding distribution, respectively. The Encoder–Decoder architecture was trained with the Adam algorithm using mini-batches of size 512 and learning rates of 2e-4, 1e-4, and 2e-5 on FASHION-MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky et al., 2009), and CelebA (Zhang et al., 2018), respectively. The height representation network was also trained with Adam using mini-batches of size 512 and learning rates of 0.004, 0.005, and 0.005 on FASHION-MNIST, CIFAR-10, and CelebA, respectively.
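As a concrete illustration of this training configuration, the following is a hypothetical PyTorch snippet; the simple modules below are placeholders standing in for the actual Encoder–Decoder and height representation networks specified in Tables 7, 8 and 9.

import torch
import torch.nn as nn

# Hedged sketch of the optimizer setup described above; the two modules are
# placeholders, not the networks from Tables 7-9.
autoencoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 784))
height_net = nn.Linear(64, 1)

batch_size = 512  # mini-batch size shared by both networks
ae_lr = {"FASHION-MNIST": 2e-4, "CIFAR-10": 1e-4, "CelebA": 2e-5}
height_lr = {"FASHION-MNIST": 4e-3, "CIFAR-10": 5e-3, "CelebA": 5e-3}

dataset = "FASHION-MNIST"
opt_ae = torch.optim.Adam(autoencoder.parameters(), lr=ae_lr[dataset])
opt_height = torch.optim.Adam(height_net.parameters(), lr=height_lr[dataset])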
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Z., Li, S., Jin, L. et al. OT-net: a reusable neural optimal transport solver. Mach Learn 113, 1243–1268 (2024). https://doi.org/10.1007/s10994-023-06493-9