research-article

H-TSP: hierarchically solving the large-scale traveling salesman problem

Authors:

Jiang Bian

AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence

Article No.: 1051, Pages 9345 - 9353

https://doi.org/10.1609/aaai.v37i8.26120

Published: 07 February 2023

Abstract

We propose H-TSP, an end-to-end learning framework based on hierarchical reinforcement learning, for addressing the large-scale Traveling Salesman Problem (TSP). H-TSP constructs a solution to a TSP instance from scratch using two components: the upper-level policy chooses a small subset of nodes (up to 200 in our experiments) from all nodes that are to be traversed, while the lower-level policy takes the chosen nodes as input and outputs a tour connecting them to the existing partial route (initially containing only the depot). After jointly training the upper-level and lower-level policies, our approach can directly generate solutions for given TSP instances without relying on any time-consuming search procedures. To demonstrate the effectiveness of the proposed approach, we have conducted extensive experiments on randomly generated TSP instances with different numbers of nodes. We show that H-TSP achieves results comparable to SOTA search-based approaches (gap 3.42% vs. 7.32%) and, more importantly, reduces the time consumption by up to two orders of magnitude (3.32s vs. 395.85s). To the best of our knowledge, H-TSP is the first end-to-end deep reinforcement learning approach that can scale to TSP instances of up to 10,000 nodes. Although there are still gaps to SOTA results with respect to solution quality, we believe that H-TSP will be useful for practical applications, particularly time-sensitive ones such as on-call routing and ride-hailing services.
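The two-level construction loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: both learned policies are replaced here by simple hand-written heuristics (a nearest-node selector standing in for the upper-level policy, cheapest insertion standing in for the lower-level policy), and every function name is hypothetical. Only the overall control flow (select a subset of at most k nodes, merge it into the partial route, repeat until no nodes remain) follows the abstract.

```python
import math
import random


def tour_length(points, tour):
    """Length of the closed tour that visits `tour` in order and returns to its start."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def upper_level_select(points, unvisited, route, k):
    """Assumed stand-in for the upper-level policy: take up to k unvisited
    nodes closest to the last node in the current route list."""
    anchor = points[route[-1]]
    return sorted(unvisited, key=lambda j: math.dist(points[j], anchor))[:k]


def lower_level_extend(points, route, chosen):
    """Assumed stand-in for the lower-level policy: merge the chosen nodes
    into the partial route by cheapest insertion."""
    route = list(route)
    for j in chosen:
        best_pos, best_cost = 1, math.inf
        for i in range(len(route)):
            a = points[route[i]]
            b = points[route[(i + 1) % len(route)]]
            # Extra length caused by placing node j between a and b.
            cost = math.dist(a, points[j]) + math.dist(points[j], b) - math.dist(a, b)
            if cost < best_cost:
                best_pos, best_cost = i + 1, cost
        route.insert(best_pos, j)
    return route


def hierarchical_construct(points, depot=0, k=200):
    """Build a full tour by alternating subset selection and route extension."""
    route = [depot]  # the partial route initially contains only the depot
    unvisited = set(range(len(points))) - {depot}
    while unvisited:
        chosen = upper_level_select(points, unvisited, route, k)
        route = lower_level_extend(points, route, chosen)
        unvisited -= set(chosen)
    return route


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(500)]
    tour = hierarchical_construct(pts, depot=0, k=200)
    print(len(tour), round(tour_length(pts, tour), 3))
```

In this sketch the subset size k plays the role of the "up to 200" nodes mentioned in the abstract. Replacing the two stand-in functions with trained policies would leave the outer loop unchanged.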



Published In

AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence

February 2023

16496 pages

ISBN:978-1-57735-880-0

Copyright © 2023 Association for the Advancement of Artificial Intelligence.

Sponsors

Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press
