DOI: 10.1609/aaai.v37i8.26120
Research article

H-TSP: hierarchically solving the large-scale traveling salesman problem

Published: 07 February 2023

Abstract

We propose H-TSP, an end-to-end learning framework based on hierarchical reinforcement learning, for addressing the large-scale Traveling Salesman Problem (TSP). H-TSP constructs a solution to a TSP instance from scratch using two components: the upper-level policy chooses a small subset of nodes (up to 200 in our experiments) from all nodes that are to be traversed, while the lower-level policy takes the chosen nodes as input and outputs a tour connecting them to the existing partial route (which initially contains only the depot). After jointly training the upper-level and lower-level policies, our approach can directly generate solutions for given TSP instances without relying on any time-consuming search procedures. To demonstrate the effectiveness of the proposed approach, we conducted extensive experiments on randomly generated TSP instances with different numbers of nodes. We show that H-TSP achieves results comparable to SOTA search-based approaches (gap 3.42% vs. 7.32%) and, more importantly, reduces time consumption by up to two orders of magnitude (3.32s vs. 395.85s). To the best of our knowledge, H-TSP is the first end-to-end deep reinforcement learning approach that can scale to TSP instances of up to 10,000 nodes. Although there are still gaps to SOTA results with respect to solution quality, we believe that H-TSP will be useful for practical applications, particularly time-sensitive ones such as on-call routing and ride-hailing services.
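
The two-level construction loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: both learned policies are replaced by simple hand-written heuristics (a nearest-neighbour subset choice for the upper level and cheapest insertion for the lower level), and every function name here (`choose_subset`, `insert_subset`, `hierarchical_construct`, `gap_percent`) is hypothetical. The gap helper assumes the gap is measured relative to a reference tour length, which the abstract does not spell out.

```python
import math
import random


def tour_length(points, tour):
    """Length of the closed tour that visits the indices in `tour` in order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def choose_subset(points, tour, unvisited, k):
    """Upper-level stand-in (a heuristic, NOT a learned policy):
    pick up to k unvisited nodes nearest to the node currently last in the tour list."""
    anchor = points[tour[-1]]
    return sorted(unvisited, key=lambda j: math.dist(points[j], anchor))[:k]


def insert_subset(points, tour, subset):
    """Lower-level stand-in (a heuristic, NOT a learned policy):
    add each chosen node to the partial tour by cheapest insertion."""
    for j in subset:
        best_pos, best_cost = 0, math.inf
        for i in range(len(tour)):
            a = points[tour[i]]
            b = points[tour[(i + 1) % len(tour)]]
            # Extra length incurred by placing node j between a and b.
            cost = math.dist(a, points[j]) + math.dist(points[j], b) - math.dist(a, b)
            if cost < best_cost:
                best_pos, best_cost = i + 1, cost
        tour.insert(best_pos, j)
    return tour


def hierarchical_construct(points, k=200, depot=0):
    """Grow a tour from the depot by alternating subset choice and insertion."""
    tour = [depot]  # the partial route initially contains only the depot
    unvisited = set(range(len(points))) - {depot}
    while unvisited:
        subset = choose_subset(points, tour, unvisited, k)
        tour = insert_subset(points, tour, subset)
        unvisited.difference_update(subset)
    return tour


def gap_percent(length, reference_length):
    """Relative gap in percent (assumed definition: excess over a reference length)."""
    return 100.0 * (length - reference_length) / reference_length


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(300)]
    tour = hierarchical_construct(pts, k=50)
    print(len(tour), round(tour_length(pts, tour), 3))
```

The sketch only mirrors the control flow: choose a bounded subset, attach it to the partial route, repeat until no nodes remain. Cheapest insertion is quadratic in the number of nodes, so this stand-in is meant for small instances and says nothing about the running times reported in the abstract.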

Cited By

  • (2023) Neural combinatorial optimization with heavy decoder. Proceedings of the 37th International Conference on Neural Information Processing Systems, 8845-8864. DOI: 10.5555/3666122.3666509. Online publication date: 10-Dec-2023

Published In

AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence
February 2023
16496 pages
ISBN:978-1-57735-880-0

Sponsors

  • Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

Qualifiers

  • Research-article
  • Research
  • Refereed limited
