A Distributed-GPU Deep Reinforcement Learning System for Solving Large Graph Optimization Problems

Published: 20 June 2023

Abstract

Graph optimization problems (such as minimum vertex cover, maximum cut, and the traveling salesman problem) appear in many fields, including social sciences, power systems, chemistry, and bioinformatics. Recently, deep reinforcement learning (DRL) has shown success in automatically learning good heuristics for solving graph optimization problems. However, existing RL systems either do not support graph RL environments or do not support multiple GPUs in a distributed setting. This lack of parallelization and scalability limits the ability of reinforcement learning to solve large-scale graph optimization problems. To address these challenges, we develop RL4GO, a high-performance distributed-GPU DRL framework for solving graph optimization problems. RL4GO targets a class of computationally demanding RL problems in which both the RL environment and the policy model are highly computation intensive, whereas traditional reinforcement learning systems often assume that either the RL environment has low time complexity or the policy model is small.
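To make the setting concrete, the sketch below shows a minimal single-process graph RL environment for minimum vertex cover, one of the problems named above: the state is the set of selected vertices, an action adds one vertex, and each selection incurs a reward of -1 so that smaller covers score higher. This is an illustrative assumption written against NetworkX [17], not the paper's actual RL4GO environment or API; the class and method names (MVCEnv, reset, step) and the degree-greedy policy are placeholders.

```python
import networkx as nx

class MVCEnv:
    """Minimal minimum-vertex-cover environment: the state is the set of selected vertices."""
    def __init__(self, graph: nx.Graph):
        self.graph = graph
        self.selected = set()

    def reset(self):
        self.selected = set()
        return self.selected

    def step(self, node):
        # Action: add one vertex to the cover. Reward: -1 per selected vertex,
        # so maximizing return minimizes the cover size.
        self.selected.add(node)
        done = all(u in self.selected or v in self.selected
                   for u, v in self.graph.edges())
        return self.selected, -1.0, done

if __name__ == "__main__":
    # Stand-in for a learned GNN policy: always pick the highest-degree vertex
    # incident to a still-uncovered edge.
    g = nx.erdos_renyi_graph(100, 0.05, seed=0)
    env = MVCEnv(g)
    state, done = env.reset(), False
    while not done:
        uncovered = [(u, v) for u, v in g.edges()
                     if u not in state and v not in state]
        candidates = {n for edge in uncovered for n in edge}
        state, reward, done = env.step(max(candidates, key=g.degree))
    print("cover size:", len(state))
```

On large graphs, the per-step coverage check and the policy's per-node scoring are exactly the environment and model costs that this paragraph describes as computation intensive, which motivates parallelizing both.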
In this work, we distribute large-scale graphs across distributed GPUs and use spatial parallelism and data parallelism to achieve scalable performance. We compare and analyze the performance of spatial parallelism and data parallelism and show their differences. To support graph neural network (GNN) layers whose input data samples are partitioned across distributed GPUs, we design parallel mathematical kernels that operate on distributed 3D sparse and 3D dense tensors. To handle costly RL environments, we design a parallel graph environment to scale up all RL-environment-related operations. By combining the scalable GNN layers with the scalable RL environment, we develop high-performance parallel RL4GO training and inference algorithms. Furthermore, we propose two optimization techniques, on-the-fly graph generation for the replay buffer and adaptive multiple-node selection, to minimize memory cost and accelerate reinforcement learning. This work also conducts in-depth analyses of parallel efficiency and memory cost and shows that the designed RL4GO algorithms scale to a large number of distributed GPUs. Evaluations on large-scale graphs show that (1) RL4GO training and inference achieve good parallel efficiency on 192 GPUs, (2) RL4GO training can be 18 times faster than the state-of-the-art Gorila distributed RL framework [34], and (3) RL4GO inference achieves a 26-times improvement over Gorila.
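The data-parallel half of this design can be pictured with stock PyTorch [36]: each GPU rank holds a full copy of the policy network, trains on its own batch, and DistributedDataParallel averages gradients across ranks. The sketch below is a minimal illustration under that assumption; the spatial parallelism, the distributed 3D sparse/dense tensor kernels, and the parallel graph environment described above are not reproduced here, and PolicyNet plus the random batches are placeholders rather than the paper's actual model or data.

```python
# Launch with: torchrun --nproc_per_node=<num_gpus> ddp_policy_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

class PolicyNet(torch.nn.Module):
    """Placeholder for a GNN policy: scores each node from its feature vector."""
    def __init__(self, feat_dim=16, hidden=64):
        super().__init__()
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(feat_dim, hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, node_feats):               # (num_nodes, feat_dim)
        return self.mlp(node_feats).squeeze(-1)  # per-node scores

def main():
    dist.init_process_group(backend="nccl")      # torchrun sets RANK/WORLD_SIZE
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    device = f"cuda:{local_rank}"

    model = DDP(PolicyNet().to(device), device_ids=[local_rank])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(10):
        # Each rank draws its own mini-batch (random here); backward() triggers
        # an all-reduce so every rank applies the same averaged gradient.
        feats = torch.randn(1024, 16, device=device)
        targets = torch.randn(1024, device=device)
        loss = torch.nn.functional.mse_loss(model(feats), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

In this scheme each rank keeps a full model replica and the batch is split across ranks; spatial parallelism instead partitions a single large graph's tensors across ranks, which is the role of the distributed 3D sparse and dense tensor kernels described above.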

References

[1]
Igor Adamski, Robert Adamski, Tomasz Grel, Adam Jędrych, Kamil Kaczmarek, and Henryk Michalewski. 2018. Distributed deep reinforcement learning: Learn how to play Atari games in 21 minutes. In International Conference on High Performance Computing. Springer, 370–388.
[2]
Réka Albert, István Albert, and Gary L. Nakarado. 2004. Structural vulnerability of the north American power grid. Physical Review E 69, 2 (2004), 025103.
[3]
Réka Albert and Albert-László Barabási. 2002. Statistical mechanics of complex networks. Reviews of Modern Physics 74, 1 (2002), 47.
[4]
Thomas Barrett, William Clements, Jakob Foerster, and Alex Lvovsky. 2020. Exploratory combinatorial optimization with reinforcement learning. In AAAI Conference on Artificial Intelligence, Vol. 34. 3243–3250.
[5]
Sergey V. Buldyrev, Roni Parshani, Gerald Paul, H. Eugene Stanley, and Shlomo Havlin. 2010. Catastrophic cascade of failures in interdependent networks. Nature 464, 7291 (2010), 1025–1028.
[6]
Ed Bullmore and Olaf Sporns. 2009. Complex brain networks: Graph theoretical analysis of structural and functional systems. Nature Reviews Neuroscience 10, 3 (2009), 186–198.
[7]
Wei Chen, Chi Wang, and Yajun Wang. 2010. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1029–1038.
[8]
Vašek Chvátal. 1983. Linear Programming. Macmillan.
[9]
Hanjun Dai, Bo Dai, and Le Song. 2016. Discriminative embeddings of latent variable models for structured data. In International Conference on Machine Learning. 2702–2711.
[10]
Hanjun Dai, Elias Khalil, Yuyu Zhang, Bistra Dilkina, and Le Song. 2017. Learning combinatorial optimization algorithms over graphs. In Advances in Neural Information Processing Systems. 6348–6358.
[11]
Rodney G. Downey and Michael R. Fellows. 2013. Fundamentals of Parameterized Complexity. Vol. 4. Springer.
[12]
Paul Erdős and Alfréd Rényi. 1960. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci 5, 1 (1960), 17–60.
[13]
Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Vlad Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, et al. 2018. IMPALA: Scalable distributed deep-RL with importance weighted actor-learner architectures. In International Conference on Machine Learning. PMLR, 1407–1416.
[14]
Matthias Fey and Jan Eric Lenssen. 2019. Fast graph representation learning with PyTorch Geometric. (2019). https://arxiv.org/abs/1903.02428.
[15]
Yoav Goldberg and Omer Levy. 2014. word2vec Explained: Deriving Mikolov et al.’s negative-sampling word-embedding method. (2014). https://arxiv.org/pdf/1402.3722.
[16]
Graph Nets. 2019. https://github.com/deepmind/graph_nets.
[17]
Aric Hagberg, Pieter Swart, and Daniel Schult. 2008. Exploring Network Structure, Dynamics, and Function Using NetworkX. Technical Report. Los Alamos National Lab (LANL), Los Alamos, NM.
[18]
Matt Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, Sarah Henderson, Alex Novikov, Sergio Gómez Colmenarejo, Serkan Cabi, Caglar Gulcehre, Tom Le Paine, Andrew Cowie, Ziyu Wang, Bilal Piot, and Nando de Freitas. 2020. Acme: A Research Framework for Distributed Reinforcement Learning. (2020). https://arxiv.org/abs/2006.00979.
[19]
Christian D. Hubbs, Hector D. Perez, Owais Sarwar, Nikolaos V. Sahinidis, Ignacio E. Grossmann, and John M. Wassick. 2020. OR-Gym: A Reinforcement Learning Library for Operations Research Problems. (2020). https://arxiv.org/abs/2008.06319.
[20]
IBM ILOG CPLEX. 2020. V20.1.0: User’s Manual for CPLEX.
[21]
Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. 1996. Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4 (1996), 237–285.
[22]
George Karypis and Vipin Kumar. 1995. METIS–Unstructured graph partitioning and sparse matrix ordering system, version 2.0. (1995).
[23]
David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 137–146.
[24]
Jure Leskovec, Lada A. Adamic, and Bernardo A. Huberman. 2007. The dynamics of viral marketing. ACM Transactions on the Web 1, 1 (2007), Article 5.
[25]
Jure Leskovec and Andrej Krevl. 2023. SNAP Datasets: Stanford Large Network Dataset Collection. (Jan. 2023). http://snap.stanford.edu/data.
[26]
Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. 2009. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics 6 (2009), 29–123.
[27]
Eric Liang, Richard Liaw, Robert Nishihara, Philipp Moritz, Roy Fox, Joseph Gonzalez, Ken Goldberg, and Ion Stoica. 2017. Ray RLlib: A composable and scalable reinforcement learning library. (2017). https://arxiv.org/abs/1712.09381.
[28]
Zhiqi Lin, Cheng Li, Youshan Miao, Yunxin Liu, and Yinlong Xu. 2020. PaGraph: Scaling GNN training on large graphs via computation-aware caching. In Proceedings of the 11th ACM Symposium on Cloud Computing (SoCC’20). ACM, 401–415.
[29]
Dean Lusher, Johan Koskinen, and Garry Robins. 2013. Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Vol. 35. Cambridge University Press.
[30]
Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, and Yafei Dai. 2019. NeuGraph: Parallel deep neural network computation on large graphs. In 2019 USENIX Annual Technical Conference. 443–458. https://www.usenix.org/conference/atc19/presentation/ma.
[31]
Julian McAuley, Christopher Targett, Qinfeng Shi, and Anton van den Hengel. 2015. Image-based recommendations on styles and substitutes. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’15). ACM, 43–52.
[32]
Julian J. McAuley and Jure Leskovec. 2012. Learning to discover social circles in ego networks. In Advances in Neural Information Processing Systems, Vol. 25.
[33]
Alessio Micheli. 2009. Neural network for graphs: A contextual constructive approach. IEEE Transactions on Neural Networks 20, 3 (2009), 498–511.
[34]
Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, et al. 2015. Massively parallel methods for deep reinforcement learning. (2015). https://arxiv.org/abs/1507.04296.
[35]
PaddlePaddle PARL. 2020. https://github.com/PaddlePaddle/PARL.
[36]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems. 8026–8037.
[37]
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online learning of social representations. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 701–710.
[38]
Antoine Prouvost, Justin Dumouchelle, Lara Scavuzzo, Maxime Gasse, Didier Chételat, and Andrea Lodi. 2020. Ecole: A Gym-like Library for Machine Learning in Combinatorial Optimization Solvers. (2020). https://arxiv.org/abs/2011.06069.
[39]
Yunhao Tang, Shipra Agrawal, and Yuri Faenza. 2020. Reinforcement learning for integer programming: Learning to cut. In Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research), Vol. 119. PMLR, 9367–9376.
[40]
Atsushi Tero, Seiji Takagi, Tetsu Saigusa, Kentaro Ito, Dan P. Bebber, Mark D. Fricker, Kenji Yumiki, Ryo Kobayashi, and Toshiyuki Nakagaki. 2010. Rules for biologically inspired adaptive network design. Science 327, 5964 (2010), 439–442.
[41]
Nenad Trinajstic. 2018. Chemical Graph Theory. CRC Press.
[42]
Alok Tripathy, Katherine Yelick, and Aydın Buluç. 2020. Reducing communication in graph neural network training. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC’20). IEEE, 1–14.
[43]
Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, et al. 2019. Deep graph library: Towards efficient and scalable deep learning on graphs. (2019). https://arxiv.org/abs/1909.01315.
[44]
Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2020. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems (2020).
[45]
Danfei Xu, Yuke Zhu, Christopher B. Choy, and Li Fei-Fei. 2017. Scene graph generation by iterative message passing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5410–5419.
[46]
Hongxia Yang. 2019. AliGraph: A comprehensive graph neural network platform. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 3165–3166.
[47]
Jaewon Yang and Jure Leskovec. 2011. Patterns of temporal variation in online media. In Proceedings of the 4th ACM International Conference on Web Search and Data Mining (WSDM’11). ACM, 177–186.
[48]
Jiaxuan You, Zhitao Ying, and Jure Leskovec. 2020. Design space for graph neural networks. Advances in Neural Information Processing Systems (2020).
[49]
Da Zheng, Chao Ma, Minjie Wang, Jinjing Zhou, Qidong Su, Xiang Song, Quan Gan, Zheng Zhang, and George Karypis. 2020. DistDGL: Distributed graph neural network training for billion-scale graphs. In 2020 IEEE/ACM 10th Workshop on Irregular Applications: Architectures and Algorithms (IA3’20). IEEE, 36–44.
[50]
Weijian Zheng, Dali Wang, and Fengguang Song. 2020. OpenGraphGym: A parallel reinforcement learning framework for graph optimization problems. In International Conference on Computational Science. Springer.

Published In

ACM Transactions on Parallel Computing, Volume 10, Issue 2
June 2023
285 pages
ISSN:2329-4949
EISSN:2329-4957
DOI:10.1145/3604626

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 June 2023
Online AM: 23 March 2023
Accepted: 22 March 2023
Revised: 09 January 2023
Received: 13 September 2022
Published in TOPC Volume 10, Issue 2

Author Tags

  1. Parallel machine learning system
  2. high performance computing

Qualifiers

  • Research-article
