Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

DynaHB: A Communication-Avoiding Asynchronous Distributed Framework with Hybrid Batches for Dynamic GNN Training

Published: 01 July 2024 Publication History

Abstract

Dynamic Graph Neural Networks (DGNNs) have demonstrated exceptional performance at dynamic-graph analysis tasks. However, the costs exceed those incurred by other learning tasks, to the point where deployment on large-scale dynamic graphs is infeasible. Existing distributed frameworks that facilitate DGNN training are in their early stages and experience challenges such as communication bottlenecks, imbalanced workloads, and GPU memory overflow.
We introduce DynaHB, a distributed framework for DGNN training using so-called Hybrid Batches. DynaHB reduces communication by means of vertex caching, and it ensures even data and workload distribution by means of load-aware vertex partitioning. DyanHB also features a novel hybrid-batch training mode that combines vertex-batch and snapshot-batch techniques, thereby reducing training time and GPU memory usage. Next, to further enhance the hybrid batch based approach, DynaHB integrates a reinforcement learning-based batch adjuster and a pipelined batch generator with a batch reservoir to reduce the cost of generating hybrid batches. Extensive experiments show that DynaHB is capable of up to a 93× and an average of 8.06× speedups over the state-of-the-art training framework.

References

[1]
Stefan Arnborg, Derek G Corneil, and Andrzej Proskurowski. 1987. Complexity of finding embeddings in ak-tree. SIAM Journal on Algebraic Discrete Methods 8, 2 (1987), 277--284.
[2]
Anne Berry, Pinar Heggernes, and Genevieve Simonet. 2003. The minimum degree heuristic and the minimal triangulation process. In International Workshop on Graph-Theoretic Concepts in Computer Science. Springer, 58--70.
[3]
Hans L Bodlaender et al. 1992. A tourist guide through treewidth. (1992).
[4]
Lijun Chang, Jeffrey Xu Yu, Lu Qin, Hong Cheng, and Miao Qiao. 2012. The exact distance to destination in undirected world. The VLDB Journal 21 (2012), 869--888.
[5]
Zitong Chen, Ada Wai-Chee Fu, Minhao Jiang, Eric Lo, and Pengfei Zhang. 2021. P2H: efficient distance querying on road networks by projected vertex separators. In Proceedings of the 2021 International Conference on Management of Data. 313--325.
[6]
Edith Cohen, Eran Halperin, Haim Kaplan, and Uri Zwick. 2003. Reachability and distance queries via 2-hop labels. SIAM J. Comput. 32, 5 (2003), 1338--1355.
[7]
Ugur Demiryurek, Farnoush Banaei-Kashani, and Cyrus Shahabi. 2010. Efficient k-nearest neighbor search in time-dependent spatial networks. In International Conference on Database and Expert Systems Applications. Springer, 432--449.
[8]
Ugur Demiryurek, Farnoush Banaei-Kashani, and Cyrus Shahabi. 2010. Towards k-nearest neighbor search in time-dependent spatial network databases. In International workshop on databases in networked information systems. Springer, 296--310.
[9]
Edsger W Dijkstra. 2022. A note on two problems in connexion with graphs. In Edsger Wybe Dijkstra: His Life, Work, and Legacy. 287--290.
[10]
Alexandros Efentakis. 2016. Scalable Public Transportation Queries on the Database. In EDBT. 527--538.
[11]
Qingshuai Feng, You Peng, Wenjie Zhang, Ying Zhang, and Xuemin Lin. 2022. Towards real-time counting shortest cycles on dynamic graphs: A hub labeling approach. In 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 512--524.
[12]
Cyril Gavoille, David Peleg, Stéphane Pérennes, and Ran Raz. 2004. Distance labeling in graphs. Journal of algorithms 53, 1 (2004), 85--112.
[13]
Dan He, Sibo Wang, Xiaofang Zhou, and Reynold Cheng. 2019. An efficient framework for correctness-aware kNN queries on road networks. In 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 1298--1309.
[14]
Wuwei Huang, Genan Dai, Youming Ge, and Yubao Liu. 2019. Top-k nearest keyword search in public transportation networks. In 2019 15th International Conference on Semantics, Knowledge and Grids (SKG). IEEE, 67--74.
[15]
Yuka Komai, Duong Hong Nguyen, Takahiro Hara, and Shojiro Nishio. 2014. KNN search utilizing index of the minimum road travel time in time-dependent road networks. In 2014 IEEE 33rd International Symposium on Reliable Distributed Systems Workshops. IEEE, 131--137.
[16]
Ken CK Lee, Wang-Chien Lee, Baihua Zheng, and Yuan Tian. 2010. ROAD: A new spatial object search framework for road networks. IEEE transactions on knowledge and data engineering 24, 3 (2010), 547--560.
[17]
Jiajia Li, Cancan Ni, Dan He, Lei Li, Xiufeng Xia, and Xiaofang Zhou. 2023. Efficient k NN query for moving objects on time-dependent road networks. The VLDB Journal 32, 3 (2023), 575--594.
[18]
Jiajia Li, Lingyun Zhang, Cancan Ni, Yunzhe An, Chuanyu Zong, and Anzhen Zhang. 2021. Efficient k Nearest Neighbor Query Processing on Public Transportation Network. In 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). IEEE, 1108--1115.
[19]
Lei Li, Sibo Wang, and Xiaofang Zhou. 2020. Fastest path query answering using time-dependent hop-labeling in road network. IEEE Transactions on Knowledge and Data Engineering 34, 1 (2020), 300--313.
[20]
Wentao Li, Miao Qiao, Lu Qin, Ying Zhang, Lijun Chang, and Xuemin Lin. 2020. Scaling up distance labeling on graphs with core-periphery properties. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 1367--1381.
[21]
Siqiang Luo, Ben Kao, Guoliang Li, Jiafeng Hu, Reynold Cheng, and Yudian Zheng. 2018. Toain: a throughput optimizing adaptive index for answering dynamic k nn queries on road networks. Proceedings of the VLDB Endowment 11, 5 (2018), 594--606.
[22]
Dian Ouyang, Lu Qin, Lijun Chang, Xuemin Lin, Ying Zhang, and Qing Zhu. 2018. When hierarchy meets 2-hop-labeling: Efficient shortest distance queries on road networks. In Proceedings of the 2018 International Conference on Management of Data. 709--724.
[23]
Dian Ouyang, Dong Wen, Lu Qin, Lijun Chang, Ying Zhang, and Xuemin Lin. 2020. Progressive top-k nearest neighbors search in large road networks. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 1781--1795.
[24]
Dimitris Papadias, Jun Zhang, Nikos Mamoulis, and Yufei Tao. 2003. Query processing in spatial network databases. In Proceedings 2003 VLDB Conference. Elsevier, 802--813.
[25]
Neil Robertson and Paul D Seymour. 1984. Graph minors. III. Planar tree-width. Journal of Combinatorial Theory, Series B 36, 1 (1984), 49--64.
[26]
Hanan Samet, Jagan Sankaranarayanan, and Houman Alborzi. 2008. Scalable network distance browsing in spatial databases. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data. 43--54.
[27]
David Tedjopurnomo, Zhifeng Bao, Farhana Choudhury, Hui Luo, and AK Qin. 2022. Equitable Public Bus Network Optimization for Social Good: A Case Study of Singapore. In 2022 ACM Conference on Fairness, Accountability, and Transparency. 278--288.
[28]
Sibo Wang, Wenqing Lin, Yi Yang, Xiaokui Xiao, and Shuigeng Zhou. 2015. Efficient route planning on public transportation networks: A labelling approach. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. 967--982.
[29]
Sheng Wang, Yuan Sun, Christopher Musco, and Zhifeng Bao. 2021. Public transport planning: When transit network connectivity meets commuting demand. In Proceedings of the 2021 International Conference on Management of Data. 1906--1919.
[30]
Fang Wei. 2010. TEDI: efficient shortest path query answering on graphs. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. 99--110.
[31]
Fang Wei-Kleiner. 2016. Tree decomposition-based indexing for efficient shortest path and nearest neighbors query answering on graphs. J. Comput. System Sci. 82, 1 (2016), 23--44.
[32]
Jinbo Xu, Feng Jiao, and Bonnie Berger. 2005. A tree-decomposition approach to protein structure prediction. In 2005 IEEE Computational Systems Bioinformatics Conference (CSB'05). IEEE, 247--256.
[33]
Yajun Yang, Hanxiao Li, Junhu Wang, Qinghua Hu, Xin Wang, Muxi Leng, et al. 2019. A novel index method for k nearest object query over time-dependent road networks. Complexity 2019 (2019).
[34]
Junhua Zhang, Wentao Li, Long Yuan, Lu Qin, Ying Zhang, and Lijun Chang. 2022. Shortest-path queries on complex networks: experiments, analyses, and improvement. Proceedings of the VLDB Endowment 15, 11 (2022), 2640--2652.
[35]
Junhua Zhang, Long Yuan, Wentao Li, Lu Qin, Ying Zhang, and Wenjie Zhang. 2024. Label-constrained shortest path query processing on road networks. The VLDB Journal 33, 3 (2024), 569--593.
[36]
Ruicheng Zhong, Guoliang Li, Kian-Lee Tan, Lizhu Zhou, and Zhiguo Gong. 2015. G-tree: An efficient and scalable index for spatial search on road networks. IEEE Transactions on Knowledge and Data Engineering 27, 8 (2015), 2175--2189.

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 17, Issue 11
July 2024
1039 pages
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 July 2024
Published in PVLDB Volume 17, Issue 11

Check for updates

Badges

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 99
    Total Downloads
  • Downloads (Last 12 months)99
  • Downloads (Last 6 weeks)26
Reflects downloads up to 27 Jan 2025

Other Metrics

Citations

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media