DOI: 10.1145/3577193.3593722

Use Only What You Need: Judicious Parallelism For File Transfers in High Performance Networks

Published: 21 June 2023

Abstract

Parallelism is key to efficiently utilizing high-speed research networks when transferring large volumes of data. However, the monolithic design of existing transfer applications requires the same level of parallelism to be used for read, write, and network operations. This overburdens system resources, since setting the parallelism level to suit the slowest component results in unnecessarily high parallelism for the other components. Using more parallelism than necessary leads to increased overhead on system resources and unfair resource allocation among competing transfers. In this paper, we introduce a modular file transfer architecture, Marlin, that separates I/O and network operations for file transfers so that parallelism can be adjusted independently for each component. Marlin adopts an online gradient descent algorithm to swiftly search the solution space and find the optimal level of parallelism for read, transfer, and write operations. Experimental results collected under various network settings show that Marlin can identify and use the minimum parallelism level for each component, improving CPU utilization and fairness among competing transfers. Finally, separating network transfers from write operations allows Marlin to outperform state-of-the-art solutions by more than 2x when transferring small datasets.
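
The idea of tuning each component independently can be illustrated with a minimal Python sketch. It is not Marlin's implementation: the abstract does not give the utility function, step-size rule, or probing scheme, so everything below is an assumption made for illustration. The `Stage` model (linear scaling up to a capacity), the `utility` function (throughput discounted by a per-thread `penalty`), the forward-difference gradient estimate, and all numeric rates are hypothetical. Each stage runs gradient ascent on its own utility, which is equivalent to gradient descent on the negated utility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """Hypothetical model of one transfer stage (read, network, or write)."""
    name: str
    per_thread_gbps: float  # assumed rate a single thread sustains
    capacity_gbps: float    # assumed ceiling for the whole stage

    def throughput(self, threads: int) -> float:
        # Assumed behaviour: linear scaling until the stage saturates.
        return min(threads * self.per_thread_gbps, self.capacity_gbps)


def utility(stage: Stage, threads: int, penalty: float = 1.02) -> float:
    """Assumed utility: throughput discounted by a cost per extra thread."""
    return stage.throughput(threads) / penalty ** threads


def tune(stage: Stage, max_threads: int = 64, step: float = 1.0,
         rounds: int = 50) -> int:
    """Gradient ascent on utility (descent on its negative) for one stage."""
    x = 1.0  # continuous iterate; rounded to a thread count for each probe
    best_n, best_u = 1, utility(stage, 1)
    for _ in range(rounds):
        n = min(max(round(x), 1), max_threads - 1)
        u_here, u_next = utility(stage, n), utility(stage, n + 1)
        # Remember the best setting among everything probed so far.
        for cand, u in ((n, u_here), (n + 1, u_next)):
            if u > best_u:
                best_n, best_u = cand, u
        grad = u_next - u_here  # forward-difference gradient estimate
        x = min(max(x + step * grad, 1.0), float(max_threads))
    return best_n


# Hypothetical stage parameters, chosen only to make the contrast visible.
stages = [
    Stage("read", per_thread_gbps=2.0, capacity_gbps=10.0),
    Stage("network", per_thread_gbps=5.0, capacity_gbps=10.0),
    Stage("write", per_thread_gbps=1.0, capacity_gbps=10.0),
]
chosen = {s.name: tune(s) for s in stages}
modular_total = sum(chosen.values())
# A monolithic design must give every stage the level the slowest one needs.
monolithic_total = len(stages) * max(chosen.values())
print(chosen, modular_total, monolithic_total)
```

Under these assumed rates, write is the slowest stage and needs the most threads, while read and network saturate with fewer. Tuning the stages separately therefore uses fewer threads in total than applying the write stage's level to all three, which is the overhead the abstract attributes to monolithic designs.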

Cited By

  • (2023) Demystifying the Performance of Data Transfers in High-Performance Research Networks. 2023 IEEE 19th International Conference on e-Science (e-Science), 1-11. DOI: 10.1109/e-Science58273.2023.10254940. Online publication date: 9-Oct-2023.

Published In

ICS '23: Proceedings of the 37th ACM International Conference on Supercomputing
June 2023
505 pages
ISBN: 9798400700569
DOI: 10.1145/3577193
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. wide-area file transfers
  2. i/o parallelism
  3. online optimization
  4. high performance networks

Qualifiers

  • Research-article

Funding Sources

  • NSF

Conference

ICS '23
Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 92
  • Downloads (Last 6 weeks): 13
Reflects downloads up to 13 Jan 2025
