research-article

An evaluation of BitTorrent's performance in HPC environments

Authors:

Matthew G. F. Dosanjh,

Patrick G. Bridges,

Suzanne M. Kelly,

James H. Laros, III, and

Courtenay T. Vaughan

ROSS '14: Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers

June 2014

Article No.: 8, Pages 1 - 8

https://doi.org/10.1145/2612262.2612269

Published: 10 June 2014

Abstract

A number of novel decentralized systems have recently been developed to address challenges of scale in large distributed systems. The suitability of such systems for meeting the challenges of scale in high performance computing (HPC) systems is unclear, however. In this paper, we begin to answer this question by examining the suitability of the popular BitTorrent protocol to handle dynamic shared library distribution in HPC systems. To that end, we describe the architecture and implementation of a system that uses BitTorrent to distribute shared libraries in HPC systems, evaluate and optimize BitTorrent protocol usage for the HPC environment, and measure the performance of the resulting system. Our results demonstrate the potential viability of BitTorrent-style protocols in HPC systems, but also highlight the challenges of these protocols. In particular, our results show that the protocol mechanisms meant to enforce fairness in a distributed computing environment can have a significant impact on system performance if not properly taken into account in system design and implementation.

Cited By

Sly-Delgado B, Phung T, Thomas C, Simonetti D, Hennessee A, Tovar B, Thain D (2023). TaskVine: Managing In-Cluster Storage for High-Throughput Data Intensive Workflows. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pp. 1978-1988. DOI: 10.1145/3624062.3624277. Online publication date: 12-Nov-2023
https://dl.acm.org/doi/10.1145/3624062.3624277

Published In

ROSS '14: Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers

June 2014

76 pages

ISBN:9781450329507

DOI:10.1145/2612262

Conference Chairs:
Kamil Iskra, Argonne National Laboratory
Torsten Hoefler, ETH Zurich, Switzerland

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Sponsors

SPCL: Scalable Parallel Computing Laboratory

In-Cooperation

SIGHPC: ACM Special Interest Group on High Performance Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Nuclear Security Administration

Conference

ROSS '14

Sponsor:

SPCL

ROSS '14: Runtime and Operating Systems for Supercomputers

June 10, 2014

Munich, Germany

Acceptance Rates

ROSS '14 Paper Acceptance Rate 9 of 16 submissions, 56%;

Overall Acceptance Rate 58 of 169 submissions, 34%
