DOI: 10.1145/1519138.1519139

Adding the easy button to the cloud with SnowFlock and MPI

Published: 31 March 2009

Abstract

Cloud computing promises to provide researchers with the ability to perform parallel computations using large pools of virtual machines (VMs), without facing the burden of owning or maintaining physical infrastructure. However, easy access to hundreds of VMs also brings an increased management burden. Cloud users today must manually instantiate, configure, and maintain the virtual hosts in their cluster. They must learn new cloud APIs that are not germane to the problem of parallel processing. Those APIs usually take several minutes to perform their VM-management tasks, forcing users to keep VMs idling and pay for unused processing time rather than shut VMs down and power them on as needed. Furthermore, users must still configure their cluster management framework to launch their parallel jobs.
In this paper we show that all this management pain is unnecessary. We show how to combine a cloud API -- SnowFlock -- and a parallel processing framework -- MPI -- to truly realize the potential of the cloud. SnowFlock allows users to fork VMs as if they were processes, occupying multiple physical hosts in sub-second time. We exploit the synergy between this paradigm and MPI's job management to completely hide all details of cloud management from the user. By maintaining a single VM and starting unmodified applications with familiar MPI commands, a user can instantaneously leverage hundreds of processors to perform a parallel computation. Besides making the use of cloud resources trivial, we also eliminate the cost of idling -- VMs exist only for as long as they are involved in computation.

Cited By

  • (2024) MCMPI: A library with elasticity for multi‐domain and public cloud environments. Concurrency and Computation: Practice and Experience 36:18. DOI: 10.1002/cpe.8149. Online publication date: 14-May-2024
  • (2018) Mvmotion. Cluster Computing 17:2 (441-452). DOI: 10.1007/s10586-013-0245-z. Online publication date: 24-Dec-2018
  • (2009) Optimizing Live Migration of Virtual Machines in SMP Clusters for HPC Applications. Proceedings of the 2009 Sixth IFIP International Conference on Network and Parallel Computing (51-58). DOI: 10.1109/NPC.2009.32. Online publication date: 19-Oct-2009

Published In

HPCVirt '09: Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing
March 2009
42 pages
ISBN: 9781605584652
DOI: 10.1145/1519138
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

EuroSys '09: Fourth EuroSys Conference 2009
March 31, 2009
Nuremberg, Germany
