DOI: 10.1145/2287076.2287116
research-article

VNET/P: bridging the cloud and high performance computing through fast overlay networking

Published: 18 June 2012

Abstract

It is now possible to allow VMs hosting HPC applications to seamlessly bridge distributed cloud resources and tightly-coupled supercomputing and cluster resources. However, to achieve the application performance that the tightly-coupled resources are capable of, it is important that the overlay network not introduce significant overhead relative to the native hardware, which is not the case for current user-level tools, including our own existing VNET/U system. In response, we describe the design, implementation, and evaluation of a layer 2 virtual networking system that has negligible latency and bandwidth overheads in 1–10 Gbps networks. Our system, VNET/P, is directly embedded into our publicly available Palacios virtual machine monitor (VMM). VNET/P achieves native performance on 1 Gbps Ethernet networks and very high performance on 10 Gbps Ethernet networks and InfiniBand. The NAS benchmarks generally achieve over 95% of their native performance on both 1 and 10 Gbps. These results suggest it is feasible to extend a software-based overlay network designed for computing at wide-area scales into tightly-coupled environments.

    Published In

    HPDC '12: Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
    June 2012
    308 pages
    ISBN:9781450308052
    DOI:10.1145/2287076
    General Chair: Dick Epema
    Program Chairs: Thilo Kielmann, Matei Ripeanu
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. hpc
    2. overlay networks
    3. scalability
    4. virtualization

    Qualifiers

    • Research-article

    Conference

    HPDC'12

    Acceptance Rates

    HPDC '12 Paper Acceptance Rate 23 of 143 submissions, 16%;
    Overall Acceptance Rate 166 of 966 submissions, 17%

    Cited By

    • (2020) On the Design and Implementation of IP-over-P2P Overlay Virtual Private Networks. IEICE Transactions on Communications E103.B:1 (2-10). DOI: 10.1587/transcom.2019CPI0001. Online publication date: 1-Jan-2020
    • (2019) Harnessing Data Movement in Virtual Clusters for In-Situ Execution. IEEE Transactions on Parallel and Distributed Systems 30:3 (615-629). DOI: 10.1109/TPDS.2018.2867879. Online publication date: 1-Mar-2019
    • (2016) Resource Scheduling for Energy-Aware Reconfigurable Internet Data Centers. Innovative Research and Applications in Next-Generation High Performance Computing (21-46). DOI: 10.4018/978-1-5225-0287-6.ch002. Online publication date: 2016
    • (2016) Energy-Saving QoS Resource Management of Virtualized Networked Data Centers for Big Data Stream Computing. Big Data (848-886). DOI: 10.4018/978-1-4666-9840-6.ch040. Online publication date: 2016
    • (2016) Self-configuring Software-defined Overlay Bypass for Seamless Inter- and Intra-cloud Virtual Networking. Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing (153-164). DOI: 10.1145/2907294.2907318. Online publication date: 31-May-2016
    • (2016) Wireless network virtualization for enhancing security: Status, challenges and perspectives. SoutheastCon 2016 (1-8). DOI: 10.1109/SECON.2016.7506769. Online publication date: Mar-2016
    • (2015) Energy-Saving QoS Resource Management of Virtualized Networked Data Centers for Big Data Stream Computing. Emerging Research in Cloud Distributed Computing Systems (122-155). DOI: 10.4018/978-1-4666-8213-9.ch004. Online publication date: 2015
    • (2014) llamaOS. Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops (1140-1149). DOI: 10.1109/IPDPSW.2014.129. Online publication date: 19-May-2014
    • (2014) Energy-efficient adaptive networked datacenters for the QoS support of real-time applications. The Journal of Supercomputing 71:2 (448-478). DOI: 10.1007/s11227-014-1305-8. Online publication date: 17-Oct-2014
    • (2014) Cloud Networking to Support Data Intensive Applications. Cloud Computing for Data-Intensive Applications (61-81). DOI: 10.1007/978-1-4939-1905-5_3. Online publication date: 15-Nov-2014