DOI: 10.1145/263764.263794

Performance implications of communication mechanisms in all-software global address space systems

Published: 21 June 1997

    Abstract

    Global addressing of shared data simplifies parallel programming and complements message passing models commonly found in distributed memory machines. A number of programming systems have been designed that synthesize global addressing purely in software on such machines. These systems provide a number of communication mechanisms to mitigate the effect of high communication latencies and overheads. This study compares the mechanisms in two representative all-software systems: CRL and Split-C. CRL uses region-based caching while Split-C uses split-phase and push-based data transfers for optimizing communication performance. Both systems take advantage of bulk data transfers.

    By implementing a set of parallel applications in both CRL and Split-C, and running them on the IBM SP2, Meiko CS-2 and two simulated architectures, we find that split-phase and push-based bulk data transfers are essential for good performance. Region-based caching benefits applications with irregular structure and with sufficient temporal locality, especially under high communication latencies. However, caching also hurts performance when there is insufficient data reuse or when the size of caching granularity is mismatched with the communication granularity. We find the programming complexity of the communication mechanisms in both languages to be comparable. Based on our results, we recommend that an ideal system intended to support diverse applications on parallel platforms should incorporate the communication mechanisms in CRL and Split-C.
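
The trade-off described in the abstract can be illustrated with a toy cost model. This is a minimal sketch and not the paper's methodology: the function names, the cost formulas, and the parameter values (`latency`, `overhead`, `protocol_overhead`) are all assumptions chosen for illustration, and the model ignores bulk transfers, contention, and coherence traffic.

```python
# Toy cost model (hypothetical; not taken from the paper).
# All times are in arbitrary units.

def blocking_cost(n_accesses, latency, overhead):
    """Every remote access pays send overhead and waits a full round trip."""
    return n_accesses * (overhead + latency)


def split_phase_cost(n_accesses, latency, overhead):
    """Requests are issued back to back; one final sync waits for the
    last reply, so the latencies of earlier requests are overlapped."""
    return n_accesses * overhead + latency


def cached_cost(n_accesses, n_distinct, latency, overhead, protocol_overhead):
    """Each distinct region is fetched once with a blocking access;
    every access, hit or miss, pays a software protocol check."""
    return n_distinct * (overhead + latency) + n_accesses * protocol_overhead


if __name__ == "__main__":
    latency, overhead, protocol_overhead = 50, 5, 2  # assumed values
    n = 1000
    print("blocking:           ", blocking_cost(n, latency, overhead))
    print("split-phase:        ", split_phase_cost(n, latency, overhead))
    print("cached, high reuse: ", cached_cost(n, 10, latency, overhead, protocol_overhead))
    print("cached, no reuse:   ", cached_cost(n, n, latency, overhead, protocol_overhead))
```

Under these assumed parameters, caching wins when the same few regions are reused many times and loses to plain blocking access when every access touches a new region, since the protocol check is paid without any saved round trips.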


    Cited By

    • (1998) Building parallel runtime systems with Active Messages. Proceedings International Symposium on Software Engineering for Parallel and Distributed Systems, pages 83-93. DOI: 10.1109/PDSE.1998.668161. Online publication date: 1998
    • (1997) Evaluating the performance limitations of MPMD communication. Proceedings of the 1997 ACM/IEEE conference on Supercomputing, pages 1-10. DOI: 10.1145/509593.509604. Online publication date: 15-Nov-1997


    Published In

    PPOPP '97: Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
    June 1997
    287 pages
    ISBN:0897919068
    DOI:10.1145/263764
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Conference

    PPoPP97
    PPoPP97: Principles & Practices of Parallel Programming
    June 18 - 21, 1997
    Las Vegas, Nevada, USA

    Acceptance Rates

    PPOPP '97 Paper Acceptance Rate 26 of 86 submissions, 30%;
    Overall Acceptance Rate 230 of 1,014 submissions, 23%

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 50
    • Downloads (Last 6 weeks): 12

