DOI: 10.1145/1065944.1065950
Article

An evaluation of global address space languages: Co-array Fortran and Unified Parallel C

Published: 15 June 2005

    Abstract

    Co-array Fortran (CAF) and Unified Parallel C (UPC) are two emerging languages for single-program, multiple-data global address space programming. These languages boost programmer productivity by providing shared variables for inter-process communication instead of message passing. However, the performance of these emerging languages still has room for improvement. In this paper, we study the performance of variants of the NAS MG, CG, SP, and BT benchmarks on several modern architectures to identify challenges that must be met to deliver top performance. We compare CAF and UPC variants of these programs with the original Fortran+MPI code. Today, CAF and UPC programs deliver scalable performance on clusters only when written to use bulk communication. However, our experiments uncovered some significant performance bottlenecks of UPC codes on all platforms. We account for the root causes limiting UPC performance such as the synchronization model, the communication efficiency of strided data, and source-to-source translation issues. We show that they can be remedied with language extensions, new synchronization constructs, and, finally, adequate optimizations by the back-end C compilers.
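
    The abstract's point that cluster performance depends on bulk communication can be illustrated with the standard latency-bandwidth cost model, in which every message pays a fixed startup cost plus a per-byte cost. The sketch below is not taken from the paper: the function name `transfer_time` and all parameter values are illustrative assumptions, chosen only to show why sending many small messages is far more expensive than sending one large one.

```python
def transfer_time(num_messages, bytes_per_message, latency_s, seconds_per_byte):
    """Latency-bandwidth cost: each message pays a fixed latency plus a per-byte cost."""
    return num_messages * (latency_s + bytes_per_message * seconds_per_byte)


# Illustrative values only (assumptions, not measurements from the paper).
LATENCY_S = 5e-6          # per-message startup cost
SECONDS_PER_BYTE = 1e-9   # inverse bandwidth, roughly 1 GB/s
N_ELEMENTS = 10_000       # number of array elements to move
ELEMENT_BYTES = 8         # one double-precision value

# Fine-grained: one message per element, as with element-wise shared accesses.
fine_grained = transfer_time(N_ELEMENTS, ELEMENT_BYTES, LATENCY_S, SECONDS_PER_BYTE)

# Bulk: a single message carrying the whole contiguous block.
bulk = transfer_time(1, N_ELEMENTS * ELEMENT_BYTES, LATENCY_S, SECONDS_PER_BYTE)

print(f"fine-grained: {fine_grained:.6f} s")
print(f"bulk:         {bulk:.6f} s")
print(f"ratio:        {fine_grained / bulk:.1f}x")
```

    Under these assumed parameters the per-message latency dominates the fine-grained case, which is consistent with the abstract's observation that CAF and UPC codes scale on clusters only when written to use bulk communication. The model ignores overlap, contention, and strided-data packing costs.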


    Published In

    PPoPP '05: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
    June 2005
    310 pages
    ISBN:1595930809
    DOI:10.1145/1065944
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CAF
    2. UPC
    3. co-array fortran
    4. compilers
    5. global address space languages
    6. parallel languages
    7. performance
    8. scalability
    9. unified parallel C

    Qualifiers

    • Article

    Conference

    PPoPP05

    Acceptance Rates

    Overall Acceptance Rate 230 of 1,014 submissions, 23%

    Article Metrics

    • Downloads (Last 12 months): 11
    • Downloads (Last 6 weeks): 3

    Cited By

    • (2024) Parallel SnowModel (v1.0): a parallel implementation of a distributed snow-evolution modeling system (SnowModel). Geoscientific Model Development 17:10 (4135-4154). DOI: 10.5194/gmd-17-4135-2024. Online publication date: 22-May-2024
    • (2024) TrackFM: Far-out Compiler Support for a Far Memory World. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1 (401-419). DOI: 10.1145/3617232.3624856. Online publication date: 27-Apr-2024
    • (2022) An OpenMP Runtime for Transparent Work Sharing across Cache-Incoherent Heterogeneous Nodes. ACM Transactions on Computer Systems 39:1-4 (1-30). DOI: 10.1145/3505224. Online publication date: 5-Jul-2022
    • (2021) Fortran Coarray Implementation of Semi-Lagrangian Convected Air Particles within an Atmospheric Model. ChemEngineering 5:2 (21). DOI: 10.3390/chemengineering5020021. Online publication date: 6-May-2021
    • (2021) Clamor. Proceedings of the ACM Symposium on Cloud Computing (654-669). DOI: 10.1145/3472883.3486996. Online publication date: 1-Nov-2021
    • (2021) Sharing non‐cache‐coherent memory with bounded incoherence. Concurrency and Computation: Practice and Experience 34:2. DOI: 10.1002/cpe.6414. Online publication date: Jun-2021
    • (2020) Code generation approaches for parallel geometric multigrid solvers. Analele Universitatii "Ovidius" Constanta - Seria Matematica 28:3 (123-152). DOI: 10.2478/auom-2020-0038. Online publication date: 28-Dec-2020
    • (2020) An OpenMP Runtime for Transparent Work Sharing Across Cache-Incoherent Heterogeneous Nodes. Proceedings of the 21st International Middleware Conference (415-429). DOI: 10.1145/3423211.3425679. Online publication date: 7-Dec-2020
    • (2020) Bounded incoherence. Proceedings of the Eleventh International Workshop on Programming Models and Applications for Multicores and Manycores (1-10). DOI: 10.1145/3380536.3380541. Online publication date: 22-Feb-2020
    • (2020) A Current Task-Based Programming Paradigms Analysis. Computational Science – ICCS 2020 (203-216). DOI: 10.1007/978-3-030-50426-7_16. Online publication date: 15-Jun-2020
