DOI: 10.1145/781498.781526

The design and implementation of a parallel array operator for the arbitrary remapping of data

Published: 11 June 2003

    Abstract

    Gather and scatter are data redistribution functions of long-standing importance to high-performance computing. In this paper, we present a highly general array operator with powerful gather and scatter capabilities unmatched by other array languages. We discuss an efficient parallel implementation, introducing three new optimizations (schedule compression, dead array reuse, and direct communication) that reduce the costs associated with the operator's wide applicability. In our implementation of this operator in ZPL, we demonstrate performance comparable to the hand-coded Fortran + MPI versions of the NAS FT and CG benchmarks.
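
    For readers unfamiliar with the terms, the following is a minimal sequential sketch of what gather and scatter mean for one-dimensional arrays. It is written in Python rather than ZPL, and it does not reproduce the paper's operator or its parallel implementation. The function names `gather` and `scatter`, the `fill` parameter, and the "last write wins" rule for colliding indices are assumptions made for illustration only.

```python
def gather(src, idx):
    """Gather: dst[i] = src[idx[i]] (reads are indirect)."""
    return [src[j] for j in idx]


def scatter(src, idx, size, fill=None):
    """Scatter: dst[idx[i]] = src[i] (writes are indirect).

    Assumption for this sketch: if two entries of idx collide,
    the later write wins. Unwritten slots keep the fill value.
    """
    dst = [fill] * size
    for i, j in enumerate(idx):
        dst[j] = src[i]
    return dst


data = [10, 20, 30, 40]
perm = [2, 0, 3, 1]  # a permutation of the indices 0..3

gathered = gather(data, perm)
restored = scatter(gathered, perm, len(data))

print(gathered)  # [30, 10, 40, 20]
print(restored)  # [10, 20, 30, 40]
```

    When the index array is a permutation, scattering with it undoes a gather with the same indices, which is why the second printed list matches the original data.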

    Published In

    PPoPP '03: Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
    June 2003
    250 pages
    ISBN:1581135882
    DOI:10.1145/781498
    ACM SIGPLAN Notices, Volume 38, Issue 10
      Proceedings of the ACM SIGPLAN symposium on principles and practice of parallel programming (PPoPP 2003) and workshop on partial evaluation and semantics-based program manipulation (PEPM 2003)
      October 2003
      331 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/966049

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. ZPL
    2. array languages
    3. gather
    4. parallel programming
    5. scatter

    Qualifiers

    • Article

    Conference

    PPoPP03
    Acceptance Rates

    PPoPP '03 paper acceptance rate: 20 of 45 submissions (44%)
    Overall acceptance rate: 230 of 1,014 submissions (23%)

    Cited By

    • (2019) A constraint-based approach to automatic data partitioning for distributed memory execution. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-24. DOI: 10.1145/3295500.3356199. Online publication date: 17-Nov-2019
    • (2017) Control replication. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.1145/3126908.3126949. Online publication date: 12-Nov-2017
    • (2014) MSL. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 311-322. DOI: 10.1109/SC.2014.31. Online publication date: 16-Nov-2014
    • (2013) Algebraic program semantics for supercomputing. Theories of Programming and Formal Methods, pp. 118-135. DOI: 10.5555/2554641.2554649. Online publication date: 1-Jan-2013
    • (2009) Tile Reduction. Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism, pp. 140-153. DOI: 10.1007/978-3-642-02303-3_12. Online publication date: 22-May-2009
    • (2007) Executing irregular scientific applications on stream architectures. Proceedings of the 21st annual international conference on Supercomputing, pp. 93-104. DOI: 10.1145/1274971.1274987. Online publication date: 17-Jun-2007
    • (2007) The design and development of ZPL. Proceedings of the third ACM SIGPLAN conference on History of programming languages, pp. 8-1 to 8-37. DOI: 10.1145/1238844.1238852. Online publication date: 9-Jun-2007
    • (2006) Runtime address space computation for SDSM systems. Proceedings of the 19th international conference on Languages and compilers for parallel computing, pp. 330-344. DOI: 10.5555/1757112.1757145. Online publication date: 2-Nov-2006
    • (2006) Analysis of two-level data mapping in an HPF compiler for distributed-memory machines. Parallel Computing, 32:4, pp. 280-300. DOI: 10.1016/j.parco.2005.11.003. Online publication date: 1-Apr-2006
    • (2005) Parallelization of the NAS Conjugate Gradient Benchmark Using the Global Arrays Shared Memory Programming Model. Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 4 - Volume 05. DOI: 10.1109/IPDPS.2005.331. Online publication date: 4-Apr-2005
