DOI: 10.5555/647977.743232
Article

A Case Study: Effects of WITH-Loop-Folding on the NAS Benchmark MG in SAC

Published: 09 September 1998

    Abstract

    SAC is a functional C variant with efficient support for high-level array operations. This paper investigates the applicability of a SAC-specific optimization technique called WITH-loop-folding to real-world applications. The example chosen is the NAS benchmark MG, which originates from the Numerical Aerodynamic Simulation (NAS) Program developed at NASA Ames Research Center. It comprises a kernel from the NAS Program that implements 3-dimensional multigrid relaxation.
    Several run-time measurements demonstrate two different benefits of WITH-loop-folding. First, an overall speed-up of about 20% can be observed. Second, a hand-optimized specification and APL-like specifications yield identical run-times, whereas a naive compilation that does not apply WITH-loop-folding leads to slowdowns of more than an order of magnitude. Furthermore, WITH-loop-folding makes feasible a slight variation of the algorithm that substantially simplifies the program specification and requires less memory during execution.
    Finally, the optimized run-times are compared against those of the original Fortran program. For different problem sizes, the code generated from the SAC program not only matches the execution times of the code generated from the Fortran program but outperforms them by about 10%.
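
    The contrast between an APL-like specification and a hand-optimized one can be illustrated with a rough sketch. The Python below is only a stand-in: it is not SAC syntax and not the compiler's actual transformation, and the names shift, add, scale, smooth_composed and smooth_fused are hypothetical. It shows a one-dimensional three-point average written first as a composition of whole-array operations, each of which allocates a temporary array, and then as the single fused traversal that a folding optimization would aim to produce.

```python
def shift(xs, k):
    """Rotate xs cyclically by k positions; allocates a new list."""
    n = len(xs)
    return [xs[(i + k) % n] for i in range(n)]


def add(xs, ys):
    """Element-wise sum; allocates a new list."""
    return [x + y for x, y in zip(xs, ys)]


def scale(xs, c):
    """Multiply every element by c; allocates a new list."""
    return [c * x for x in xs]


def smooth_composed(xs):
    # APL-like style: five whole-array operations, five temporaries.
    left = shift(xs, -1)
    right = shift(xs, 1)
    return scale(add(add(left, xs), right), 1.0 / 3.0)


def smooth_fused(xs):
    # Hand-optimized style: one traversal, no intermediate arrays.
    n = len(xs)
    c = 1.0 / 3.0
    return [c * (xs[(i - 1) % n] + xs[i] + xs[(i + 1) % n]) for i in range(n)]


if __name__ == "__main__":
    data = [3.0, 6.0, 9.0, 12.0]
    print(smooth_composed(data) == smooth_fused(data))
```

    Both functions compute the same values in the same order of operations; the fused form simply avoids materializing the intermediate arrays, which is the kind of saving the abstract attributes to WITH-loop-folding.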

    Published In

    IFL '98: Selected Papers from the 10th International Workshop on Implementation of Functional Languages
    September 1998
    245 pages

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Cited By

    • (2014) SaC/C formulations of the all-pairs N-body problem and their performance on SMPs and GPGPUs. Concurrency and Computation: Practice & Experience 26:4 (952-971). DOI: 10.1002/cpe.3078. Online publication date: 25-Mar-2014
    • (2007) The design and development of ZPL. Proceedings of the third ACM SIGPLAN conference on History of programming languages (8-1-8-37). DOI: 10.1145/1238844.1238852. Online publication date: 9-Jun-2007
    • (2005) Shared memory multiprocessor support for functional array processing in SAC. Journal of Functional Programming 15:3 (353-401). DOI: 10.1017/S0956796805005538. Online publication date: 1-May-2005
    • (2003) With-Loop scalarization – merging nested array operations. Proceedings of the 15th international conference on Implementation of Functional Languages (118-134). DOI: 10.1007/978-3-540-27861-0_8. Online publication date: 8-Sep-2003
    • (2002) Implementing the NAS Benchmark MG in SAC. Proceedings of the 16th International Parallel and Distributed Processing Symposium. DOI: 10.5555/645610.660911. Online publication date: 15-Apr-2002
    • (2000) A comparative study of the NAS MG benchmark across parallel languages and architectures. Proceedings of the 2000 ACM/IEEE conference on Supercomputing (46-es). DOI: 10.5555/370049.370452. Online publication date: 1-Nov-2000
    • (1999) Accelerating APL programs with SAC. Proceedings of the conference on APL '99: On track to the 21st century (50-57). DOI: 10.1145/312627.312719. Online publication date: 1-Aug-1999
    • (1998) Accelerating APL programs with SAC. ACM SIGAPL APL Quote Quad 29:2 (50-57). DOI: 10.1145/379277.312719. Online publication date: 1-Dec-1998
