DOI: 10.1145/263764.263772

Compiler and software distributed shared memory support for irregular applications

Published: 21 June 1997

    Abstract

    We investigate the use of a software distributed shared memory (DSM) layer to support irregular computations on distributed memory machines. Software DSM supports irregular computation through demand fetching of data in response to memory access faults. With the addition of a very limited form of compiler support, namely the identification of the section of the indirection array accessed by each processor, many of these on-demand page fetches can be aggregated into a single message and prefetched prior to the access fault.

    We have measured the performance of this approach for two irregular applications, moldyn and nbf, using the TreadMarks DSM system on an 8-processor IBM SP2. We find that it has similar performance to the inspector-executor method supported by the CHAOS run-time library, while requiring much simpler compile-time support. For moldyn, it is up to 23% faster than CHAOS, depending on the input problem's characteristics; for nbf, it is at most 14% slower. If we include the execution time of the inspector, the software DSM-based approach is always faster than CHAOS. The advantage of this approach increases as the frequency of changes to the indirection array increases. The disadvantage of this approach is the potential for false sharing overhead when the data set is small or has poor spatial locality.
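
    The aggregation idea in the abstract can be illustrated with a small sketch. It is not the TreadMarks or CHAOS interface. The page size, element size, page-to-processor mapping, and the function names pages_for_section and aggregate_requests are all assumptions made for illustration. Given the section of the indirection array that one processor will access, the sketch computes the set of data pages touched and groups the non-local ones by the processor assumed to hold them, so that one request per processor replaces one fault per page.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # bytes per DSM page (assumed value)
ELEM_SIZE = 8     # bytes per element of the shared data array (assumed value)


def pages_for_section(indirection, lo, hi):
    """Return the sorted page numbers of the data array reached through
    indirection[lo:hi], the section identified for one processor."""
    return sorted({(idx * ELEM_SIZE) // PAGE_SIZE for idx in indirection[lo:hi]})


def aggregate_requests(pages, holder_of, local_pages):
    """Group pages that are not already local by the processor assumed to
    hold them, so each processor receives a single aggregated request."""
    requests = defaultdict(list)
    for page in pages:
        if page not in local_pages:
            requests[holder_of(page)].append(page)
    return dict(requests)


# Hypothetical example: 512 elements fit on one page.
indirection = [0, 513, 5, 1030, 514, 2050, 7]
pages = pages_for_section(indirection, 0, len(indirection))

# Hypothetical mapping of pages to remote processors 1 and 2.
requests = aggregate_requests(pages, lambda page: 1 + page % 2, local_pages={0})

print(pages)     # data pages touched by this section
print(requests)  # one entry per remote processor to contact
```

    In this example, pure demand fetching would take one access fault for each of the three non-local pages, while the aggregated form sends two requests in total and can issue them before the loop that uses the indirection array begins.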



    Published In

    PPOPP '97: Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
    June 1997
    287 pages
    ISBN: 0897919068
    DOI: 10.1145/263764
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Conference

    PPoPP97: Principles & Practices of Parallel Programming
    June 18 - 21, 1997
    Las Vegas, Nevada, USA

    Acceptance Rates

    PPOPP '97 Paper Acceptance Rate 26 of 86 submissions, 30%;
    Overall Acceptance Rate 230 of 1,014 submissions, 23%
