DOI: 10.1145/77726.255176

Compiler-directed data prefetching in multiprocessors with memory hierarchies

Published: 01 June 1990

    Abstract

    Multiprocessor systems use memory hierarchies to reduce large memory access times. Such a hierarchy must be managed automatically to obtain effective memory utilization. In this paper, we discuss the issues involved in obtaining an optimal memory management strategy for a memory hierarchy. We present an algorithm for finding the earliest point in a program at which a block of data can be prefetched. This determination is based on the control and data dependences in the program. Such a method is an integral part of more general memory management algorithms. We demonstrate our method's potential by using static analysis to estimate the performance improvement afforded by our prefetching strategy and to analyze the reference patterns in a set of Fortran benchmarks. We also study the effectiveness of prefetching in a realistic shared-memory system using an RTL-level simulator and real codes. Unlike previous studies, this one considers the benefits of prefetching in the presence of network contention.
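
To make the idea of an "earliest prefetch point" concrete, here is a minimal sketch, not the algorithm from the paper. It assumes a single straight-line list of statements, whereas the paper's method works from the program's control and data dependences in general. The `Stmt` class, the `earliest_prefetch_point` function, and the example variable names are all hypothetical and introduced only for illustration. The sketch walks backward from the use and stops at the first statement that writes the block or one of its index variables (a data dependence) or that is a branch (a control dependence).

```python
from dataclasses import dataclass


@dataclass
class Stmt:
    """One statement in a straight-line region (hypothetical model)."""
    label: str
    writes: frozenset = frozenset()  # names this statement may write
    is_branch: bool = False          # True if it is a control-flow boundary


def earliest_prefetch_point(stmts, use_index, block, index_vars=()):
    """Return the smallest index i such that a prefetch of `block` placed
    immediately before stmts[i] is not invalidated before stmts[use_index].

    Assumption: hoisting must stop at any statement that writes the block
    or a variable used to address it, and at any branch.
    """
    needed = {block} | set(index_vars)
    point = use_index  # worst case: prefetch right before the use
    for i in range(use_index - 1, -1, -1):
        s = stmts[i]
        if s.is_branch or (s.writes & needed):
            break      # dependence found: cannot hoist above stmts[i]
        point = i
    return point


# Example region: S4 reads B(I); S1 defines I; S0 writes A.
region = [
    Stmt("S0", writes=frozenset({"A"})),
    Stmt("S1", writes=frozenset({"I"})),
    Stmt("S2", writes=frozenset({"C"})),
    Stmt("S3", writes=frozenset({"D"})),
    Stmt("S4"),
]
point = earliest_prefetch_point(region, use_index=4, block="B", index_vars={"I"})
print(f"prefetch B before {region[point].label}")
```

In this example the prefetch of `B` can be hoisted above `S3` and `S2` but not above `S1`, which defines the index variable `I`. The further a prefetch can be hoisted, the more memory latency can be overlapped with computation, which is why the earliest legal point is of interest.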

    Published In

    ICS '90: Proceedings of the 4th international conference on Supercomputing
    June 1990
    492 pages
    ISBN:0897913698
    DOI:10.1145/77726

    ACM SIGARCH Computer Architecture News, Volume 18, Issue 3b
    Special Issue: Proceedings of the 4th international conference on Supercomputing
    Sept. 1990
    489 pages
    ISSN:0163-5964
    DOI:10.1145/255129

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Conference

    ICS '90: ACM SIGARCH International Conference on Supercomputing
    June 11 - 15, 1990
    Amsterdam, The Netherlands

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%
