Compiler-based prefetching for recursive data structures

Published: 01 September 1996
    Abstract

    Software-controlled data prefetching offers the potential for bridging the ever-increasing speed gap between the memory subsystem and today's high-performance processors. While prefetching has enjoyed considerable success in array-based numeric codes, its potential in pointer-based applications has remained largely unexplored. This paper investigates compiler-based prefetching for pointer-based applications, in particular those containing recursive data structures. We identify the fundamental problem in prefetching pointer-based data structures and propose a guideline for devising successful prefetching schemes. Based on this guideline, we design three prefetching schemes, we automate the most widely applicable scheme (greedy prefetching) in an optimizing research compiler, and we evaluate the performance of all three schemes on a modern superscalar processor similar to the MIPS R10000. Our results demonstrate that compiler-inserted prefetching can significantly improve the execution speed of pointer-based codes, by as much as 45% for the applications we study. In addition, the more sophisticated algorithms (which we currently perform by hand, but which might be implemented in future compilers) can improve performance by as much as twofold. Compared with the only other compiler-based pointer prefetching scheme in the literature, our algorithms offer substantially better performance by avoiding unnecessary overhead and hiding more latency.
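
    The abstract names greedy prefetching but does not spell out the transformation. As a rough illustration only, the sketch below assumes that it amounts to issuing prefetches for the nodes a visited node points to before working on that node. The `Node` type, the `PREFETCH` macro, and the functions `tree_sum`, `tree_build`, and `tree_free` are hypothetical names chosen for this example and are not taken from the paper. `__builtin_prefetch` is a GCC/Clang builtin; on other compilers the macro expands to a no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical recursive data structure: a binary tree node. */
typedef struct Node {
    long value;
    struct Node *left;
    struct Node *right;
} Node;

/* Non-binding prefetch hint. Assumes GCC or Clang; elsewhere it does nothing. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch((p))
#else
#define PREFETCH(p) ((void)(p))
#endif

/* Sum all values in the tree. On reaching a node, prefetch both children
   first, so their cache misses can overlap with the work on this node and
   on the subtree visited first. A NULL child is harmless: the hint does
   not fault on an invalid address. */
long tree_sum(const Node *n) {
    if (n == NULL) {
        return 0;
    }
    PREFETCH(n->left);
    PREFETCH(n->right);
    return n->value + tree_sum(n->left) + tree_sum(n->right);
}

/* Build a balanced tree holding the values lo..hi inclusive. */
Node *tree_build(long lo, long hi) {
    if (lo > hi) {
        return NULL;
    }
    long mid = lo + (hi - lo) / 2;
    Node *n = malloc(sizeof *n);
    assert(n != NULL); /* keep the sketch short: abort on allocation failure */
    n->value = mid;
    n->left = tree_build(lo, mid - 1);
    n->right = tree_build(mid + 1, hi);
    return n;
}

/* Release every node of the tree. */
void tree_free(Node *n) {
    if (n == NULL) {
        return;
    }
    tree_free(n->left);
    tree_free(n->right);
    free(n);
}
```

    The prefetches are hints only, so `tree_sum` returns the same result with or without them. Whether they help depends on node layout, cache size, and how much work is done per node, so any speedup would have to be measured on the target machine.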



    Published In

    ACM SIGOPS Operating Systems Review, Volume 30, Issue 5
    Dec. 1996
    273 pages
    ISSN: 0163-5980
    DOI: 10.1145/248208

    ASPLOS VII: Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
    October 1996
    290 pages
    ISBN: 0897917677
    DOI: 10.1145/237090

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article
