Vector prefetching

Published: 15 December 1995

Abstract

    This paper focuses on extending the memory subsystem with an integrated prefetch buffer mechanism. Prefetching lets high-level application knowledge be used to improve memory performance, which currently constrains the performance of most systems. Although prefetching does not reduce the latency of memory accesses, it hides that latency by overlapping memory access with instruction execution.
    The first prefetch operation to the buffer is initiated by an explicit fetch instruction. All further prefetch operations are issued automatically whenever a prefetched value is consumed. To efficiently support list and vector processing, the user can specify a stride value at the time the first prefetch operation is initiated.
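
    The mechanism described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the class name `PrefetchBuffer`, the methods `fetch` and `consume`, the single-entry buffer, and the handling of out-of-range addresses are all hypothetical choices made for illustration. It shows one explicit fetch starting a stream with a user-chosen stride, after which every consumed value triggers the next prefetch automatically.

```python
class PrefetchBuffer:
    """Toy model of a single-entry prefetch buffer with a fixed stride.

    All names here are hypothetical; the abstract does not specify an interface.
    """

    def __init__(self, memory):
        self.memory = memory     # backing store, indexed by word address
        self.addr = None         # address of the value currently buffered
        self.stride = 0          # distance between successive prefetches
        self.value = None        # the buffered value
        self.issued = []         # log of every address prefetched

    def _prefetch(self, addr):
        # In hardware this access would overlap with instruction execution.
        # Assumption: an out-of-range prefetch simply buffers nothing.
        in_range = 0 <= addr < len(self.memory)
        self.value = self.memory[addr] if in_range else None
        self.addr = addr
        self.issued.append(addr)

    def fetch(self, addr, stride=1):
        """Explicit fetch instruction: starts the stream and sets the stride."""
        self.stride = stride
        self._prefetch(addr)

    def consume(self):
        """Return the buffered value and automatically issue the next prefetch."""
        if self.addr is None:
            raise RuntimeError("no prefetch stream active; call fetch() first")
        value = self.value
        self._prefetch(self.addr + self.stride)
        return value


def strided_sum(memory, start, stride, count):
    """Sum `count` elements starting at `start`, stepping by `stride`."""
    buf = PrefetchBuffer(memory)
    buf.fetch(start, stride)         # the only explicit prefetch
    total = 0
    for _ in range(count):
        total += buf.consume()       # each read triggers the next prefetch
    return total, buf.issued
```

    For example, `strided_sum(list(range(0, 160, 10)), 1, 4, 4)` reads addresses 1, 5, 9 and 13 after a single `fetch` call; the log also records a fifth prefetch of address 17, issued automatically when the last value was consumed. A stride of 1 walks a contiguous vector, while a larger stride would suit a column of a row-major matrix.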

    Published In

    ACM SIGARCH Computer Architecture News  Volume 23, Issue 5
    Dec. 1995
    44 pages
    ISSN:0163-5964
    DOI:10.1145/218328

    Publisher

    Association for Computing Machinery

    New York, NY, United States
