
Compiler-controlled memory

Published: 01 October 1998

Abstract

Optimizations aimed at reducing the impact of memory operations on execution speed have long concentrated on improving cache performance. These efforts achieve a reasonable level of success. The primary limit on the compiler's ability to improve memory behavior is its imperfect knowledge about the run-time behavior of the program. The compiler cannot completely predict run-time access patterns.

There is an exception to this rule. During the register allocation phase, the compiler often must insert substantial amounts of spill code; that is, instructions that move values from registers to memory and back again. Because the compiler itself inserts these memory instructions, it has more knowledge about them than about other memory operations in the program.

Spill-code operations are disjoint from the memory manipulations required by the semantics of the program being compiled, and, indeed, the two can interfere in the cache. This paper proposes a hardware solution to the problem of increased spill costs: a small compiler-controlled memory (CCM) to hold spilled values. This small random-access memory can (and should) be placed in a distinct address space from the main memory hierarchy. The compiler can target spill instructions to use the CCM, moving most compiler-inserted memory traffic out of the pathway to main memory and eliminating any impact that those spill instructions would have on the state of the main memory hierarchy. Such memories already exist on some DSP microprocessors. Our techniques can be applied directly on those chips.

This paper presents two compiler-based methods to exploit such a memory, along with experimental results showing that speedups from using CCM may be sizable. It shows that using the register allocator's coloring paradigm to assign spilled values to memory can greatly reduce the amount of memory required by a program.
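The idea of coloring spilled values into memory slots can be illustrated with a small sketch. This is not the paper's algorithm: it assumes each spilled value is described by a single live interval over instruction indices, and it uses a simple greedy coloring. The names `interferes`, `color_spill_slots`, and the sample data are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Assumption: each spilled value has one live interval [start, end),
# measured in instruction indices. The data below is made up.
LiveRange = Tuple[int, int]


def interferes(a: LiveRange, b: LiveRange) -> bool:
    """Two spilled values interfere if their live intervals overlap."""
    return a[0] < b[1] and b[0] < a[1]


def color_spill_slots(ranges: Dict[str, LiveRange]) -> Dict[str, int]:
    """Greedily give each spilled value the lowest-numbered slot that is
    not held by an interfering value that was placed earlier."""
    slots: Dict[str, int] = {}
    # Visit values in order of interval start so the result is deterministic.
    for name in sorted(ranges, key=lambda n: (ranges[n], n)):
        taken = {
            slots[other]
            for other in slots
            if interferes(ranges[name], ranges[other])
        }
        slot = 0
        while slot in taken:
            slot += 1
        slots[name] = slot
    return slots


spilled = {
    "a": (0, 10),
    "b": (2, 6),
    "c": (6, 12),
    "d": (10, 14),
    "e": (12, 16),
}
assignment = color_spill_slots(spilled)
slots_needed = max(assignment.values()) + 1
print(assignment)
print(f"{len(spilled)} spilled values fit in {slots_needed} slots")
```

In this example, five spilled values share two slots because values whose live intervals never overlap can reuse the same location. A naive scheme that gives every spilled value its own slot would need five.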



Publisher

Association for Computing Machinery

New York, NY, United States

Published in SIGPLAN Volume 33, Issue 11
