DOI: 10.1145/258915.258931

Efficient procedure mapping using cache line coloring

Published: 01 May 1997

Abstract

As the gap between memory and processor performance continues to widen, it becomes increasingly important to exploit cache memory effectively. Both hardware and software approaches can be explored to optimize cache performance. Hardware designers focus on cache organization issues, including replacement policy, associativity, line size and the resulting cache access time. Software writers use various optimization techniques, including software prefetching, data scheduling and code reordering. Our focus is on improving memory usage through code reordering compiler techniques.

In this paper we present a link-time procedure mapping algorithm which can significantly improve the effectiveness of the instruction cache. Our algorithm produces an improved program layout by performing a color mapping of procedures to cache lines, taking into consideration the procedure size, cache size, cache line size, and call graph. We use cache line coloring to guide the procedure mapping, indicating which cache lines to avoid when placing a procedure in the program layout. Our algorithm reduces on average the instruction cache miss rate by 40% over the original mapping and by 17% over the mapping algorithm of Pettis and Hansen [12].
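
To make the notion of a cache line "color" concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a direct-mapped instruction cache in which each cache line index is one color, and the function names (`cache_colors`, `first_nonconflicting_start`) and the example sizes (an 8 KB cache with 32-byte lines) are hypothetical choices made for illustration. The sketch computes which colors a procedure occupies from its start address and size, then scans forward for the first line-aligned address whose colors avoid a given set.

```python
from typing import Optional, Set


def cache_colors(start: int, size: int, line_size: int, cache_size: int) -> Set[int]:
    """Colors (cache line indices) touched by a procedure of `size` bytes
    placed at byte address `start`, assuming a direct-mapped cache."""
    if size <= 0:
        return set()
    num_lines = cache_size // line_size
    first = start // line_size
    last = (start + size - 1) // line_size
    # A procedure spanning at least num_lines memory lines touches every color.
    if last - first + 1 >= num_lines:
        return set(range(num_lines))
    return {line % num_lines for line in range(first, last + 1)}


def first_nonconflicting_start(min_start: int, size: int, avoid: Set[int],
                               line_size: int, cache_size: int) -> Optional[int]:
    """First line-aligned address >= min_start whose colors do not intersect
    `avoid`, or None if every alignment within one cache span conflicts."""
    num_lines = cache_size // line_size
    start = -(-min_start // line_size) * line_size  # round up to a line boundary
    for _ in range(num_lines):
        if not (cache_colors(start, size, line_size, cache_size) & avoid):
            return start
        start += line_size
    return None


if __name__ == "__main__":
    LINE, CACHE = 32, 8192  # hypothetical: 32-byte lines, 8 KB cache (256 colors)
    caller = cache_colors(0, 100, LINE, CACHE)
    print(sorted(caller))  # [0, 1, 2, 3]
    # Place a 64-byte callee at or after address 8192 without evicting the caller.
    print(first_nonconflicting_start(8192, 64, caller, LINE, CACHE))  # 8320
```

In this example, placing the callee directly at address 8192 would map it onto colors 0 and 1, which the caller already occupies, so the search leaves a gap and returns 8320 (colors 4 and 5). Which colors to avoid for a given procedure is what the abstract attributes to the call graph; this sketch simply takes that set as an input.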

References

[1]
T. Ball and J. Larus. Efficient path profiling. In 29th International Symposium on Microarchitecture, December 1996.
[2]
L. Belady. A study of replacement algorithms for a virtual-storage computer. IBM Systems Journal, 5(2):78-101, 1966.
[3]
B.N. Bershad, D. Lee, T.H. Romer, and J.B. Chen. Avoiding conflict misses dynamically in large direct-mapped caches. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 158-170, October 1994.
[4]
B. Calder and D. Grunwald. Reducing branch costs via branch alignment. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 242-251. ACM, 1994.
[5]
B. Calder, D. Grunwald, and A. Srivastava. The predictability of branches in libraries. In 28th International Symposium on Microarchitecture, pages 24-34, Ann Arbor, MI, November 1995. IEEE.
[6]
B. Calder, D. Grunwald, and B. Zorn. Quantifying behavioral differences between C and C++ programs. Journal of Programming Languages, 2(4), 1994.
[7]
P.J. Denning and S.C. Schwartz. Properties of the working-set model. Communications of the ACM, 15(3):191-198, March 1972.
[8]
J.A. Fisher and S.M. Freudenberger. Predicting conditional branch directions from previous runs of a program. In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-V), pages 85-95, Boston, Mass., October 1992. ACM.
[9]
W.W. Hwu and P.P. Chang. Achieving high instruction cache performance with an optimizing compiler. In 16th Annual International Symposium on Computer Architecture, pages 242-251. ACM, 1989.
[10]
D. Kaeli. Issues in Trace-Driven Simulation. Lecture Notes in Computer Science No. 729, Performance Evaluation of Computer and Communication Systems, L. Donatiello and R. Nelson eds., Springer-Verlag, 1993, pp. 224-244.
[11]
S. McFarling. Program optimization for instruction caches. In Proceedings of the Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS III), pages 183-191, April 1989.
[12]
K. Pettis and R.C. Hansen. Profile guided code positioning. In Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pages 16-27. ACM, June 1990.
[13]
S.A. Przybylski. Cache Design: A Performance-Directed Approach. Morgan Kaufmann, San Mateo, CA, 1990.
[14]
T.R. Puzak. Analysis of cache replacement-algorithms. Ph.D. Dissertation, University of Massachusetts, Amherst MA, 1985.
[15]
R.W. Quong. Expected I-cache miss rates via the gap model. In 21st Annual International Symposium on Computer Architecture, pages 372-383, April 1994.
[16]
A.D. Samples and P.N. Hilfinger. Code reorganization for instruction caches. Technical Report UCB/CSD 88/447, October 1988.
[17]
A. Sampoga. Architectural Implications of C and C++ Programming Models. MS Thesis, Northeastern University, August 1995.
[18]
A. Srivastava and A. Eustace. ATOM: A system for building customized program analysis tools. In Proceedings of the Conference on Programming Language Design and Implementation, pages 196-205. ACM, 1994.
[19]
J.G. Thompson. Efficient analysis of caching systems. Ph.D. Dissertation, University of California, Berkeley, 1987.
[20]
J. Torrellas, C. Xia, and R. Daigle. Optimizing instruction cache performance for operating system intensive workloads. In Proceedings of the First International Symposium on High-Performance Computer Architecture, pages 360-369, January 1995.
[21]
D.W. Wall. Predicting program behavior using real or estimated profiles. In Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, pages 59-70, Toronto, Ontario, Canada, June 1991.
[22]
C. Young and M.D. Smith. Improving the accuracy of static branch prediction using branch correlation. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 232-241, October 1994.

Published In

PLDI '97: Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
May 1997
365 pages
ISBN:0897919076
DOI:10.1145/258915
Publisher

Association for Computing Machinery

New York, NY, United States

Conference

PLDI97
PLDI97: Conference on Programming Language
June 16 - 18, 1997
Las Vegas, Nevada, USA

Acceptance Rates

PLDI '97 Paper Acceptance Rate 31 of 158 submissions, 20%;
Overall Acceptance Rate 406 of 2,067 submissions, 20%
