research-article

Code-based cache partitioning for improving hardware cache performance

Authors:

Young Ik Eom

ICUIMC '12: Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication

Article No.: 42, Pages 1 - 5

https://doi.org/10.1145/2184751.2184803

Published: 20 February 2012

Abstract

Improving hardware cache performance has become increasingly important because the performance gap between processor and memory has led to the "memory wall" problem. Most cache designs are based on the LRU replacement policy, which is effective for high-locality workloads. However, LRU is ineffective for workloads whose working set is larger than the available cache or whose memory access patterns exhibit weak locality. To compensate for this weakness, we introduce a novel code-based cache partitioning mechanism that requires no hardware support. Our mechanism first collects profile data using binary instrumentation and then classifies the characteristics of each code region from the collected profiles. Finally, while the application is running, it applies page coloring to partition the cache by code region. To demonstrate its effectiveness, we implemented the mechanism in the Linux kernel. Experiments on workloads that include weak-locality access patterns show that the proposed mechanism improves performance by up to 7.3% and reduces last-level cache misses by up to 37.8%.
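The page coloring step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' kernel implementation: it assumes a physically indexed cache with plain modulo set indexing, 4 KiB pages, and a hypothetical 2 MiB, 16-way last-level cache. The names `CacheGeometry`, `page_color`, `pick_frame`, `POLLUTING_COLORS`, and `NORMAL_COLORS` are illustrative, and the policy of confining weak-locality code regions to two colors is an assumption made only for this example.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # bytes; assumed 4 KiB pages


@dataclass(frozen=True)
class CacheGeometry:
    size_bytes: int     # total cache capacity
    associativity: int  # number of ways

    @property
    def num_colors(self) -> int:
        # One way covers size/associativity bytes; each page-sized slice
        # of that range maps to a distinct group of cache sets (a color).
        return self.size_bytes // (self.associativity * PAGE_SIZE)


def page_color(phys_addr: int, cache: CacheGeometry) -> int:
    """Color of the physical page that contains phys_addr."""
    return (phys_addr // PAGE_SIZE) % cache.num_colors


def pick_frame(free_frames, allowed_colors, cache):
    """Return the first free frame number whose color is allowed, else None."""
    for frame in free_frames:
        if frame % cache.num_colors in allowed_colors:
            return frame
    return None


# Hypothetical last-level cache: 2 MiB, 16-way -> 32 page colors.
llc = CacheGeometry(size_bytes=2 * 1024 * 1024, associativity=16)

# Assumed policy: pages touched by code regions profiled as weak-locality
# are confined to a few colors so they cannot evict other regions' data.
POLLUTING_COLORS = {0, 1}
NORMAL_COLORS = set(range(llc.num_colors)) - POLLUTING_COLORS
```

With this geometry, a page fault raised from a weak-locality code region would be served by `pick_frame(free_frames, POLLUTING_COLORS, llc)`, so its data competes only for the cache sets of colors 0 and 1. Real processors may hash addresses across cache slices, in which case the simple modulo mapping above no longer holds.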

Cited By

Shao W, Ye B, Wang H, Parmer G, Ren Y (2022). Edge-RT: OS Support for Controlled Latency in the Multi-Tenant, Real-Time Edge. 2022 IEEE Real-Time Systems Symposium (RTSS), pp. 1-13. Online publication date: Dec-2022.
https://doi.org/10.1109/RTSS55097.2022.00011
Jain P, Surve S (2019). Resource Centric Characterization and Classification of Applications Using KMeans for Multicores. 2019 International Conference on Information Networking (ICOIN), pp. 25-30. Online publication date: Jan-2019.
https://doi.org/10.1109/ICOIN.2019.8717981
Jain P, Surve S (2019). Coordination and Synchronization in Multiagent System Based on Tilman Model of Resource Sharing. 2019 International Conference on Advances in Computing, Communication and Control (ICAC3), pp. 1-6. Online publication date: Dec-2019.
https://doi.org/10.1109/ICAC347590.2019.9036776

Index Terms

Code-based cache partitioning for improving hardware cache performance
1. General and reference
  1. Cross-computing tools and techniques
    1. Design
    2. Performance

Published In

ICUIMC '12: Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication

February 2012

852 pages

ISBN: 9781450311724

DOI: 10.1145/2184751

Conference Chairs:
Suk-Han Lee, Sungkyunkwan University, Korea
Lajos Hanzo, University of Southampton, UK
Roslan Ismail, Universiti Kuala Lumpur, Malaysia

Program Chairs:
Dongsoo S. Kim, Indiana University
Min Young Chung, Sungkyunkwan University, Korea
Sang-Won Lee, Sungkyunkwan University, Korea

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGAPP: ACM Special Interest Group on Applied Computing
SKKU: SUNGKYUNKWAN UNIVERSITY

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Ministry of Education, Science and Technology

Conference

ICUIMC '12

Sponsor:

SIGAPP
SKKU

ICUIMC '12: The 6th International Conference on Ubiquitous Information Management and Communication

February 20 - 22, 2012

Kuala Lumpur, Malaysia

Acceptance Rates

Overall Acceptance Rate 251 of 941 submissions, 27%

Bibliometrics

Total Citations: 6
Total Downloads: 222
Downloads (Last 12 months): 8
Downloads (Last 6 weeks): 0

Reflects downloads up to 09 Feb 2025

Citations

Cited By

Scolari A, Sironi F, Sciuto D, Santambrogio M (2014). A Survey on Recent Hardware and Software-Level Cache Management Techniques. Proceedings of the 2014 IEEE International Symposium on Parallel and Distributed Processing with Applications, pp. 242-247. Online publication date: 26-Aug-2014.
https://dl.acm.org/doi/10.1109/ISPA.2014.41
Kumar N, Vyas S, Cytron R, Gill C, Zambreno J, Jones P (2014). Cache design for mixed criticality real-time systems. 2014 IEEE 32nd International Conference on Computer Design (ICCD), pp. 513-516. Online publication date: Oct-2014.
https://doi.org/10.1109/ICCD.2014.6974730
Caccamo M, Cesati M, Pellizzoni R, Betti E, Dudko R, Mancuso R (2013). Real-time cache management framework for multi-core architectures. Proceedings of the 2013 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS), pp. 45-54. Online publication date: 9-Apr-2013.
https://dl.acm.org/doi/10.1109/RTAS.2013.6531078
