
A Survey of Techniques for Cache Locking

Published: 16 May 2016
    Abstract

    Cache memory, although important for boosting application performance, is also a source of execution-time variability, which makes it difficult to use in systems requiring worst-case execution time (WCET) guarantees. Cache locking is a promising approach for simplifying WCET estimation and providing predictability; hence, several commercial processors provide the ability to lock the cache. However, cache locking also has several disadvantages (e.g., extra misses for unlocked blocks and the complex algorithms required to select locking contents); hence, careful management is required to realize its full potential. In this article, we present a survey of techniques proposed for cache locking. We categorize the techniques into several groups to underscore their similarities and differences. We also discuss the opportunities and obstacles in using cache locking. We hope that this article will help researchers gain insight into cache locking schemes and will also stimulate further work in this area.

    Published In

    ACM Transactions on Design Automation of Electronic Systems, Volume 21, Issue 3
    Special Section on New Physical Design Techniques for the Next Generation Integration Technology and Regular Papers
    July 2016
    434 pages
    ISSN: 1084-4309
    EISSN: 1557-7309
    DOI: 10.1145/2926747
    Editor: Naehyuck Chang
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 May 2016
    Accepted: 01 December 2015
    Revised: 01 October 2015
    Received: 01 August 2015
    Published in TODAES Volume 21, Issue 3

    Author Tags

    1. CPU
    2. GPU
    3. Review
    4. cache locking
    5. cache partitioning
    6. classification
    7. hard real-time system
    8. multitasking
    9. worst-case execution time (WCET)

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Advanced Scientific Computing Research
    • Office of Science
    • U.S. Department of Energy
