DOI: 10.1145/1629435.1629476
research-article

SuSeSim: a fast simulation strategy to find optimal L1 cache configuration for embedded systems

Published: 11 October 2009

Abstract

Simulating an application is a popular and reliable way to find the optimal level-one (L1) cache configuration for an application-specific embedded processor. However, long simulation time is one of the main disadvantages of simulation-based approaches. In this paper, we propose a new, fast simulation method, the Super Set Simulator (SuSeSim). Whereas previous methods use a top-down search strategy, SuSeSim uses a bottom-up search strategy together with a new, elaborate data structure to reduce the search space that must be examined to determine a cache hit or miss. SuSeSim can simulate hundreds of cache configurations simultaneously by reading an application's memory request trace just once, and it records the total numbers of cache hits and misses accurately. Depending on the cache block size and the benchmark application, SuSeSim reduces the number of tags to be checked by up to 43% compared with the fastest existing simulation approach (the CRCB algorithm). With a faster search and an easy-to-maintain data structure, SuSeSim can be up to 94% faster at simulating memory requests than the CRCB algorithm.
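
To make the single-pass idea concrete, the following is a minimal sketch of a naive baseline: it reads a memory trace once and updates an independent LRU model for every cache configuration. It is not the SuSeSim algorithm or its data structure, which this page does not describe in detail. The names `LruCache` and `simulate`, the `(num_sets, assoc, block_size)` configuration tuple, and the sample trace are all assumptions made for illustration.

```python
from collections import OrderedDict
from itertools import product


class LruCache:
    """One set-associative LRU cache configuration (hypothetical helper)."""

    def __init__(self, num_sets, assoc, block_size):
        self.num_sets = num_sets
        self.assoc = assoc
        self.block_size = block_size
        # One OrderedDict per set; key order tracks recency (last = most recent).
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Record a hit or miss for one byte address and update LRU state."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            if len(lines) >= self.assoc:
                lines.popitem(last=False)  # evict the least recently used tag
            lines[tag] = True


def simulate(trace, set_counts, assocs, block_sizes):
    """Read the trace once, updating every configuration per request.

    Returns {(num_sets, assoc, block_size): (hits, misses)}.
    """
    caches = {
        cfg: LruCache(*cfg) for cfg in product(set_counts, assocs, block_sizes)
    }
    for address in trace:
        for cache in caches.values():
            cache.access(address)
    return {cfg: (c.hits, c.misses) for cfg, c in caches.items()}


if __name__ == "__main__":
    # Illustrative byte addresses, not taken from any benchmark.
    trace = [0, 4, 8, 0, 64, 0, 128, 4]
    results = simulate(trace, set_counts=[1, 2], assocs=[1, 2], block_sizes=[4, 16])
    for cfg, (hits, misses) in sorted(results.items()):
        print(cfg, "hits:", hits, "misses:", misses)
```

This baseline repeats the full tag lookup for every configuration on every request. That per-configuration search is the cost that approaches such as CRCB and SuSeSim aim to reduce by sharing work across configurations.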

Cited By

  • (2020) A Survey of Cache Simulators. ACM Computing Surveys 53:1 (1-32). DOI: 10.1145/3372393. Online publication date: 6-Feb-2020
  • (2018) Predictability and Performance Aware Replacement Policy PVISAM for Unified Shared Caches in Real-time Multicores. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 37:11 (2720-2731). DOI: 10.1109/TCAD.2018.2857081. Online publication date: Nov-2018
  • (2018) A Self-Reconfiguring Cache Architecture to Improve Control Quality in Cyber-Physical Systems. 2018 IEEE 21st International Symposium on Real-Time Distributed Computing (ISORC) (116-123). DOI: 10.1109/ISORC.2018.00024. Online publication date: May-2018

      Information & Contributors

      Information

      Published In

      CODES+ISSS '09: Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
      October 2009
      498 pages
      ISBN:9781605586281
      DOI:10.1145/1629435
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. L1 cache
      2. LRU
      3. cache simulation
      4. miss rate
      5. simulation

      Qualifiers

      • Research-article

      Conference

      ESWeek '09: Fifth Embedded Systems Week
      October 11 - 16, 2009
      Grenoble, France

      Acceptance Rates

      Overall Acceptance Rate 280 of 864 submissions, 32%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 2
      • Downloads (Last 6 weeks): 0
      Reflects downloads up to 25 Dec 2024

      Citations

      Cited By

      • (2016) Concurrent memory subsystem and application optimization for ASIP design. 2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS) (1-10). DOI: 10.1109/SAMOS.2016.7818325. Online publication date: Jul-2016
      • (2015) Exploring Multilevel Cache Hierarchies in Application Specific MPSoCs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 34:12 (1991-2003). DOI: 10.1109/TCAD.2015.2445736. Online publication date: Dec-2015
      • (2015) Speeding up single pass simulation of PLRUt caches. The 20th Asia and South Pacific Design Automation Conference (695-700). DOI: 10.1109/ASPDAC.2015.7059091. Online publication date: Jan-2015
      • (2014) Hardware-based fast exploration of cache hierarchies in application specific MPSoCs. Proceedings of the conference on Design, Automation & Test in Europe (1-6). DOI: 10.5555/2616606.2617020. Online publication date: 24-Mar-2014
      • (2014) MASH{fifo}. Proceedings of the 51st Annual Design Automation Conference (1-6). DOI: 10.1145/2593069.2593159. Online publication date: 1-Jun-2014
      • (2014) Rapid design space exploration of two-level unified caches. 2014 IEEE International Symposium on Circuits and Systems (ISCAS) (1937-1940). DOI: 10.1109/ISCAS.2014.6865540. Online publication date: Jun-2014
      • (2014) A scorchingly fast FPGA-based Precise L1 LRU cache simulator. 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC) (412-417). DOI: 10.1109/ASPDAC.2014.6742926. Online publication date: Jan-2014