DOI: 10.5555/320080.320098

Automatic and efficient evaluation of memory hierarchies for embedded systems

Published: 16 November 1999

Abstract

Automation is the key to the design of future embedded systems because it permits application-specific customization while keeping design costs low. A key problem faced by automatic design systems is evaluating the performance of the vast number of alternative designs in a timely manner. In this paper, we focus on an embedded system consisting of the following components: a VLIW processor, an instruction cache, a data cache, and a second-level unified cache. We use a hierarchical approach that partitions the system into its constituent components and evaluates each component individually. The performance of each processor is evaluated independently of its memory hierarchy, and each of the caches is simulated using the traces from a single reference processor. Because changes in the processor architecture do affect the address traces, and thus the performance of the memory hierarchy, the resulting overall performance estimate is inaccurate. To overcome this error, the changes in the processor architecture are modeled as a dilation of the reference processor's address trace, where each instruction block in the trace is conceptually stretched out by the dilation coefficient. This approach provides a projected cache performance that more accurately accounts for changes in the processor architecture. To understand the accuracy of the dilation model, we separate the possible errors that the model introduces and quantify these errors on a set of benchmarks. The results show that the dilation model is effective for most of the design space and facilitates efficient automatic design.
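
The dilation idea can be illustrated with a small sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: the trace is a list of (start address, size) instruction blocks, dilation scales both the start address and the size of each block by the coefficient, and the cache is a simple direct-mapped instruction cache. The names `dilate_trace` and `count_misses` are hypothetical and do not come from the paper.

```python
from typing import Iterable, List, Tuple

# One instruction block in the trace: (start_address, size_in_bytes).
# This representation is an assumption for illustration only.
Block = Tuple[int, int]


def dilate_trace(trace: Iterable[Block], dilation: float) -> List[Block]:
    """Stretch each block's start address and size by the dilation coefficient."""
    return [
        (int(start * dilation), max(1, int(size * dilation)))
        for start, size in trace
    ]


def count_misses(trace: Iterable[Block], num_lines: int, line_size: int) -> int:
    """Count misses in a direct-mapped cache of num_lines lines of line_size bytes."""
    tags = [None] * num_lines  # cache line number currently held in each slot
    misses = 0
    for start, size in trace:
        first = start // line_size
        last = (start + size - 1) // line_size
        # A block touches every cache line it overlaps.
        for line in range(first, last + 1):
            index = line % num_lines
            if tags[index] != line:
                tags[index] = line
                misses += 1
    return misses


# Reference trace: a two-block loop executed three times.
reference = [(0, 32), (32, 32)] * 3

for d in (1.0, 2.0, 4.0):
    misses = count_misses(dilate_trace(reference, d), num_lines=4, line_size=32)
    print(f"dilation={d}: {misses} misses")
```

In this toy setup the loop fits in the 128-byte cache at dilation 1.0 and 2.0, so only cold misses occur, while at 4.0 the stretched blocks conflict on every iteration. This shows how a single reference trace can be reused to project cache behavior for processors with different code sizes, under the stated assumptions.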


Published In

MICRO 32: Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
November 1999
299 pages
ISBN: 076950437X

Publisher

IEEE Computer Society, United States

Conference

MICRO99

Acceptance Rates

MICRO 32 Paper Acceptance Rate: 27 of 131 submissions, 21%
Overall Acceptance Rate: 455 of 2,084 submissions, 22%


Cited By

  • (2016) Integrated Exploration Methodology for Data Interleaving and Data-to-Memory Mapping on SIMD Architectures. ACM Transactions on Embedded Computing Systems 15:3 (1-23). DOI: 10.1145/2894754. Online publication date: 23-May-2016
  • (2013) Exploration of energy efficient memory organisations for dynamic multimedia applications using system scenarios. Design Automation for Embedded Systems 17:3-4 (669-692). DOI: 10.1007/s10617-014-9145-6. Online publication date: 1-Sep-2013
  • (2008) Reducing complexity of multiobjective design space exploration in VLIW-based embedded systems. ACM Transactions on Architecture and Code Optimization 5:2 (1-33). DOI: 10.1145/1400112.1400116. Online publication date: 3-Sep-2008
  • (2007) Fast, accurate design space exploration of embedded systems memory configurations. Proceedings of the 2007 ACM symposium on Applied computing (699-706). DOI: 10.1145/1244002.1244159. Online publication date: 11-Mar-2007
  • (2006) Optimal topology exploration for application-specific 3D architectures. Proceedings of the 2006 Asia and South Pacific Design Automation Conference (390-395). DOI: 10.1145/1118299.1118396. Online publication date: 24-Jan-2006
  • (2004) Balancing design options with Sherpa. Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems (57-68). DOI: 10.1145/1023833.1023843. Online publication date: 22-Sep-2004
  • (2004) Dynamic on-chip memory management for chip multiprocessors. Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems (14-23). DOI: 10.1145/1023833.1023838. Online publication date: 22-Sep-2004
  • (2003) Data remapping for design space optimization of embedded memory systems. ACM Transactions on Embedded Computing Systems 2:2 (186-218). DOI: 10.1145/643470.643474. Online publication date: 1-May-2003
  • (2002) PICO. Computer 35:9 (39-47). DOI: 10.1109/MC.2002.1033026. Online publication date: 1-Sep-2002
  • (2001) Cool-cache for hot multimedia. Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture (274-283). DOI: 10.5555/563998.564033. Online publication date: 1-Dec-2001
