DOI: 10.1145/1023833.1023869

A post-compiler approach to scratchpad mapping of code

Published: 22 September 2004

Abstract

ScratchPad Memories (SPMs) are commonly used in embedded systems because they are more energy-efficient than caches and give applications tighter control over the memory hierarchy. Optimally mapping code and data to SPMs is, however, still a challenge. This paper proposes an optimal scratchpad mapping approach for code segments that works directly on application binaries and therefore requires no access to either the compiler or the application source code, a clear advantage for legacy or proprietary, IP-protected applications.

The mapping problem is solved by means of a Dynamic Programming algorithm applied to the execution traces of the target application. The algorithm finds the optimal set of instruction blocks to be moved into a dedicated SPM, minimizing either energy consumption or execution time. A patching tool, which can use the output of the optimal mapper, modifies the binary of the application and moves the relevant portions of its code segments to memory locations inside the SPM.
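The abstract does not spell out the formulation, so the following is only a minimal sketch of how a trace-driven selection of instruction blocks could be posed, assuming it can be modeled as a 0/1 knapsack: each block has a size and a per-execution saving, the trace supplies execution counts, and the SPM capacity is the knapsack limit. The names `Block`, `count_executions`, and `select_blocks`, and all numeric values, are hypothetical and not taken from the paper. The `saving` field could stand for energy or for cycles, depending on which objective is being minimized.

```python
from collections import Counter
from typing import NamedTuple


class Block(NamedTuple):
    name: str    # hypothetical label of an instruction block
    size: int    # bytes the block would occupy in the SPM
    saving: int  # assumed saving per execution when the block runs from the SPM


def count_executions(trace):
    """Count how often each block label appears in an execution trace."""
    return Counter(trace)


def select_blocks(blocks, trace, capacity):
    """0/1 knapsack by dynamic programming: maximize total saving within capacity."""
    runs = count_executions(trace)
    # best[c] = (total saving, names of chosen blocks) using at most c bytes
    best = [(0, ())] * (capacity + 1)
    for block in blocks:
        gain = block.saving * runs[block.name]
        # Walk capacities downwards so each block is chosen at most once.
        for c in range(capacity, block.size - 1, -1):
            candidate = best[c - block.size][0] + gain
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - block.size][1] + (block.name,))
    return best[capacity]


if __name__ == "__main__":
    blocks = [Block("A", 4, 3), Block("B", 3, 2), Block("C", 2, 5)]
    trace = ["A", "B", "A", "C", "B", "B", "A"]
    print(select_blocks(blocks, trace, capacity=6))
```

With a 6-byte SPM in this toy input, blocks A and C fill the capacity exactly, and no other feasible combination saves more. The table has one entry per byte of capacity, so the running time grows with the number of blocks times the SPM size, which is practical for the small capacities typical of scratchpads.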

Published In

CASES '04: Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
September 2004
324 pages
ISBN: 1581138903
DOI: 10.1145/1023833
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. design automation
  2. dynamic programming
  3. embedded design
  4. executable patching
  5. memory hierarchy
  6. optimization algorithm
  7. post-compiler processing
  8. power saving
  9. scratchpad memory

Qualifiers

  • Article

Conference

CASES04

Acceptance Rates

Overall Acceptance Rate 52 of 230 submissions, 23%

Cited By

  • (2020) Unified Thread- and Data-Mapping for Multi-Threaded Multi-Phase Applications on SPM Many-Cores. 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1496-1501. DOI: 10.23919/DATE48585.2020.9116493. Online publication date: Mar-2020
  • (2020) Data Compression and Re-computation Based Performance Improvement in Multi-Core Architectures. 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), pp. 0390-0395. DOI: 10.1109/CCWC47524.2020.9031221. Online publication date: Jan-2020
  • (2019) Developments in memory management in OpenMP. International Journal of High Performance Computing and Networking 13:1, pp. 70-85. DOI: 10.5555/3302714.3302719. Online publication date: 1-Jan-2019
  • (2019) Survey and Taxonomy of Volunteer Computing. ACM Computing Surveys 52:3, pp. 1-35. DOI: 10.1145/3320073. Online publication date: 3-Jul-2019
  • (2019) Leveraging User-related Internet of Things for Continuous Authentication. ACM Computing Surveys 52:3, pp. 1-38. DOI: 10.1145/3314023. Online publication date: 18-Jun-2019
  • (2019) Control Flow Checking or Not? (for Soft Errors). ACM Transactions on Embedded Computing Systems 18:1, pp. 1-25. DOI: 10.1145/3301311. Online publication date: 15-Feb-2019
  • (2019) Scratchpad-Memory Management for Multi-Threaded Applications on Many-Core Architectures. ACM Transactions on Embedded Computing Systems 18:1, pp. 1-28. DOI: 10.1145/3301308. Online publication date: 5-Feb-2019
  • (2019) Static Function Prefetching for Efficient Code Management on Scratchpad Memory. 2019 IEEE 37th International Conference on Computer Design (ICCD), pp. 350-358. DOI: 10.1109/ICCD46524.2019.00056. Online publication date: Nov-2019
  • (2017) An energy-efficient memory hierarchy for multi-issue processors. Proceedings of the Conference on Design, Automation & Test in Europe, pp. 368-373. DOI: 10.5555/3130379.3130466. Online publication date: 27-Mar-2017
  • (2017) An energy-efficient memory hierarchy for multi-issue processors. Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017, pp. 368-373. DOI: 10.23919/DATE.2017.7927018. Online publication date: Mar-2017