DOI: 10.1007/978-3-642-11515-8_6
Article

Energy and throughput efficient transactional memory for embedded multicore systems

Published: 25 January 2010

Abstract

We propose a new design for an energy-efficient hardware transactional memory (HTM) system for power-aware embedded devices. Prior HTM designs proposed a small, fully-associative transactional cache at the same level as the L1 cache. We propose an alternative design that unifies the transactional and L1 caches and adds a small victim cache to reduce the effects of capacity and conflict evictions. We evaluate the new HTM scheme on a variety of benchmarks, in terms of both energy and performance. We show that the victim cache scheme can provide up to a 4X improvement in energy-delay product compared to a traditional HTM scheme that uses a separate transactional cache.
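The energy-delay product (EDP) named in the abstract is conventionally the energy consumed multiplied by the execution time, with lower values being better. The sketch below shows how an improvement factor such as "4X" would be computed under that standard definition. The function names and all numeric values are hypothetical, chosen only for illustration; they are not measurements from the paper.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return EDP = energy * delay; a lower value is better."""
    return energy_joules * delay_seconds


def edp_improvement(baseline_edp: float, new_edp: float) -> float:
    """Return how many times smaller new_edp is than baseline_edp."""
    return baseline_edp / new_edp


# Hypothetical inputs (assumptions for illustration, NOT data from the paper).
baseline = energy_delay_product(energy_joules=2.0, delay_seconds=4.0)
candidate = energy_delay_product(energy_joules=1.0, delay_seconds=2.0)

print(edp_improvement(baseline, candidate))  # prints 4.0
```

Because EDP multiplies the two quantities, halving both energy and delay yields a 4X improvement, which is why the metric rewards designs that save energy without slowing execution.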


Published In

HiPEAC'10: Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
January 2010
368 pages
ISBN: 3642115144
Editors: Yale N. Patt, Pierfrancesco Foglia, Evelyn Duesterwald, Paolo Faraboschi, Xavier Martorell

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

  • (2015) Energy-Efficient and High-Performance Lock Speculation Hardware for Embedded Multicore Systems. ACM Transactions on Embedded Computing Systems 14:3 (1-27). DOI: 10.1145/2700097. Online publication date: 21-May-2015
  • (2013) VGTS. Proceedings of the 19th international conference on Parallel Processing (203-214). DOI: 10.1007/978-3-642-40047-6_22. Online publication date: 26-Aug-2013
  • (2011) Optimizing throughput/power trade-offs in hardware transactional memory using DVFS and intelligent scheduling. Proceedings of the international conference on Supercomputing (141-150). DOI: 10.1145/1995896.1995918. Online publication date: 31-May-2011
  • (2010) Evaluation of a hardware transactional memory model in an NoC-based embedded MPSoC. Proceedings of the 23rd symposium on Integrated circuits and system design (85-90). DOI: 10.1145/1854153.1854177. Online publication date: 6-Sep-2010
