research-article

An Energy-Efficient DRAM Cache Architecture for Mobile Platforms With PCM-Based Main Memory

Authors:

Jae W. Lee

ACM Transactions on Embedded Computing Systems (TECS), Volume 21, Issue 1

Article No.: 7, Pages 1 - 22

https://doi.org/10.1145/3451995

Published: 14 January 2022

Abstract

A long battery life is a first-class design objective for mobile devices, and main memory accounts for a major portion of total energy consumption. Moreover, the energy consumed by memory is expected to increase further with ever-growing demands for bandwidth and capacity. A hybrid memory system with both DRAM and phase-change memory (PCM) can be an attractive solution to provide additional capacity and reduce standby energy. Although PCM provides much greater density than DRAM, its longer access latency and limited write endurance make it challenging to architect as main memory.

To address this challenge, this article introduces CAMP, a novel DRAM cache architecture for mobile platforms with PCM-based main memory. A DRAM cache in this environment must filter most of the writes to PCM to increase its lifetime, and must deliver the highest possible efficiency even with the relatively small DRAM cache that mobile platforms can afford. To meet these requirements, CAMP divides the DRAM space into two regions: a page cache for exploiting spatial locality in a bandwidth-efficient manner and a dirty block buffer for maximally filtering writes. CAMP improves performance and energy-delay product by 29.2% and 45.2%, respectively, over the baseline PCM-oblivious DRAM cache, while increasing PCM lifetime by 2.7×. CAMP also improves performance and energy-delay product by 29.3% and 41.5%, respectively, over the state-of-the-art design with a dirty block buffer, while increasing PCM lifetime by 2.5×.
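
The two-region organization described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the CAMP design itself: the abstract does not specify replacement policies, region sizes, or block and page sizes, so the LRU policy in both regions, the 64-blocks-per-page constant, and every name below (TwoRegionDramCache, access, pcm_block_writes, and so on) are hypothetical. The model shows one plausible way a dirty block buffer could absorb dirty blocks when a page leaves the page cache, so that PCM is written only when the buffer itself overflows.

```python
from collections import OrderedDict

# Assumption for illustration only: 4 KiB pages made of 64 B blocks.
BLOCKS_PER_PAGE = 64


class TwoRegionDramCache:
    """Toy DRAM cache with a page cache and a dirty block buffer (both LRU)."""

    def __init__(self, page_slots, dirty_block_slots):
        self.page_slots = page_slots
        self.dirty_block_slots = dirty_block_slots
        self.pages = OrderedDict()         # page number -> set of dirty block addresses
        self.dirty_blocks = OrderedDict()  # block address -> True
        self.pcm_page_reads = 0            # page fills served by PCM
        self.pcm_block_writes = 0          # dirty blocks written back to PCM

    def access(self, block_addr, is_write):
        page = block_addr // BLOCKS_PER_PAGE
        if block_addr in self.dirty_blocks:
            # The block already lives in the dirty block buffer; no PCM traffic.
            self.dirty_blocks.move_to_end(block_addr)
            return "dbb_hit"
        if page in self.pages:
            self.pages.move_to_end(page)
            if is_write:
                self.pages[page].add(block_addr)
            return "page_hit"
        # Miss in both regions: fetch the whole page from PCM.
        self.pcm_page_reads += 1
        if len(self.pages) >= self.page_slots:
            self._evict_page()
        self.pages[page] = {block_addr} if is_write else set()
        return "miss"

    def _evict_page(self):
        # Clean blocks are dropped; dirty blocks move to the dirty block buffer.
        _, dirty = self.pages.popitem(last=False)
        for addr in sorted(dirty):
            self._buffer_dirty_block(addr)

    def _buffer_dirty_block(self, addr):
        if len(self.dirty_blocks) >= self.dirty_block_slots:
            # Only an overflow of the buffer costs a PCM write.
            self.dirty_blocks.popitem(last=False)
            self.pcm_block_writes += 1
        self.dirty_blocks[addr] = True
```

In this sketch, a write to a block whose page has already been evicted still hits in DRAM as long as the block remains in the dirty block buffer, which is the write-filtering effect the abstract attributes to that region.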

Index Terms

An Energy-Efficient DRAM Cache Architecture for Mobile Platforms With PCM-Based Main Memory
1. Computer systems organization
  1. Embedded and cyber-physical systems
    1. Embedded systems
2. Information systems
  1. Information storage systems
    1. Information storage technologies
      1. Storage class memory
        Phase change memory

Information

Published In

ACM Transactions on Embedded Computing Systems Volume 21, Issue 1

January 2022

288 pages

ISSN:1539-9087

EISSN:1558-3465

DOI:10.1145/3505211

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 14 January 2022

Accepted: 01 February 2021

Revised: 01 January 2021

Received: 01 November 2019

Published in TECS Volume 21, Issue 1

Qualifiers

Research-article
Refereed

Funding Sources

Samsung Electronics (Memory Business)
National Research Foundation of Korea (NRF)
