An Energy-Efficient DRAM Cache Architecture for Mobile Platforms With PCM-Based Main Memory

Published: 14 January 2022

Abstract

A long battery life is a first-class design objective for mobile devices, and main memory accounts for a major portion of total energy consumption. Moreover, the energy consumption of memory is expected to increase further with ever-growing demands for bandwidth and capacity. A hybrid memory system with both DRAM and PCM can be an attractive solution to provide additional capacity and reduce standby energy. Although PCM provides much greater density than DRAM, its longer access latency and limited write endurance make it challenging to architect as main memory.
To address this challenge, this article introduces CAMP, a novel DRAM cache architecture for mobile platforms with PCM-based main memory. A DRAM cache in this environment must filter most of the writes to PCM to extend its lifetime, and must deliver high efficiency even at the relatively small DRAM cache sizes that mobile platforms can afford. To meet these requirements, CAMP divides the DRAM space into two regions: a page cache for exploiting spatial locality in a bandwidth-efficient manner and a dirty block buffer for maximally filtering writes. CAMP improves performance and energy-delay product by 29.2% and 45.2%, respectively, over the baseline PCM-oblivious DRAM cache, while increasing PCM lifetime by 2.7×. CAMP also improves performance and energy-delay product by 29.3% and 41.5%, respectively, over the state-of-the-art design with a dirty block buffer, while increasing PCM lifetime by 2.5×.
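
The two-region idea can be illustrated with a toy model. The sketch below is not CAMP's actual design: the abstract does not specify replacement policies, sizes, or the interaction between the regions, so the LRU policies, the 64-blocks-per-page geometry, and every name here (`TwoRegionCache`, `page_slots`, `dirty_block_slots`) are assumptions made for illustration. It only counts PCM accesses to show how a dirty block buffer can coalesce repeated writes while a page-granularity region serves reads with spatial locality.

```python
from collections import OrderedDict

BLOCKS_PER_PAGE = 64  # assumption: 4 KiB pages made of 64 B blocks


class TwoRegionCache:
    """Toy model (hypothetical): a clean LRU page cache plus an LRU dirty block buffer.

    Only hit/miss counts are tracked, not data, so keeping the two
    regions coherent is out of scope for this sketch.
    """

    def __init__(self, page_slots, dirty_block_slots):
        self.page_slots = page_slots
        self.dirty_block_slots = dirty_block_slots
        self.pages = OrderedDict()  # page number -> None, oldest first
        self.dirty = OrderedDict()  # block number -> None, oldest first
        self.pcm_reads = 0          # page fills fetched from PCM
        self.pcm_writes = 0         # dirty blocks written back to PCM

    def read(self, block):
        # A dirty block holds the newest copy, so check that region first.
        if block in self.dirty:
            self.dirty.move_to_end(block)
            return
        page = block // BLOCKS_PER_PAGE
        if page in self.pages:
            self.pages.move_to_end(page)
            return
        # Miss: fetch the whole page once to exploit spatial locality.
        self.pcm_reads += 1
        self.pages[page] = None
        if len(self.pages) > self.page_slots:
            # Pages are never dirtied in this model, so eviction costs no PCM write.
            self.pages.popitem(last=False)

    def write(self, block):
        # Repeated writes to a buffered block are absorbed in DRAM.
        if block in self.dirty:
            self.dirty.move_to_end(block)
            return
        self.dirty[block] = None
        if len(self.dirty) > self.dirty_block_slots:
            # Only an evicted dirty block reaches PCM.
            self.dirty.popitem(last=False)
            self.pcm_writes += 1


if __name__ == "__main__":
    cache = TwoRegionCache(page_slots=2, dirty_block_slots=2)
    for block in [0, 1, 0, 1, 0, 1]:
        cache.write(block)
    print("PCM writes after six buffered writes:", cache.pcm_writes)
```

In this model, six writes alternating between two blocks cause no PCM writes because both blocks fit in the buffer; a PCM write occurs only when a third distinct block forces an eviction.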

Published In

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 1
January 2022
288 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3505211

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 January 2022
Accepted: 01 February 2021
Revised: 01 January 2021
Received: 01 November 2019
Published in TECS Volume 21, Issue 1

Author Tags

  1. DRAM cache
  2. phase change memory (PCM)
  3. hybrid memory system

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • Samsung Electronics (Memory Business)
  • National Research Foundation of Korea (NRF)

Cited By

  • (2024) POEM: Performance Optimization and Endurance Management for Non-volatile Caches. ACM Transactions on Design Automation of Electronic Systems 29:5 (1-36). DOI: 10.1145/3653452. Online publication date: 27-Mar-2024
  • (2024) A case study on integrating energy‐efficient technologies for display devices. Electronics Letters 60:15. DOI: 10.1049/ell2.13304. Online publication date: Aug-2024
  • (2024) Wear-leveling-aware buddy-like memory allocator for persistent memory file systems. Future Generation Computer Systems 150:C (37-48). DOI: 10.1016/j.future.2023.08.013. Online publication date: 1-Jan-2024
  • (2023) Hamming Tree: The Case for Energy-Aware Indexing for NVMs. Proceedings of the ACM on Management of Data 1:2 (1-27). DOI: 10.1145/3589327. Online publication date: 20-Jun-2023
  • (2023) MRAM-Based Cache System Design and Policy Optimization for RISC-V Multi-Core CPUs. IEEE Transactions on Magnetics 59:6 (1-14). DOI: 10.1109/TMAG.2023.3267467. Online publication date: Jun-2023
  • (2023) Analysis of power-performance trade-offs in DRAM-NVM based hybrid main memory. INTERNATIONAL CONFERENCE ON APPLIED COMPUTATIONAL INTELLIGENCE AND ANALYTICS (ACIA-2022) (040011). DOI: 10.1063/5.0133358. Online publication date: 2023
  • (2023) A feasibility analysis of image approximation with image quality assessments. IET Image Processing. DOI: 10.1049/ipr2.12993. Online publication date: 14-Nov-2023
  • (2022) Challenges and future directions for energy, latency, and lifetime improvements in NVMs. Distributed and Parallel Databases 41:3 (163-189). DOI: 10.1007/s10619-022-07421-x. Online publication date: 21-Sep-2022
