Self-learnable Cluster-based Prefetching Method for DRAM-Flash Hybrid Main Memory Architecture

Authors:

Young-Sun Youn,

Bernd Burgstaller,

Shin-Dug Kim

ACM Journal on Emerging Technologies in Computing Systems (JETC), Volume 15, Issue 1

Article No.: 10, Pages 1 - 21

https://doi.org/10.1145/3284932

Published: 09 January 2019

Abstract

This article presents a novel prefetching mechanism for memory-intensive workloads in large-scale data centers. We design a hybrid main memory architecture that combines NAND flash with dynamic random-access memory (DRAM) as a cost-effective way to resolve the scalability and power consumption problems of a DRAM-only model. To maximize the performance of the DRAM, we design a smart prefetching mechanism, based on a cluster-management scheme, that copes with the dynamically varying and complex access patterns of any given application. We propose a new concept for page management, called a cluster, which is used to prefetch data in our hybrid memory architecture. Cluster management relies on a self-learning scheme that adapts to dynamically changing access patterns by considering correlations between missed pages. Experimental results show that overall performance improves significantly in terms of hit rate, execution time, and energy consumption. Specifically, our proposed model enhances the hit rate by 15% and reduces the execution time by a factor of 1.75. In addition, it reduces energy consumption by around 48% by cutting the number of flushed pages to about an eighth of that in a conventional system.
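
The abstract does not spell out how clusters are formed, so the following is only a minimal sketch of the general idea of learning correlations between missed pages and prefetching correlated pages together. It is not the authors' algorithm. Every name here (`ClusterPrefetcher`, `followers`, `simulate`), the LRU replacement policy, the "pages that missed after this page" correlation rule, and the trace are assumptions made for illustration.

```python
from collections import Counter, OrderedDict, defaultdict


class ClusterPrefetcher:
    """Toy LRU-managed DRAM in front of flash with learned page clusters.

    Hypothetical illustration only; not the algorithm from the article.
    """

    def __init__(self, dram_pages, cluster_size):
        self.dram_pages = dram_pages            # DRAM capacity in pages
        self.cluster_size = cluster_size        # pages prefetched per miss
        self.dram = OrderedDict()               # resident pages, oldest first
        self.followers = defaultdict(Counter)   # missed page -> later misses
        self.last_miss = None
        self.hits = 0
        self.misses = 0

    def _load(self, page):
        """Bring a page from flash into DRAM, evicting LRU pages if full."""
        self.dram[page] = True
        self.dram.move_to_end(page)
        while len(self.dram) > self.dram_pages:
            self.dram.popitem(last=False)

    def access(self, page):
        """Return True on a DRAM hit, False on a miss served from flash."""
        if page in self.dram:
            self.hits += 1
            self.dram.move_to_end(page)
            return True
        self.misses += 1
        # Self-learning step: correlate this miss with the previous miss.
        if self.last_miss is not None:
            self.followers[self.last_miss][page] += 1
        self.last_miss = page
        # Prefetch the cluster: pages that most often missed after this one.
        for follower, _count in self.followers[page].most_common(self.cluster_size):
            if follower not in self.dram:
                self._load(follower)
        self._load(page)  # demanded page goes in last, so it is most recent
        return False


def simulate(trace, dram_pages, cluster_size):
    """Replay a page trace and return (hits, misses)."""
    cache = ClusterPrefetcher(dram_pages, cluster_size)
    for page in trace:
        cache.access(page)
    return cache.hits, cache.misses


if __name__ == "__main__":
    trace = [1, 2, 3, 7, 8, 9] * 3              # made-up cyclic working set
    print("no prefetch:", simulate(trace, dram_pages=4, cluster_size=0))
    print("clusters   :", simulate(trace, dram_pages=4, cluster_size=2))
```

On this made-up trace, the six-page cycle does not fit in a four-page DRAM, so plain LRU (`cluster_size=0`) misses on every access, while the learned clusters turn some of those misses into hits. The figures it prints describe only this toy model and say nothing about the results reported in the article.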

Cited By

A. Nalajala, T. Ragunathan, R. Naha, and S. Battula. 2024. Application and user-specific data prefetching and parallel read algorithms for distributed file systems. Cluster Computing 27, 3 (2024), 3593--3613. Online publication date: 1-Jun-2024. https://dl.acm.org/doi/10.1007/s10586-023-04160-1

A. Nalajala, T. Ragunathan, R. Naha, and S. Battula. 2023. HRFP: Highly relevant frequent patterns-based prefetching and caching algorithms for distributed file systems. Electronics 12, 5 (2023), 1183. Online publication date: 1-Mar-2023. https://doi.org/10.3390/electronics12051183

Published In

ACM Journal on Emerging Technologies in Computing Systems Volume 15, Issue 1

Special Issue on Emerging Networks-on-Chip and Regular Papers

January 2019

283 pages

ISSN:1550-4832

EISSN:1550-4840

DOI:10.1145/3303864

Editor:
Yuan Xie
University of California, Santa Barbara, USA

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 09 January 2019

Accepted: 01 October 2018

Revised: 01 April 2018

Received: 01 July 2017

Qualifiers

Research-article
Research
Refereed

Funding Sources

Graduate School of YONSEI University Research Scholarship Grants in 2017
Samsung Electronics
National Research Foundation of Korea
