
Self-learnable Cluster-based Prefetching Method for DRAM-Flash Hybrid Main Memory Architecture

Published: 09 January 2019

Abstract

This article presents a novel prefetching mechanism for memory-intensive workloads in large-scale data centers. We design a hybrid main memory architecture that combines NAND flash with dynamic random-access memory (DRAM) as a cost-effective way to resolve the scalability and power consumption problems of a DRAM-based model. To maximize the performance of the DRAM, we design a smart prefetching mechanism, based on a cluster-management scheme, that copes with the dynamically varying and complex access patterns of any given application. We propose a new unit of page management, called a cluster, which is used to prefetch data in the hybrid memory architecture. Cluster management relies on a self-learning scheme that follows dynamically changing access patterns by considering correlations between missed pages. Experimental results show that overall performance improves significantly in terms of hit rate, execution time, and energy consumption. Specifically, the proposed model improves the hit rate by 15% and reduces the execution time by a factor of 1.75. In addition, it reduces energy consumption by around 48% by cutting the number of flushed pages to about an eighth of that in a conventional system.
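The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of grouping correlated missed pages into clusters and prefetching a cluster on a miss. It is not the authors' method. The class name `ClusterPrefetcher`, the LRU replacement policy, the `threshold` and `max_cluster` parameters, and the rule "page B joins the cluster of page A once a miss on B has directly followed a miss on A at least `threshold` times" are all assumptions made for illustration.

```python
from collections import OrderedDict, defaultdict


class ClusterPrefetcher:
    """Toy LRU page cache (standing in for DRAM) with miss-correlation clusters.

    Hypothetical illustration only; names and policies are assumptions.
    """

    def __init__(self, capacity, threshold=2, max_cluster=2):
        self.capacity = capacity          # number of page slots in the cache
        self.threshold = threshold        # co-miss count needed to join a cluster
        self.max_cluster = max_cluster    # upper bound on prefetched pages per miss
        self.cache = OrderedDict()        # page -> True, ordered from LRU to MRU
        # follow[a][b] counts how often a miss on b directly followed a miss on a
        self.follow = defaultdict(lambda: defaultdict(int))
        self.last_miss = None
        self.hits = 0
        self.misses = 0

    def _install(self, page):
        """Place a page in the cache as most recently used, evicting LRU if full."""
        if page in self.cache:
            self.cache.move_to_end(page)
            return
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)
        self.cache[page] = True

    def cluster_of(self, page):
        """Pages correlated strongly enough with `page`, strongest first."""
        successors = self.follow[page]
        ranked = sorted(successors, key=successors.get, reverse=True)
        strong = [p for p in ranked if successors[p] >= self.threshold]
        return strong[: self.max_cluster]

    def access(self, page):
        """Access one page; return True on a hit and False on a miss."""
        if page in self.cache:
            self.hits += 1
            self.cache.move_to_end(page)
            return True
        self.misses += 1
        # Self-learning step: correlate this miss with the previous miss.
        if self.last_miss is not None:
            self.follow[self.last_miss][page] += 1
        self.last_miss = page
        # Prefetch the learned cluster, then install the demanded page last
        # so that it ends up as the most recently used entry.
        for neighbour in self.cluster_of(page):
            self._install(neighbour)
        self._install(page)
        return False


def run(trace, **kwargs):
    """Replay a page trace and return the resulting prefetcher."""
    prefetcher = ClusterPrefetcher(**kwargs)
    for page in trace:
        prefetcher.access(page)
    return prefetcher


if __name__ == "__main__":
    trace = [1, 2, 3, 10, 11, 12, 13] * 4   # cyclic pattern larger than the cache
    learned = run(trace, capacity=4)
    plain_lru = run(trace, capacity=4, threshold=10**9)  # clusters never form
    print("hits with clusters:", learned.hits, "plain LRU:", plain_lru.hits)
```

On a cyclic trace that is larger than the cache, plain LRU misses on every access, whereas the learned clusters let some pages arrive before they are requested. The sketch ignores flash write costs, page flushing, and energy, all of which the article evaluates.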

Cited By

  • (2024) Application and user-specific data prefetching and parallel read algorithms for distributed file systems. Cluster Computing 27:3 (3593-3613). DOI: 10.1007/s10586-023-04160-1. Online publication date: 1-Jun-2024
  • (2023) HRFP: Highly Relevant Frequent Patterns-Based Prefetching and Caching Algorithms for Distributed File Systems. Electronics 12:5 (1183). DOI: 10.3390/electronics12051183. Online publication date: 1-Mar-2023

Published In

ACM Journal on Emerging Technologies in Computing Systems  Volume 15, Issue 1
Special Issue on Emerging Networks-on-Chip and Regular Papers
January 2019
283 pages
ISSN:1550-4832
EISSN:1550-4840
DOI:10.1145/3303864
  • Editor: Yuan Xie
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 January 2019
Accepted: 01 October 2018
Revised: 01 April 2018
Received: 01 July 2017
Published in JETC Volume 15, Issue 1

Author Tags

  1. Memory system
  2. high performance computing
  3. prefetching technique

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 24
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 01 Sep 2024
