research-article

Mithril: mining sporadic associations for cache prefetching

Authors:

Trausti Sæmundsson,

Ymir Vigfusson

SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing

Pages 66 - 79

https://doi.org/10.1145/3127479.3131210

Published: 24 September 2017

Abstract

The growing pressure on cloud application scalability has accentuated storage performance as a critical bottleneck. Although cache replacement algorithms have been extensively studied, cache prefetching, which reduces latency by retrieving items before they are actually requested, remains underexplored. Existing approaches to history-based prefetching, in particular, provide too little benefit in real systems to justify the resources they consume.

We propose Mithril, a prefetching layer that efficiently exploits historical patterns in cache request associations. Mithril is inspired by sporadic association rule mining and relies only on the timestamps of requests. Through an evaluation of 135 block-storage traces, we show that Mithril is effective, delivering an average 55% hit ratio increase over LRU and Probability Graph and a 36% hit ratio gain over AMP, at reasonable cost. Finally, we demonstrate that the improvement comes from Mithril's ability to capture mid-frequency blocks.
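
To make the idea of mining associations from request timestamps alone more concrete, here is a minimal Python sketch. It is not Mithril's actual algorithm, which the abstract does not spell out. The function name `mine_associations`, the parameters `min_support`, `max_support`, and `lookahead`, and the use of the trace index as a logical timestamp are all assumptions made for illustration.

```python
from collections import defaultdict


def mine_associations(trace, min_support=2, max_support=4, lookahead=2):
    """Pair up blocks whose requests repeatedly occur close together in time.

    Illustrative sketch only (hypothetical names and thresholds).
    trace: list of block ids; the list index serves as a logical timestamp.
    Returns a dict mapping a block to one block that tends to follow it.
    """
    timestamps = defaultdict(list)
    for ts, block in enumerate(trace):
        timestamps[block].append(ts)

    # Keep only mid-frequency blocks: requested often enough to show a
    # pattern, but not so often that they count as hot blocks.
    candidates = sorted(
        (
            (ts_list, block)
            for block, ts_list in timestamps.items()
            if min_support <= len(ts_list) <= max_support
        ),
        key=lambda item: item[0][0],  # order by first request time
    )

    rules = {}
    for i, (ts_a, a) in enumerate(candidates):
        for ts_b, b in candidates[i + 1:]:
            if ts_b[0] - ts_a[0] > lookahead:
                break  # sorted by first timestamp, so the rest start even later
            # Associate a -> b only if every request to a is followed
            # by the matching request to b within the lookahead window.
            if len(ts_a) == len(ts_b) and all(
                0 < tb - ta <= lookahead for ta, tb in zip(ts_a, ts_b)
            ):
                rules.setdefault(a, b)
    return rules


# Blocks 1 and 2 are each requested three times, always back to back.
# Block 9 is too frequent and blocks 3 and 4 are too rare to be candidates.
example_trace = [1, 2, 9, 3, 1, 2, 9, 9, 4, 1, 2, 9, 9, 9]
example_rules = mine_associations(example_trace)
print(example_rules)  # {1: 2}
```

In this sketch a prefetching layer would consult the returned rules on each request and fetch the associated block ahead of time. The frequency bounds exclude both very rare blocks and very hot ones, which is one simple way to target mid-frequency blocks.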

Published In

SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing

September 2017

672 pages

ISBN:9781450350280

DOI:10.1145/3127479

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

SoCC '17

SoCC '17: ACM Symposium on Cloud Computing

September 24 - 27, 2017

Santa Clara, California

Acceptance Rates

Overall Acceptance Rate 169 of 722 submissions, 23%
