DOI: 10.5555/3488733.3488739

Desperately seeking ... optimal multi-tier cache configurations

Published: 13 July 2020

Abstract

Modern cache hierarchies are tangled webs of complexity. Multiple tiers of heterogeneous physical and virtual devices, with many configurable parameters, all contend to optimally serve swarms of requests between local and remote applications. The challenge of effectively designing these systems is exacerbated by continuous advances in hardware, firmware, innovation in cache eviction algorithms, and evolving workloads and access patterns. This rapidly expanding configuration space has made it costly and time-consuming to physically experiment with numerous cache configurations for even a single stable workload. Current cache evaluation techniques (e.g., Miss Ratio Curves) are short-sighted: they analyze only a single tier of cache, focus primarily on performance, and fail to examine the critical relationships between metrics like throughput and monetary cost. Publicly available I/O cache simulators are also lacking: they can only simulate a fixed or limited number of cache tiers, are missing key features, or offer limited analyses.
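
As a point of reference for the single-tier limitation described above, a Miss Ratio Curve can be sketched by replaying a trace against one LRU cache at several sizes. This is a minimal illustration only: the function names `lru_miss_ratio` and `miss_ratio_curve` are hypothetical, LRU is an assumed eviction policy, and every object is assumed to occupy one cache slot. It is not the method used by any particular tool.

```python
from collections import OrderedDict


def lru_miss_ratio(trace, size):
    """Replay `trace` against a single LRU cache holding `size` objects."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for key in trace:
        if key in cache:
            cache.move_to_end(key)  # hit: mark as most recently used
        else:
            misses += 1
            cache[key] = True
            if len(cache) > size:
                cache.popitem(last=False)  # evict the least recently used key
    return misses / len(trace)


def miss_ratio_curve(trace, sizes):
    """Map each candidate cache size to its miss ratio (one tier only)."""
    return {size: lru_miss_ratio(trace, size) for size in sizes}


if __name__ == "__main__":
    demo_trace = [1, 2, 3, 1, 2, 3, 4, 1]
    for size, ratio in miss_ratio_curve(demo_trace, [1, 3, 4]).items():
        print(size, ratio)
```

The curve relates cache size to miss ratio for one tier only; it carries no information about lower tiers, throughput, or purchase cost.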
It is our position that best practices in cache analysis should include the evaluation of multi-tier configurations, coupled with more comprehensive metrics that reveal critical design trade-offs, especially monetary costs. We are developing an n-level I/O cache simulator that is general enough to model any cache hierarchy, captures many metrics, provides a robust set of analysis features, and is easily extendable to facilitate experimental research or production-level provisioning. To demonstrate the value of our proposed metrics and simulator, we extended an existing cache simulator (PyMimircache). We present several interesting and counter-intuitive results in this paper.
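
The idea of evaluating a multi-tier configuration on both performance and monetary cost can be sketched as follows. This is a toy model under stated assumptions, not the authors' simulator and not the PyMimircache API: the `Tier` class, the `simulate` function, the per-tier latency and cost figures, the LRU eviction policy, and the inclusive promotion of missed objects into every upper tier are all hypothetical choices made for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One cache level (hypothetical model; all values are assumptions)."""
    name: str
    capacity: int          # number of objects the tier can hold
    latency_us: float      # time charged for each lookup in this tier
    cost_per_slot: float   # assumed purchase cost per object slot, in dollars
    hits: int = 0
    _data: OrderedDict = field(default_factory=OrderedDict)

    def lookup(self, key):
        if key in self._data:
            self._data.move_to_end(key)  # refresh LRU position
            self.hits += 1
            return True
        return False

    def insert(self, key):
        self._data[key] = True
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


def simulate(trace, tiers, backend_latency_us):
    """Replay a non-empty trace through an n-level hierarchy, fastest tier first."""
    total_latency = 0.0
    for key in trace:
        for depth, tier in enumerate(tiers):
            total_latency += tier.latency_us
            if tier.lookup(key):
                break
        else:
            # Missed in every tier: fetch from the backing store.
            total_latency += backend_latency_us
            depth = len(tiers)
        # Inclusive policy: copy the object into every tier that missed.
        for tier in tiers[:depth]:
            tier.insert(key)
    n = len(trace)
    return {
        "avg_latency_us": total_latency / n,
        "cost": sum(t.capacity * t.cost_per_slot for t in tiers),
        "hit_ratio": {t.name: t.hits / n for t in tiers},
    }


if __name__ == "__main__":
    tiers = [Tier("dram", 1, 1.0, 10.0), Tier("ssd", 2, 10.0, 1.0)]
    print(simulate(["a", "b", "a", "b"], tiers, backend_latency_us=100.0))
```

Running the same trace over several candidate tier lists and comparing `avg_latency_us` against `cost` gives a simple view of the performance versus cost trade-off that a single-tier miss ratio cannot show.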

Published In

HotStorage '20: Proceedings of the 12th USENIX Conference on Hot Topics in Storage and File Systems
July 2020
12 pages

Sponsors

• ORACLE
• VMware

Publisher

USENIX Association, United States

Qualifiers

• Research-article

Acceptance Rates

Overall Acceptance Rate 34 of 87 submissions, 39%
