Cache What You Need to Cache: Reducing Write Traffic in Cloud Cache via “One-Time-Access-Exclusion” Policy

Published: 16 July 2020

Abstract

    SSDs play an important role in caching systems because of their high performance-to-cost ratio. Because the cache is typically smaller than the backend storage by an order of magnitude or more, the write density (defined as writes per unit time and space) of an SSD cache is far higher than that of HDD storage, which poses a serious challenge to SSD lifetime. Meanwhile, under social network workloads, many writes to the SSD cache are unnecessary. For example, our study of Tencent’s photo caching shows that about 61% of all photos are accessed only once, yet they are still swapped in and out of the cache. If we can predict such photos proactively and keep them out of the cache, we can eliminate unnecessary SSD cache writes and improve cache space utilization.
    To cope with this challenge, we put forward a “one-time-access criteria” that is applied to the cache space and further propose a “one-time-access-exclusion” policy. Based on these two techniques, we design a prediction-based classifier to facilitate the policy. Unlike state-of-the-art history-based predictions, our prediction is non-history oriented, which makes good prediction accuracy hard to achieve. To address this issue, we integrate a decision tree into the classifier, extract social-related information as classification features, and apply cost-sensitive learning to improve classification precision. With these techniques, we attain a prediction accuracy greater than 80%. Experimental results show that the one-time-access-exclusion approach improves cache performance in most aspects. Taking LRU as an example, applying our approach improves the hit rate by 4.4%, decreases cache writes by 56.8%, and cuts the average access latency by 5.5%.
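
The exclusion idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `ExclusionLRU`, the helper `replay`, the toy trace, and the predictor `looks_one_time` are all hypothetical names introduced here. In particular, the predictor is a stand-in oracle that inspects the key, whereas the paper describes a decision-tree classifier over social-related features.

```python
from collections import OrderedDict
from typing import Callable, Hashable, Iterable


class ExclusionLRU:
    """LRU cache with an admission filter (illustrative sketch only)."""

    def __init__(self, capacity: int,
                 predict_one_time: Callable[[Hashable], bool]) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Hypothetical predictor: True means "expected to be accessed once".
        self.predict_one_time = predict_one_time
        self.items: "OrderedDict[Hashable, object]" = OrderedDict()
        self.hits = 0
        self.misses = 0
        self.writes = 0  # number of objects written into the cache

    def get(self, key: Hashable, load: Callable[[Hashable], object]) -> object:
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = load(key)  # fetch from the backend store
        if self.predict_one_time(key):
            return value  # excluded: served without a cache write
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
        self.items[key] = value
        self.writes += 1
        return value


def replay(trace: Iterable[Hashable], capacity: int,
           predict_one_time: Callable[[Hashable], bool]) -> ExclusionLRU:
    """Replay a key trace through the cache and return it with its counters."""
    cache = ExclusionLRU(capacity, predict_one_time)
    for key in trace:
        cache.get(key, load=lambda k: f"data:{k}")
    return cache


# Toy trace (an assumption): two hot keys interleaved with one-time keys.
TRACE = ["a", "b", "once0", "a", "b", "once1", "a", "b", "once2", "a", "b"]


def looks_one_time(key: Hashable) -> bool:
    """Stand-in oracle; a real predictor would use learned features."""
    return str(key).startswith("once")


if __name__ == "__main__":
    plain = replay(TRACE, 2, lambda key: False)  # admit everything
    filtered = replay(TRACE, 2, looks_one_time)  # exclude predicted one-timers
    print("plain LRU:", plain.hits, "hits,", plain.writes, "writes")
    print("with exclusion:", filtered.hits, "hits,", filtered.writes, "writes")
```

On this toy trace with a capacity of 2, plain LRU records 0 hits and 11 cache writes, whereas the filtered cache records 6 hits and 2 writes, because the one-time keys no longer evict the hot ones. These numbers describe only the toy trace and a perfect predictor; they say nothing about the results reported in the abstract.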

      Published In

      ACM Transactions on Storage, Volume 16, Issue 3
      August 2020
      150 pages
      ISSN: 1553-3077
      EISSN: 1553-3093
      DOI: 10.1145/3410885
      Editor: Sam H. Noh

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 July 2020
      Online AM: 07 May 2020
      Accepted: 01 April 2020
      Revised: 01 February 2020
      Received: 01 June 2019
      Published in TOS Volume 16, Issue 3

      Author Tags

      1. SSD
      2. machine learning
      3. photo caching
      4. social network

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Article Metrics

      • Downloads (last 12 months): 67
      • Downloads (last 6 weeks): 17
      Reflects downloads up to 09 Aug 2024

      Cited By

      • (2023) FIFO queues are all you need for cache eviction. Proceedings of the 29th Symposium on Operating Systems Principles, 130--149. DOI: 10.1145/3600006.3613147. Online publication date: 23-Oct-2023.
      • (2022) GHOSM: Graph-based Hybrid Outline and Skeleton Modelling for Shape Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications 19, 2s, 1--23. DOI: 10.1145/3554922. Online publication date: 4-Aug-2022.
      • (2022) Fine-Grained Fragment Diffusion for Cross Domain Crowd Counting. Proceedings of the 30th ACM International Conference on Multimedia, 5659--5668. DOI: 10.1145/3503161.3548298. Online publication date: 10-Oct-2022.
      • (2022) SS-LRU. Proceedings of the 59th ACM/IEEE Design Automation Conference, 397--402. DOI: 10.1145/3489517.3530469. Online publication date: 10-Jul-2022.
      • (2022) A survey on AI for storage. CCF Transactions on High Performance Computing 4, 3, 233--264. DOI: 10.1007/s42514-022-00101-3. Online publication date: 23-May-2022.
