Abstract
In most storage systems, the storage nodes store data on a local filesystem. Thus, unless they have a dedicated caching layer, they benefit from the usual filesystem cache held in the host's free memory. In erasure-coded storage systems, however, caching is effective only if all the systematic fragments of an object are in the cache. In this work, we propose a new caching policy that adapts traditional methods to erasure-coded storage systems. The main idea of our solution is to cache a full replica of an object rather than its individual fragments. A simulation-based evaluation shows that our full-replica solution improves the cache hit ratio and reduces the cache waste ratio compared to the traditional caching method. An experimental evaluation of our implementation confirms these results and further shows that cache hits on full replicas yield better request response times.
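The difference between fragment-level and full-object caching can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LRUCache` class, the helper functions, the choice of LRU eviction, the cache sizes, and the object names are all assumptions made for illustration. The sketch assumes two systematic fragments per object, each cached independently on its own node, and shows how unrelated traffic on one node turns a later read of the object into a miss even though the other node still holds its fragment.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache holding up to `capacity` keys (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert the key and evict the LRU entry if full."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)
        return False


def fragment_read(node_caches, obj):
    """Read `obj` under per-node fragment caching (assumed model).

    Every node is queried for its fragment; the read counts as a cache hit
    only if all systematic fragments are found in their node's cache.
    """
    hits = [cache.access((obj, i)) for i, cache in enumerate(node_caches)]
    return all(hits)


def full_replica_read(replica_cache, obj):
    """Read `obj` from a cache that stores whole objects (assumed model)."""
    return replica_cache.access(obj)


# Fragment-level caching: k = 2 nodes, each able to cache 2 fragments.
nodes = [LRUCache(2) for _ in range(2)]
fragment_read(nodes, "a")        # cold miss; both fragments of "a" are now cached
nodes[1].access(("x", 1))        # unrelated fragment traffic on node 1 ...
nodes[1].access(("y", 1))        # ... evicts the fragment of "a" held there
second_fragment_read = fragment_read(nodes, "a")   # node 0 hits, node 1 misses

# Full-object caching: one cache entry per object.
replica = LRUCache(1)
full_replica_read(replica, "a")  # cold miss; the whole object is now cached
second_replica_read = full_replica_read(replica, "a")

print(second_fragment_read, second_replica_read)   # False True
```

In this toy run the fragment of "a" kept on node 0 occupies cache space without contributing to a hit, which is the kind of cache waste the abstract refers to; caching the object as a single unit avoids that partial state.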
About this article
Cite this article
Ruty, G., Baccouch, H., Nguyen, V. et al. Popularity-based full replica caching for erasure-coded distributed storage systems. Cluster Comput 24, 3173–3186 (2021). https://doi.org/10.1007/s10586-021-03317-0
DOI: https://doi.org/10.1007/s10586-021-03317-0