
ICCG: low-cost and efficient consistency with adaptive synchronization for metadata replication

Published: 11 November 2024

Abstract

The rapid growth in the storage scale of wide-area distributed file systems (DFSes) calls for fast and scalable metadata management. Metadata replication is a widely used technique for improving the performance and scalability of metadata management. Because file systems must satisfy POSIX requirements, many existing metadata management techniques adopt costly designs to preserve metadata consistency, which leads to unacceptable performance overhead. We propose a new metadata consistency maintenance method, ICCG, which combines incremental consistency guaranteed directory tree synchronization (ICGDT) with causal consistency guaranteed replica index synchronization (CCGRI) to preserve system performance without sacrificing metadata consistency. ICGDT applies a flexible consistency scheme based on the state of files and directories, tracked in a conflict state tree, to provide incremental consistency for metadata that satisfies both consistency and performance requirements. CCGRI ensures low-latency, consistent access to data by establishing causal consistency for replica indexes through multi-version extent trees and logical time. Experimental results demonstrate the effectiveness of our methods: compared with the strong consistency policies widely used in modern DFSes, they significantly improve system performance. For example, in file creation, ICCG improves the performance of directory tree operations by at least 36.4 times.
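
The abstract names logical time and multi-version structures as the basis for causal consistency of replica indexes but gives no detail. The following is a minimal sketch of that general idea only, not the authors' CCGRI design: it uses a Lamport-style logical clock and a flat versioned map in place of the paper's multi-version extent trees. The names LamportClock, VersionedIndex, put, get, and demo, and the path "/data/f", are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LamportClock:
    """Logical clock: a counter that only moves forward."""
    time: int = 0

    def tick(self) -> int:
        # Local event (for example, a replica index update): advance by one.
        self.time += 1
        return self.time

    def merge(self, received: int) -> int:
        # On receiving a message, move past the sender's timestamp.
        self.time = max(self.time, received) + 1
        return self.time


@dataclass
class VersionedIndex:
    """Hypothetical multi-version replica index: path -> [(timestamp, locations)]."""
    versions: dict = field(default_factory=dict)

    def put(self, path: str, ts: int, locations: list) -> None:
        # Store a new version and keep versions ordered by logical time.
        entries = self.versions.setdefault(path, [])
        entries.append((ts, list(locations)))
        entries.sort(key=lambda e: e[0])

    def get(self, path: str, ts: int):
        # Return the newest version whose timestamp does not exceed ts.
        visible = [locs for t, locs in self.versions.get(path, []) if t <= ts]
        return visible[-1] if visible else None


def demo():
    site_a, site_b = LamportClock(), LamportClock()
    index_b = VersionedIndex()

    t1 = site_a.tick()                      # site A records a new replica
    index_b.put("/data/f", t1, ["A"])
    t2 = site_b.merge(t1)                   # site B learns of A's update
    t3 = site_b.tick()                      # site B adds its own replica
    index_b.put("/data/f", t3, ["A", "B"])
    return t1, t2, t3, index_b.get("/data/f", t2), index_b.get("/data/f", t3)
```

In this sketch a reader holding logical time t2 sees only the version written by site A, while a reader at t3 sees both replicas, so an update is never visible before the update it depends on.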


Published In

Frontiers of Computer Science: Selected Publications from Chinese Universities, Volume 19, Issue 1
Jan 2025
170 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 11 November 2024
Accepted: 05 December 2023
Received: 19 December 2022

Author Tags

  1. metadata management
  2. metadata replication
  3. consistency
  4. directory tree
  5. replica index

Qualifiers

  • Research-article
