DOI: 10.5555/1404014.1404030

Measurement and analysis of large-scale network file system workloads

Published: 22 June 2008

Abstract

In this paper we present an analysis of two large-scale network file system workloads. We measured CIFS traffic for two enterprise-class file servers deployed in the NetApp data center over a three-month period. One file server was used by the marketing, sales, and finance departments and the other by the engineering department. Together these systems represent over 22 TB of storage used by over 1,500 employees, making this the first large-scale study of the CIFS protocol.
We analyzed how our network file system workloads compared to those of previous file system trace studies and took an in-depth look at access, usage, and sharing patterns. We found that our workloads were quite different from those previously studied; for example, our analysis found increased read-write file access patterns, decreased read-write ratios, more random file access, and longer file lifetimes. In addition, we found a number of interesting properties regarding file sharing, file re-use, and the access patterns of file types and users, showing that modern file system workloads have changed in the past 5-10 years. This change in workload characteristics has implications for the future design of network file systems, which we describe in the paper.

Published In

ATC'08: USENIX 2008 Annual Technical Conference
June 2008
432 pages

Publisher

USENIX Association

United States
