DOI:10.1145/378420.378824

The structural cause of file size distributions

Published: 01 June 2001

Abstract

We propose a user model that explains the shape of the distribution of file sizes in local file systems and in the World Wide Web. We examine evidence from 562 file systems, 38 web clients and 6 web servers, and find that the model is a good description of these systems. These results cast doubt on the widespread view that the distribution of file sizes is long-tailed and that long-tailed distributions are the cause of self-similarity in the Internet.

References

[1]
M. E. Crovella and A. Bestavros. Self-similarity in world-wide web traffic: evidence and possible causes. In ACM SIGMETRICS'96, pages 160-169, May 1996.
[2]
M. E. Crovella, M. S. Taqqu, and A. Bestavros. Heavy-tailed probability distributions in the World Wide Web. In A Practical Guide To Heavy Tails, pages 3-26. Chapman & Hall, 1998.
[3]
A. Feldmann, A. C. Gilbert, P. Huang, and W. Willinger. Dynamics of IP traffic: a study of the role of variability and the impact of control. In ACM SIGCOMM'99, pages 301-313, 1999.
[4]
G. Irlam. Unix file size survey. http://www.base.com/gordoni/ufs93.html, 1994. Accessed 31 May 2000.
[5]
M. Parulekar and A. M. Makowski. M/G/∞ input process: a versatile class of models for network traffic. Technical Report T.R. 96-59, Institute for Systems Research, 1996.
[6]
V. Paxson and S. Floyd. Wide-area traffic: the failure of Poisson modeling. IEEE/ACM Transactions on Networking, 3:226-244, 1995.
[7]
W. Willinger, M. S. Taqqu, R. Sherman, and D. V. Wilson. Self-similarity through high-variability: statistical analysis of Ethernet LAN traffic at the source level. In ACM SIGCOMM'95, pages 100-113, 1995.

Published In

SIGMETRICS '01: Proceedings of the 2001 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2001
347 pages
ISBN:1581133340
DOI:10.1145/378420
Chairman: Mary Vernon
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. file sizes
  2. long-tailed distributions
  3. self-similarity

Qualifiers

  • Article

Conference

SIGMETRICS '01

Acceptance Rates

SIGMETRICS '01 Paper Acceptance Rate 29 of 233 submissions, 12%;
Overall Acceptance Rate 459 of 2,691 submissions, 17%

Article Metrics

  • Downloads (last 12 months): 9
  • Downloads (last 6 weeks): 0
Reflects downloads up to 28 Feb 2025

Cited By

  • (2019) A generic approach to scheduling and checkpointing workflows. The International Journal of High Performance Computing Applications (109434201986689). DOI:10.1177/1094342019866891. Online publication date: 12-Aug-2019
  • (2017) File systems fated for senescence? nonsense, says science! Proceedings of the 15th Usenix Conference on File and Storage Technologies (45-58). DOI:10.5555/3129633.3129639. Online publication date: 27-Feb-2017
  • (2013) Efficient Operational Management of Enterprise File Server with File Size Distribution Model. IAENG Transactions on Engineering Technologies (599-609). DOI:10.1007/978-94-007-6818-5_42. Online publication date: 12-Sep-2013
  • (2012) Delay tails in MapReduce scheduling. ACM SIGMETRICS Performance Evaluation Review 40:1 (5-16). DOI:10.1145/2318857.2254761. Online publication date: 11-Jun-2012
  • (2012) Delay tails in MapReduce scheduling. Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems (5-16). DOI:10.1145/2254756.2254761. Online publication date: 11-Jun-2012
  • (2011) Double Pareto Lognormal Distributions in Complex Networks. Handbook of Optimization in Complex Networks (55-80). DOI:10.1007/978-1-4614-0754-6_3. Online publication date: 29-Sep-2011
  • (2010) Modulated Branching Processes, Origins of Power Laws, and Queueing Duality. Mathematics of Operations Research 35:4 (807-829). DOI:10.1287/moor.1100.0464. Online publication date: 1-Nov-2010
  • (2009) Internet Search Result Probabilities: Heaps' Law and Word Associativity. Journal of Quantitative Linguistics 16:1 (40-66). DOI:10.1080/09296170802514153. Online publication date: Feb-2009
  • (2008) Distributed, large-scale latent semantic analysis by index interpolation. Proceedings of the 3rd international conference on Scalable information systems (1-10). DOI:10.5555/1459693.1459718. Online publication date: 4-Jun-2008
  • (2007) A five-year study of file-system metadata. ACM Transactions on Storage 3:3 (9-es). DOI:10.1145/1288783.1288788. Online publication date: 1-Oct-2007
