
Understanding I/O Performance Behaviors of Cloud Storage from a Client’s Perspective

Published: 22 May 2017

Abstract

Cloud storage has gained increasing popularity in the past few years. In cloud storage, data is stored in the service provider’s data centers, and users access data via the network. For such a new storage model, our prior wisdom about conventional storage may be neither valid nor applicable. In this article, we present a comprehensive study to gain insight into the unique characteristics of cloud storage and to optimize user experiences with cloud storage from a client’s perspective. Unlike prior measurement work that mostly aims to characterize cloud storage providers or specific client applications, we focus on analyzing the effects of various client-side factors on the user-experienced performance. Through extensive experiments and quantitative analysis, we have obtained several important findings. For example, we find that (1) a proper combination of parallelism and request size can achieve optimized bandwidths, (2) a client’s capabilities and geographical location play an important role in determining the end-to-end user-perceivable performance, and (3) the interference among mixed cloud storage requests may cause performance degradation. Based on our findings, we showcase a sampling- and inference-based method to determine a proper combination for different optimization goals. We further present a set of case studies on client-side chunking and parallelization for typical cloud-based applications. Our studies show that specific attention should be paid to fully exploiting the capabilities of clients and the great potential of cloud storage services.
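To make the idea of combining request size and parallelism concrete, the following is a minimal sketch, not the method used in the article. It splits an object into fixed-size chunks, uploads them concurrently, and measures the observed bandwidth for each combination. The `FakeObjectStore` class, its `put` method, and all function names are hypothetical stand-ins introduced for illustration; a real client would call an actual storage API instead. The search is a plain exhaustive grid over sampled combinations, which is simpler than the sampling- and inference-based method the abstract mentions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Split data into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class FakeObjectStore:
    """Hypothetical in-memory stand-in for a cloud object store."""

    def __init__(self, latency_s=0.001):
        self.latency_s = latency_s  # simulated per-request round-trip time
        self.objects = {}

    def put(self, key, value):
        time.sleep(self.latency_s)  # pretend each request crosses the network
        self.objects[key] = value


def parallel_upload(store, name, data, chunk_size, parallelism):
    """Upload data as chunks using a thread pool; return bandwidth in bytes/s."""
    chunks = split_into_chunks(data, chunk_size)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = [
            pool.submit(store.put, f"{name}.{i}", chunk)
            for i, chunk in enumerate(chunks)
        ]
        for future in futures:
            future.result()  # re-raise any upload error
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return len(data) / elapsed


def sample_best_combination(store, sample, chunk_sizes, parallelisms):
    """Measure every (chunk_size, parallelism) pair on a sample object."""
    results = {}
    for chunk_size in chunk_sizes:
        for parallelism in parallelisms:
            name = f"sample-{chunk_size}-{parallelism}"
            results[(chunk_size, parallelism)] = parallel_upload(
                store, name, sample, chunk_size, parallelism
            )
    best = max(results, key=results.get)
    return best, results


if __name__ == "__main__":
    store = FakeObjectStore()
    best, results = sample_best_combination(
        store, b"x" * 64_000, chunk_sizes=[1_000, 8_000], parallelisms=[1, 4, 16]
    )
    print("best (chunk_size, parallelism):", best)
```

With the simulated store, every request costs the same fixed latency, so larger chunks and more threads tend to win; against a real service the trade-off also depends on the client's CPU, memory, and network, which is why measuring on the client is the point of the sketch.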



Published In

ACM Transactions on Storage, Volume 13, Issue 2
Special Issue on MSST 2016 and Regular Papers
May 2017
199 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3098275
Editor: Sam H. Noh
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 May 2017
Accepted: 01 March 2017
Revised: 01 February 2017
Received: 01 July 2016
Published in TOS Volume 13, Issue 2


Author Tags

  1. Cloud storage
  2. measurement
  3. performance analysis
  4. performance optimization
  5. storage systems

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (last 12 months): 164
  • Downloads (last 6 weeks): 18
Reflects downloads up to 27 Jan 2025


Cited By

  • (2025) ClusPar: A Game-Theoretic Approach for Efficient and Scalable Streaming Edge Partitioning. IEEE Transactions on Computers 74:1 (116-130). DOI: 10.1109/TC.2024.3475568. Online publication date: 1-Jan-2025
  • (2024) IoLens: Visual Analytics System for Exploring Storage I/O Tracking Process. 2024 IEEE 17th Pacific Visualization Conference (PacificVis) (325-330). DOI: 10.1109/PacificVis60374.2024.00046. Online publication date: 23-Apr-2024
  • (2022) BSCache: A Brisk Semantic Caching Scheme for Cloud-based Performance Monitoring Timeseries Systems. Proceedings of the 51st International Conference on Parallel Processing (1-10). DOI: 10.1145/3545008.3546183. Online publication date: 29-Aug-2022
  • (2022) CSEdge: Enabling Collaborative Edge Storage for Multi-Access Edge Computing Based on Blockchain. IEEE Transactions on Parallel and Distributed Systems 33:8 (1873-1887). DOI: 10.1109/TPDS.2021.3131680. Online publication date: 1-Aug-2022
  • (2022) Radio: Reconciling Disk I/O Interference in a Para-virtualized Cloud. 2022 IEEE 15th International Conference on Cloud Computing (CLOUD) (144-156). DOI: 10.1109/CLOUD55607.2022.00034. Online publication date: Jul-2022
  • (2021) Multi-objective Optimization of Data Placement in a Storage-as-a-Service Federated Cloud. ACM Transactions on Storage 17:3 (1-32). DOI: 10.1145/3452741. Online publication date: 16-Aug-2021
  • (2021) Enabling Conflict-free Collaborations with Cloud Storage Services. 2021 IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS) (615-621). DOI: 10.1109/ICPADS53394.2021.00082. Online publication date: Dec-2021
  • (2021) Modeling Inter-process Dynamics in Competitive Temporal Point Processes. Journal of the Indian Institute of Science. DOI: 10.1007/s41745-021-00224-6. Online publication date: 9-Jul-2021
  • (2019) Learning Network Traffic Dynamics Using Temporal Point Process. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications (1927-1935). DOI: 10.1109/INFOCOM.2019.8737622. Online publication date: Apr-2019
  • (2018) Fine Granularity and Adaptive Cache Update Mechanism for Client Caching. IEEE Systems Journal (1-12). DOI: 10.1109/JSYST.2018.2866905. Online publication date: 2018
