DOI: 10.1145/1807128.1807152
research-article

Benchmarking cloud serving systems with YCSB

Published: 10 June 2010

Abstract

While the use of MapReduce systems (such as Hadoop) for large-scale data analysis has been widely recognized and studied, we have recently seen an explosion in the number of systems developed for cloud data serving. These newer systems address "cloud OLTP" applications, though they typically do not support ACID transactions. Examples of systems proposed for cloud serving use include BigTable, PNUTS, Cassandra, HBase, Azure, CouchDB, SimpleDB, Voldemort, and many others. Further, they are being applied to a diverse range of applications that differ considerably from traditional (e.g., TPC-C-like) serving workloads. The number of emerging cloud serving systems and the wide range of proposed applications, coupled with a lack of apples-to-apples performance comparisons, make it difficult to understand the tradeoffs between systems and the workloads for which they are suited. We present the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report results for four widely used systems: Cassandra, HBase, Yahoo!'s PNUTS, and a simple sharded MySQL implementation. We also hope to foster the development of additional cloud benchmark suites that represent other classes of applications by making our benchmark tool available via open source. In this regard, a key feature of the YCSB framework/tool is that it is extensible: it supports easy definition of new workloads, in addition to making it easy to benchmark new systems.


    Published In

    SoCC '10: Proceedings of the 1st ACM symposium on Cloud computing
    June 2010
    264 pages
    ISBN:9781450300360
    DOI:10.1145/1807128

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. benchmarking
    2. cloud serving database

    Qualifiers

    • Research-article

    Conference

    SOCC '10

    Acceptance Rates

    Overall Acceptance Rate 169 of 722 submissions, 23%
