Performance and Replica Consistency Simulation for Quorum-Based NoSQL System Cassandra

  • Conference paper
Application and Theory of Petri Nets and Concurrency (PETRI NETS 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10258)

Abstract

Distributed NoSQL systems such as Cassandra are popular nowadays. However, configuring these systems to achieve their maximum performance in a given environment is complicated and tedious. This paper focuses on applying a Coloured Petri Net (CPN)-based simulation method to a quorum-based system, Cassandra. By analyzing the read and write processes of Cassandra, we propose a CPN model that can be used for performance analysis, optimization, and replica consistency detection. To help users understand the NoSQL system well, a CPN-based simulator called QuoVis is developed. Using QuoVis, users can visualize the read and write processes of Cassandra, try different hardware parameters for performance simulation, optimize system parameters such as the timeout and the data partitioning strategy, and detect replica consistency. Experiments show that our model fits a real Cassandra cluster well.
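As background for the quorum-based reads and writes mentioned above, the following is a minimal sketch of the standard quorum-overlap condition: with N replicas, a read that contacts R replicas is guaranteed to see the latest acknowledged write to W replicas only when R + W > N. This is not the paper's CPN model or any part of QuoVis; the function names `quorums_always_overlap` and `quorum_size` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def quorum_size(n: int) -> int:
    """Majority quorum for n replicas (floor(n / 2) + 1)."""
    return n // 2 + 1


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Brute-force check that every read set of size r shares at least
    one replica with every write set of size w among n replicas.

    Illustrative helper only; it enumerates all replica subsets, so it
    is meant for small n.
    """
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)  # empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


if __name__ == "__main__":
    n = 3
    q = quorum_size(n)
    # Quorum reads plus quorum writes always intersect.
    print(quorums_always_overlap(n, q, q))
    # Reading and writing a single replica may miss the latest write.
    print(quorums_always_overlap(n, 1, 1))
```

The brute-force check agrees with the closed-form rule R + W > N for small clusters, which is one reason a simulator can expose stale reads when weaker read or write levels are chosen.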

Notes

  1. https://issues.apache.org/jira/browse/CASSANDRA-2434.

  2. How to model the two stages is not shown in this paper, but the source code of all the CPN models in this paper is available on GitHub: https://github.com/jixuan1989/color-petri-net-cassandra.

  3. http://cpntools.org/documentation/tasks/editing/constructs/queuesstacks.

  4. A demo video is available at https://www.youtube.com/watch?v=nvIBDUubFvM.

  5. We found the performance to be very poor when simulating large-scale clusters, and have therefore built a faster execution engine. The engine is not the focus of this paper, so we omit it.

Acknowledgement

The paper is supported by The National Key R&D Program of China (2016YFB1000701) and NSFC (61325008).

Author information

Correspondence to Xiangdong Huang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Huang, X., Wang, J., Qiao, J., Zheng, L., Zhang, J., Wong, R.K. (2017). Performance and Replica Consistency Simulation for Quorum-Based NoSQL System Cassandra. In: van der Aalst, W., Best, E. (eds) Application and Theory of Petri Nets and Concurrency. PETRI NETS 2017. Lecture Notes in Computer Science, vol 10258. Springer, Cham. https://doi.org/10.1007/978-3-319-57861-3_6

  • DOI: https://doi.org/10.1007/978-3-319-57861-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57860-6

  • Online ISBN: 978-3-319-57861-3

  • eBook Packages: Computer Science, Computer Science (R0)
