Abstract
Distributed NoSQL systems such as Cassandra are now widely used. However, configuring these systems to achieve maximum performance in a given environment is complicated and tedious. This paper applies a Coloured Petri Net (CPN)-based simulation method to a quorum-based system, Cassandra. By analyzing the read and write processes of Cassandra, we propose a CPN model that can be used for performance analysis, optimization, and replica consistency detection. To help users understand NoSQL systems, we developed a CPN-based simulator called QuoVis. Using QuoVis, users can visualize the read and write processes of Cassandra, try different hardware parameters in performance simulations, optimize system parameters such as timeouts and the data partitioning strategy, and check replica consistency. Experiments show that our model fits a real Cassandra cluster well.
Notes
- 1.
- 2. How to model the two stages is not shown in this paper, but the source code of this model and of all the other CPN models in this paper is available on GitHub: https://github.com/jixuan1989/color-petri-net-cassandra.
- 3.
- 4. A demo video is available at https://www.youtube.com/watch?v=nvIBDUubFvM.
- 5. We found that performance is very poor when simulating large-scale clusters, so we built a faster execution engine. The engine is not the focus of this paper, so we omit it.
Acknowledgement
This paper is supported by the National Key R&D Program of China (2016YFB1000701) and NSFC (61325008).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Huang, X., Wang, J., Qiao, J., Zheng, L., Zhang, J., Wong, R.K. (2017). Performance and Replica Consistency Simulation for Quorum-Based NoSQL System Cassandra. In: van der Aalst, W., Best, E. (eds) Application and Theory of Petri Nets and Concurrency. PETRI NETS 2017. Lecture Notes in Computer Science, vol 10258. Springer, Cham. https://doi.org/10.1007/978-3-319-57861-3_6
DOI: https://doi.org/10.1007/978-3-319-57861-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57860-6
Online ISBN: 978-3-319-57861-3
eBook Packages: Computer Science, Computer Science (R0)