Abstract
Cloud computing has evolved into an efficient paradigm for processing big data applications. Performance evaluation of a cloud center is a necessary prerequisite for guaranteeing quality of service. However, effectively analyzing the performance of a cloud service is a challenging task because of the complexity of cloud resources and the diversity of big data applications. In this paper, we leverage queuing theory and probability and statistics to propose a performance evaluation model for a cloud center under big data application arrivals. In this model, tasks (i.e., big data applications) arrive according to a Poisson process, each task is divided into many parallel subtasks, and the number of subtasks follows a general distribution. The model makes it possible to calculate important performance indicators such as the mean number of subtasks in the system, the probability that a task obtains immediate service, the task waiting time, and the blocking probability. The model can also be used to predict the time cost of running an application. Finally, we use simulations and benchmarks running the WordCount and TeraSort applications on a Hadoop platform to demonstrate the utility of the model.
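As a rough illustration of the kind of system the abstract describes, the following is a minimal simulation sketch, not the authors' analytical model. It assumes details the abstract does not state: exponentially distributed subtask service times, `m` identical servers, a finite buffer of `r` waiting slots, and rejection of a whole task when its subtasks do not all fit. The function name `simulate_batch_queue`, its parameters, and the uniform batch-size sampler are hypothetical and chosen only for this example.

```python
import random


def simulate_batch_queue(lam, mu, m, r, batch_sampler, horizon, seed=0):
    """Simulate Poisson task arrivals where each task brings a batch of subtasks.

    Assumptions made for this sketch (not taken from the paper):
    - each subtask needs an exponential service time with rate mu,
    - there are m servers and r waiting slots for subtasks,
    - a task is blocked as a whole if its subtasks do not all fit.
    """
    rng = random.Random(seed)
    t = 0.0
    n = 0          # subtasks currently in the system (in service or waiting)
    area = 0.0     # time integral of n, used for the time-average
    arrivals = blocked = immediate = 0

    while True:
        dep_rate = min(n, m) * mu        # only busy servers complete subtasks
        total = lam + dep_rate
        dt = rng.expovariate(total)      # time until the next event
        if t + dt >= horizon:
            area += n * (horizon - t)
            break
        area += n * dt
        t += dt

        if rng.random() < lam / total:   # next event is a task arrival
            arrivals += 1
            k = batch_sampler(rng)       # number of subtasks in this task
            if n + k > m + r:
                blocked += 1             # not enough room: reject the task
            else:
                if n + k <= m:
                    immediate += 1       # every subtask starts at once
                n += k
        else:                            # next event is a subtask completion
            n -= 1

    return {
        "mean_subtasks": area / horizon,
        "blocking_prob": blocked / arrivals if arrivals else 0.0,
        "immediate_prob": immediate / arrivals if arrivals else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical parameters: batch size uniform on 1..4, 8 servers, 16 slots.
    result = simulate_batch_queue(
        lam=2.0, mu=1.0, m=8, r=16,
        batch_sampler=lambda rng: rng.randint(1, 4),
        horizon=5000.0, seed=42,
    )
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Because service times are assumed exponential, the sketch only needs to track the number of subtasks in the system. Estimating task waiting time, or using a different service-time distribution, would require tracking individual subtasks.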
Acknowledgements
This work is supported by the Innovation Action Plan of the Science and Technology Commission of Shanghai Municipality (15DZ1100305).
Cite this article
Shen, C., Tong, W., Hwang, JN. et al. Performance modeling of big data applications in the cloud centers. J Supercomput 73, 2258–2283 (2017). https://doi.org/10.1007/s11227-017-2005-y
DOI: https://doi.org/10.1007/s11227-017-2005-y