Abstract
In the big data era, the speed of analytical processing depends on the storage and retrieval capabilities needed to handle large amounts of data. While distributed data-crunching applications can themselves yield useful information, analysts face difficult challenges: they need to predict how much data to process and where, so as to obtain an optimal data-crunching cost while also respecting deadlines and service level agreements within a limited budget. In today’s data centers, on-demand data processing and data transfer requests coming from distributed applications are usually expressed as aperiodic tasks. In this paper, we address the problem of scheduling aperiodic tasks with deadline constraints within inter-Cloud environments. In massively multithreaded computing systems that deal with data-intensive applications, Hadoop and BaTs tasks arrive periodically, which challenges traditional scheduling approaches previously proposed for supercomputing. Here, we consider the deadline as the main constraint, and propose a method to estimate the number of resources needed to schedule a set of aperiodic tasks, considering both execution and data transfer costs. Starting from classical scheduling techniques, and considering asynchronous task handling, we analyze the possibility of decoupling task arrival from task creation, scheduling and execution, sets of actions that can be put into a peer-to-peer relation over a network or over a client–server architecture in the Cloud.
Based on a mathematical model, and using different simulation scenarios, we prove the following statements: (1) multiple sources of independent aperiodic tasks can be considered similar to a single source; (2) with respect to the global deadline, task migration between different regional centers is the appropriate solution when the estimated number of resources exceeds a data center’s capacity; and (3) in a heterogeneous data center, a higher number of resources is needed for the same request in order to respect the deadline constraints. We believe these results will benefit researchers and practitioners alike who are interested in optimizing resource management in data centers in response to the novel challenges posed by next-generation big data applications.
References
Ba W, Dabo Z, Qi L, Wei W (2012) The partitioned scheduling of sporadic task systems on multiprocessors. J Supercomput 59(1):227–245
Brucker P, Knust S (2011) Complex scheduling. Springer, Berlin, Heidelberg
Buttazzo GC (2011) Hard real-time computing systems: predictable scheduling algorithms and applications, 3rd edn. Springer
Conway RW, Maxwell WL, Miller LW (2003) Theory of scheduling. Dover Publications
Cox DR, Smith WL (1961) Queues. Methuen & Co., Ltd., New York
Davis R, Wellings A (1995) Dual priority scheduling. In: Proceedings of the 16th IEEE Real-Time Systems Symposium, RTSS ’95, p 100. IEEE Computer Society
Dobre C, Pop F, Cristea V (2008) A simulation framework for dependable distributed systems. In: Proceedings of the 2008 International Conference on Parallel Processing—Workshops, ICPPW ’08, pp 181–187, Washington, DC, USA. IEEE Computer Society
Gotoh Y, Yoshihisa T, Taniguchi H, Kanazawa M, Rahayu W, Chen Y-PP (2012) A scheduling method for node relay-based webcast considering reconnection. In: Proceedings of the 2012 IEEE 26th International Conference on Advanced Information Networking and Applications, AINA ’12, pp 787–794, Washington, DC, USA. IEEE Computer Society
Huang Y, Bessis N, Norrington P, Kuonen P, Hirsbrunner B (2013) Exploring decentralized dynamic scheduling for grids and Clouds using the community-aware scheduling algorithm. Future Gener Comput Syst 29(1):402–415
Kato S, Yamasaki N, Ishikawa Y (2009) Semi-partitioned scheduling of sporadic task systems on multiprocessors. In: Proceedings of the 2009 21st Euromicro Conference on Real-Time Systems, ECRTS ’09, pp 249–258, Washington, DC, USA. IEEE Computer Society
Kc K, Anyanwu K (2010) Scheduling Hadoop jobs to meet deadlines. In: Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science, CLOUDCOM ’10, pp 388–392, Washington, DC, USA. IEEE Computer Society
Kebarighotbi A, Cassandras CG (2011) Optimal scheduling of parallel queues using stochastic flow models. Discret Event Dyn Syst 21(4):547–576
Kolodziej J, Xhafa F (2011) Enhancing the genetic-based scheduling in computational grids by a structured hierarchical population. Future Gener Comput Syst 27(8):1035–1046
Lehoczky JP, Thuel SR (1995) Scheduling periodic and aperiodic tasks using the slack stealing algorithm. In: Son SH (ed) Advances in real-time systems. Prentice-Hall Inc, Upper Saddle River, pp 175–198
Leung J, Kelly L, Anderson JH (2004) Handbook of scheduling: algorithms, models, and performance analysis. CRC Press Inc, FL
Liu JWS (2000) Real-time systems, 1st edn. Prentice Hall. http://www.amazon.com/Real-Time-Systems-Jane-W-Liu/dp/0130996513
Liu K, Jin H, Chen J, Liu X, Yuan D, Yang Y (2010) A compromised-time-cost scheduling algorithm in SwinDeW-C for instance-intensive cost-constrained workflows on a Cloud computing platform. Int J High Perform Comput Appl 24(4):445–456
Mao M, Humphrey M (2011) Auto-scaling to minimize cost and meet application deadlines in Cloud workflows. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’11, pp 49:1–49:12, New York, NY, USA. ACM
McKnight C, Stubens C, Coady Y, Li KF (2012) Multi-server MMO middleware: unlocked. In: Proceedings of the 2012 Seventh International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, 3PGCIC ’12, pp 218–225, Washington, DC, USA. IEEE Computer Society
Mocanu M, Craciun A (2012) Monitoring watershed parameters through software services. In: 2012 Third International Conference on Emerging Intelligent Data and Web Technologies (EIDWT), pp 287–292
Moschakis IA, Karatza HD (2012) Evaluation of gang scheduling performance and cost in a Cloud computing system. J Supercomput 59(2):975–992
Oprescu A-M, Kielmann T, Leahu H (2012) Stochastic tail-phase optimization for bag-of-tasks execution in clouds. In: Proceedings of the 2012 IEEE/ACM Fifth International Conference on Utility and Cloud Computing, UCC ’12, pp 204–208, Washington, DC, USA. IEEE Computer Society
Pop F, Dobre C, Cristea V, Bessis N (2013) Scheduling of sporadic tasks with deadline constrains in cloud environments. In: Proceedings of the 2013 IEEE 27th International Conference on Advanced Information Networking and Applications, AINA ’13, pp 764–771, Washington, DC, USA. IEEE Computer Society
Qureshi K, Majeed B, Kazmi JH, Madani SA (2012) Task partitioning, scheduling and load balancing strategy for mixed nature of tasks. J Supercomput 59(3):1348–1359
Serbanescu C (1998) Noncommutative Markov processes as stochastic equations’ solutions. Bull Math Soc Sci Math Roumanie Tome 41, 89(3):219–228
Serbanescu C (1998) Stochastic differential equations and unitary processes. Bull Math Soc Sci Math Roumanie Tome 41, 89(3):311–322
Sprunt B (1990) Aperiodic task scheduling for real-time systems. Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA, USA, AAI9107570
Strosnider JK, Lehoczky JP, Sha L (1995) The deferrable server algorithm for enhanced aperiodic responsiveness in hard real-time environments. IEEE Trans Comput 44(1):73–91
Wang L, Khan SU, Chen D, Kolodziej J, Ranjan R, Xu C-Z, Zomaya A (2013) Energy-aware parallel task scheduling in a cluster. Future Gener Comput Syst 29(7):1661–1670
Yu J, Buyya R (2006) Scheduling scientific workflow applications with deadline and budget constraints using genetic algorithms. Sci Program 14(3,4):217–230
Acknowledgments
The research presented in this paper is supported by the following projects: “SideSTEP: Scheduling Methods for Dynamic Distributed Systems: a self-* approach”, ID: PN-II-CT-RO-FR-2012-1-0084; CyberWater: grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number 47/2012; MobiWay: Mobility Beyond Individualism: an Integrated Platform for Intelligent Transportation Systems of Tomorrow, PN-II-PT-PCCA-2013-4-0321; clueFarm: Information system based on cloud services accessible through mobile devices, to increase product quality and business development farms, PN-II-PT-PCCA-2013-4-0870.
Additional information
The research presented in this paper is supported by the following projects: “SideSTEP - Scheduling Methods for Dynamic Distributed Systems: a self-* approach”, (PN-II-CT-RO-FR-2012-1-0084); CyberWater grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number 47/2012.
This paper is based on [23]: Florin Pop, Ciprian Dobre, Valentin Cristea, and Nik Bessis. Scheduling of sporadic tasks with deadline constrains in Cloud environments. In Proceedings of the 2013 IEEE 27th International Conference on Advanced Information Networking and Applications, AINA ’13, pages 764–771, 2013. IEEE Computer Society.
About this article
Cite this article
Pop, F., Dobre, C., Cristea, V. et al. Deadline scheduling for aperiodic tasks in inter-Cloud environments: a new approach to resource management. J Supercomput 71, 1754–1765 (2015). https://doi.org/10.1007/s11227-014-1285-8
DOI: https://doi.org/10.1007/s11227-014-1285-8