Abstract
Originally, MapReduce implementations such as Hadoop employed First In First Out (FIFO) scheduling, but such simple schemes cause job starvation. The Hadoop Fair Scheduler (HFS) is a slot-based MapReduce scheme designed to ensure a degree of fairness among jobs by guaranteeing each job at least some minimum number of allocated slots. Our main contribution in this paper is a different, flexible scheduling allocation scheme, known as FLEX. Our goal is to optimize any of a variety of standard scheduling theory metrics (response time, stretch, makespan and Service Level Agreements (SLAs), among others) while ensuring the same minimum job slot guarantees as in HFS, as well as maximum job slot guarantees. The FLEX allocation scheduler can be regarded as an add-on module that works synergistically with HFS. We describe the mathematical basis for FLEX, and compare it with FIFO and HFS in a variety of experiments.
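To make the idea of optimizing a metric under per-job minimum and maximum slot guarantees concrete, the following is a minimal sketch and not the FLEX algorithm itself. It assumes a simplified model that the abstract does not state: each job has a known amount of remaining work, speeds up linearly with its slot count, and the objective is the sum of `work / slots` over all jobs. The function name `allocate_slots` and the job tuples are hypothetical. Every job first receives its minimum, and each spare slot then goes to the job whose cost drops the most.

```python
import heapq


def allocate_slots(jobs, total_slots):
    """Greedy slot allocation under per-job minimum and maximum guarantees.

    jobs: dict mapping job name -> (min_slots, max_slots, work).
    Assumed model (illustrative only): a job's cost is work / slots.
    Returns a dict mapping job name -> allocated slots.
    """
    for name, (lo, hi, work) in jobs.items():
        if lo < 1 or hi < lo:
            raise ValueError(f"invalid slot bounds for job {name!r}")

    # Step 1: give every job its guaranteed minimum.
    alloc = {name: lo for name, (lo, hi, work) in jobs.items()}
    spare = total_slots - sum(alloc.values())
    if spare < 0:
        raise ValueError("minimum guarantees exceed cluster capacity")

    def gain(name):
        # Cost reduction obtained by giving this job one more slot.
        work = jobs[name][2]
        s = alloc[name]
        return work / s - work / (s + 1)

    # Step 2: hand out spare slots one at a time to the job with the
    # largest marginal gain, never exceeding a job's maximum.
    heap = [(-gain(n), n) for n in jobs if alloc[n] < jobs[n][1]]
    heapq.heapify(heap)
    while spare > 0 and heap:
        _, name = heapq.heappop(heap)
        alloc[name] += 1
        spare -= 1
        if alloc[name] < jobs[name][1]:
            heapq.heappush(heap, (-gain(name), name))
    return alloc


if __name__ == "__main__":
    example = {"a": (1, 4, 12.0), "b": (1, 2, 4.0), "c": (2, 8, 5.0)}
    print(allocate_slots(example, total_slots=8))
```

Because `work / slots` is convex in the slot count, this one-slot-at-a-time greedy rule is a standard way to solve such separable allocation problems; other metrics named in the abstract, such as stretch or makespan, would need a different cost function and possibly a different method.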
© 2010 IFIP International Federation for Information Processing
Cite this paper
Wolf, J. et al. (2010). FLEX: A Slot Allocation Scheduling Optimizer for MapReduce Workloads. In: Gupta, I., Mascolo, C. (eds) Middleware 2010. Middleware 2010. Lecture Notes in Computer Science, vol 6452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16955-7_1
DOI: https://doi.org/10.1007/978-3-642-16955-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16954-0
Online ISBN: 978-3-642-16955-7
eBook Packages: Computer Science (R0)