Techniques for Handling Error in User-estimated Execution Times During Resource Management on Systems Processing MapReduce Jobs

Authors:

Norman Lim,

Shikharesh Majumdar,

Peter Ashwood-Smith

CCGrid '17: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

Pages 788 - 793

https://doi.org/10.1109/CCGRID.2017.70

Published: 14 May 2017

Abstract

In our previous work, we described the Hadoop Constraint Programming based Resource Management technique (HCP-RM), a resource allocation and scheduling technique for processing an open stream of MapReduce jobs with SLAs, each characterized by an earliest start time, an execution time, and a deadline. Because HCP-RM uses the user-estimated job execution times to perform resource allocation and scheduling, errors or inaccuracies in those estimates can hinder its ability to make effective scheduling decisions and degrade system performance. This paper focuses on improving the robustness of HCP-RM by introducing a mechanism to handle errors in the user estimates of job execution times that are submitted as part of each job's SLA. A Prescheduling Error Handling technique (PSEH) is devised to adjust the user-estimated execution times of jobs, making them more accurate before the resource management algorithm uses them. Results of experiments conducted on a Hadoop cluster deployed on Amazon EC2 demonstrate the effectiveness of the PSEH technique in improving system performance.
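
The abstract does not state how PSEH adjusts an estimate. As a minimal sketch only, assuming a simple history-based rule (scale each new estimate by the mean ratio of actual to estimated execution time observed for earlier jobs), a prescheduling adjustment step could look like the following. The names `Job`, `correction_factor`, and `preschedule_adjust`, and the scaling rule itself, are hypothetical illustrations and not the authors' algorithm.

```python
from dataclasses import dataclass, replace
from statistics import mean


@dataclass(frozen=True)
class Job:
    """SLA fields named in the abstract; the field names are assumptions."""
    job_id: str
    earliest_start: float       # seconds
    estimated_exec_time: float  # user-supplied estimate, seconds
    deadline: float             # seconds


def correction_factor(history, default=1.0):
    """Mean of actual/estimated over past (estimated, actual) pairs.

    Assumed rule for illustration: a factor below 1 means users tend to
    overestimate, above 1 means they tend to underestimate.
    """
    ratios = [actual / est for est, actual in history if est > 0]
    return mean(ratios) if ratios else default


def preschedule_adjust(job, history):
    """Return a copy of the job with its estimate scaled before scheduling."""
    factor = correction_factor(history)
    return replace(job, estimated_exec_time=job.estimated_exec_time * factor)


if __name__ == "__main__":
    # Hypothetical history: users overestimated by 25% in both past jobs.
    past = [(100.0, 80.0), (200.0, 160.0)]
    incoming = Job("job-1", earliest_start=0.0,
                   estimated_exec_time=50.0, deadline=120.0)
    print(preschedule_adjust(incoming, past))
```

The adjusted job, rather than the original, would then be handed to the resource manager. Only the execution-time estimate changes; the earliest start time and deadline are left as submitted.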


Cited By

X. Zeng, S. Garg, M. Barika, A. Zomaya, L. Wang, M. Villari, D. Chen, and R. Ranjan, "SLA Management for Big Data Analytical Applications in Clouds", ACM Computing Surveys, vol. 53, no. 3, 2020, pp. 1--40. Online publication date: 12-Jun-2020.
https://dl.acm.org/doi/10.1145/3383464


Published In

CCGrid '17: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

May 2017

1167 pages

ISBN:9781509066100

Publisher

IEEE Press


Qualifiers

Tutorial
Research
Refereed limited

Conference

CCGrid '17

Sponsor:

SIGARCH

CCGrid '17: 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

May 14 - 17, 2017

Madrid, Spain

