DOI: 10.1145/3337821.3337864
Research Article (Public Access)

Cooperative Job Scheduling and Data Allocation for Busy Data-Intensive Parallel Computing Clusters

Published: 05 August 2019
  Abstract

    In data-intensive parallel computing clusters, it is important to provide deadline-guaranteed service to jobs while minimizing resource usage (e.g., network bandwidth and energy). Under the current computing framework, which first allocates data and then schedules jobs, these objectives are difficult to achieve simultaneously in a busy cluster with many jobs. We model the problem of achieving the objectives simultaneously as an integer program and propose a heuristic Cooperative job Scheduling and data Allocation method (CSA). CSA reverses the order of data allocation and job scheduling in the current computing framework, i.e., it changes data-first-job-second to job-first-data-second. This enables CSA to proactively consolidate tasks that request more data in common onto the same server during deadline-aware scheduling, and to consolidate tasks onto as few servers as possible to maximize energy savings. It also facilitates the subsequent data allocation step, which places each data block on the server that hosts most of the block's requester tasks, thereby maximizing data locality and reducing bandwidth consumption. CSA further includes a recursive schedule refinement process that adjusts the job and data allocation schedules to improve system performance on the three objectives and to balance data locality and energy savings according to specified weights. We implemented CSA and a number of previous job schedulers on Apache Hadoop on a real supercomputing cluster. Trace-driven experiments in simulation and on the real cluster show that CSA outperforms the other schedulers in providing deadline-guaranteed and resource-efficient service.
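    To make the job-first-data-second idea concrete, below is a minimal Python sketch of the two greedy steps the abstract describes: deadline-ordered tasks are first consolidated onto servers that already host tasks requesting overlapping data (and onto as few servers as possible), and each data block is then placed on the server hosting most of its requester tasks. All names, the slot-based capacity model, and the tie-breaking rules are illustrative assumptions; this is not the paper's actual CSA algorithm, which is driven by an integer-programming model and also includes a recursive schedule refinement step.

    ```python
    from collections import defaultdict
    from dataclasses import dataclass

    # Illustrative sketch of the job-first-data-second idea from the abstract.
    # All names, the slot-based capacity model, and the greedy tie-breaking
    # rules are assumptions for exposition, not the paper's CSA formulation.

    @dataclass
    class Task:
        job_id: str
        task_id: str
        blocks: set       # data blocks this task requests
        deadline: float   # completion deadline (earlier = more urgent)

    def schedule_tasks(tasks, servers, slots_per_server):
        """Step 1 (job scheduling first): place deadline-ordered tasks so that
        tasks sharing requested data are consolidated onto the same server,
        and as few servers as possible are activated (energy savings)."""
        placement = {}                     # task_id -> server
        server_blocks = defaultdict(set)   # blocks requested by tasks on a server
        server_load = defaultdict(int)     # number of tasks assigned to a server
        active = []                        # servers already hosting tasks

        for task in sorted(tasks, key=lambda t: t.deadline):
            # Prefer an already-active server with the largest data overlap.
            candidates = [s for s in active if server_load[s] < slots_per_server]
            if candidates:
                best = max(candidates,
                           key=lambda s: len(task.blocks & server_blocks[s]))
            else:
                # Activate the next server that still has free slots
                # (raises StopIteration if the cluster is full).
                best = next(s for s in servers if server_load[s] < slots_per_server)
                active.append(best)
            placement[task.task_id] = best
            server_blocks[best] |= task.blocks
            server_load[best] += 1
        return placement

    def allocate_data(tasks, placement):
        """Step 2 (data allocation second): put each block on the server that
        hosts the most of its requester tasks, maximizing data locality."""
        counts = defaultdict(lambda: defaultdict(int))   # block -> server -> count
        for task in tasks:
            for block in task.blocks:
                counts[block][placement[task.task_id]] += 1
        return {block: max(per_server, key=per_server.get)
                for block, per_server in counts.items()}

    if __name__ == "__main__":
        tasks = [
            Task("j1", "t1", {"b1", "b2"}, deadline=5),
            Task("j1", "t2", {"b2", "b3"}, deadline=5),
            Task("j2", "t3", {"b4"},       deadline=10),
        ]
        placement = schedule_tasks(tasks, ["s1", "s2", "s3"], slots_per_server=2)
        print(placement)                         # t1 and t2, which share b2, land on s1
        print(allocate_data(tasks, placement))   # b1, b2, b3 -> s1; b4 -> s2
    ```

    In this toy run, scheduling first co-locates the two tasks that share block b2, so the subsequent data allocation can place that block on the very server where both requesters run, which is the locality benefit the abstract attributes to reversing the two steps.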

    Cited By

    • Cooperative Job Scheduling and Data Allocation in Data-Intensive Parallel Computing Clusters. IEEE Transactions on Cloud Computing (2022), 1-14. https://doi.org/10.1109/TCC.2022.3206206

    Published In

    ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
    August 2019
    1107 pages
    ISBN:9781450362955
    DOI:10.1145/3337821
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    In-Cooperation

    • University of Tsukuba

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • CNS
    • NSF
    • OAC
    • CCF
    • Microsoft Research Faculty Fellowship
    • ACI

    Conference

    ICPP 2019

    Acceptance Rates

    Overall Acceptance Rate 91 of 313 submissions, 29%
