How to Optimally Allocate Resources for Coded Distributed Computing?

Yu, Qian; Li, Songze; Maddah-Ali, Mohammad Ali; Avestimehr, A. Salman

Abstract:Today's data centers have an abundance of computing resources, hosting server clusters consisting of as many as tens or hundreds of thousands of machines. To execute a complex computing task over a data center, it is natural to distribute computations across many nodes to take advantage of parallel processing. However, as we allocate more and more computing resources to a computation task and further distribute the computations, large amounts of (partially) computed data must be moved between consecutive stages of computation tasks among the nodes, hence the communication load can become the bottleneck. In this paper, we study the optimal allocation of computing resources in distributed computing, in order to minimize the total execution time in distributed computing accounting for both the duration of computation and communication phases. In particular, we consider a general MapReduce-type distributed computing framework, in which the computation is decomposed into three stages: \emph{Map}, \emph{Shuffle}, and \emph{Reduce}. We focus on a recently proposed \emph{Coded Distributed Computing} approach for MapReduce and study the optimal allocation of computing resources in this framework. For all values of problem parameters, we characterize the optimal number of servers that should be used for distributed processing, provide the optimal placements of the Map and Reduce tasks, and propose an optimal coded data shuffling scheme, in order to minimize the total execution time. To prove the optimality of the proposed scheme, we first derive a matching information-theoretic converse on the execution time, then we prove that among all possible resource allocation schemes that achieve the minimum execution time, our proposed scheme uses the exactly minimum possible number of servers.

Subjects:	Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:1702.07297 [cs.IT]
	(or arXiv:1702.07297v1 [cs.IT] for this version)
	https://doi.org/10.48550/arXiv.1702.07297

Computer Science > Information Theory

Title:How to Optimally Allocate Resources for Coded Distributed Computing?

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators