DOI: 10.5555/1855741.1855744
Article

Improving MapReduce performance in heterogeneous environments

Published: 08 December 2008

Abstract

MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-source implementation of MapReduce enjoying wide adoption and is often used for short jobs where low response time is critical. Hadoop's performance is closely tied to its task scheduler, which implicitly assumes that cluster nodes are homogeneous and tasks make progress linearly, and uses these assumptions to decide when to speculatively re-execute tasks that appear to be stragglers. In practice, the homogeneity assumptions do not always hold. An especially compelling setting where this occurs is a virtualized data center, such as Amazon's Elastic Compute Cloud (EC2). We show that Hadoop's scheduler can cause severe performance degradation in heterogeneous environments. We design a new scheduling algorithm, Longest Approximate Time to End (LATE), that is highly robust to heterogeneity. LATE can improve Hadoop response times by a factor of 2 in clusters of 200 virtual machines on EC2.
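
The abstract names the LATE heuristic without giving its estimator, so the following is only a minimal sketch of one way to read "Longest Approximate Time to End": estimate each running task's remaining time from its average progress rate so far, then pick the task with the largest estimate as the speculation candidate. The `RunningTask` type, its field names, and both function names are hypothetical and are not Hadoop APIs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RunningTask:
    task_id: str            # hypothetical identifier, for illustration only
    progress: float         # fraction of work completed, in [0, 1]
    elapsed_seconds: float  # time since the task started


def estimated_time_to_end(task: RunningTask) -> float:
    """Estimate remaining seconds, assuming the task keeps its average rate."""
    if task.progress >= 1.0:
        return 0.0
    if task.progress <= 0.0 or task.elapsed_seconds <= 0.0:
        # No progress observed yet, so no finite estimate is possible.
        return float("inf")
    rate = task.progress / task.elapsed_seconds
    return (1.0 - task.progress) / rate


def pick_speculation_candidate(
    tasks: Sequence[RunningTask],
) -> Optional[RunningTask]:
    """Return the unfinished task expected to finish last, or None."""
    unfinished = [t for t in tasks if t.progress < 1.0]
    if not unfinished:
        return None
    return max(unfinished, key=estimated_time_to_end)


tasks = [
    RunningTask("a", progress=0.9, elapsed_seconds=90.0),
    RunningTask("b", progress=0.5, elapsed_seconds=100.0),
    RunningTask("c", progress=0.2, elapsed_seconds=10.0),
]
candidate = pick_speculation_candidate(tasks)
```

In this made-up example, task "c" has made the least progress but is also the fastest, so a rule based on progress alone would single it out, while the time-to-end estimate points to task "b" instead. That difference is the kind of situation the abstract attributes to heterogeneous clusters, where progress is not a reliable proxy for finishing time.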

Cited By

  • (2023) QaaD (Query-as-a-Data): Scalable Execution of Massive Number of Small Queries in Spark. Proceedings of the ACM on Management of Data 1:2 (1-26). DOI: 10.1145/3589279. Online publication date: 20-Jun-2023
  • (2023) Extending and Programming the NVMe I/O Determinism Interface for Flash Arrays. ACM Transactions on Storage 19:1 (1-33). DOI: 10.1145/3568427. Online publication date: 11-Jan-2023
  • (2023) Safe and Practical GPU Computation in TrustZone. Proceedings of the Eighteenth European Conference on Computer Systems (505-520). DOI: 10.1145/3552326.3567483. Online publication date: 8-May-2023

Published In

OSDI'08: Proceedings of the 8th USENIX conference on Operating systems design and implementation
December 2008
384 pages

Sponsors

  • USENIX Association

Publisher

USENIX Association

United States

Qualifiers

  • Article

Cited By

  • (2022) Parallelism-Optimizing Data Placement for Faster Data-Parallel Computations. Proceedings of the VLDB Endowment 16:4 (760-771). DOI: 10.14778/3574245.3574260. Online publication date: 1-Dec-2022
  • (2022) Enabling emerging edge applications through a 5G control plane intervention. Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies (386-400). DOI: 10.1145/3555050.3569130. Online publication date: 30-Nov-2022
  • (2022) Using trio. Proceedings of the ACM SIGCOMM 2022 Conference (633-648). DOI: 10.1145/3544216.3544262. Online publication date: 22-Aug-2022
  • (2022) Multimedia streaming analytics. Proceedings of the 1st Mile-High Video Conference (62-69). DOI: 10.1145/3510450.3517321. Online publication date: 1-Mar-2022
  • (2022) Historical data based approach to mitigate stragglers from the Reduce phase of MapReduce in a heterogeneous Hadoop cluster. Cluster Computing 25:5 (3193-3211). DOI: 10.1007/s10586-021-03530-x. Online publication date: 1-Oct-2022
  • (2021) Fangorn. Proceedings of the VLDB Endowment 14:12 (2972-2985). DOI: 10.14778/3476311.3476376. Online publication date: 1-Jul-2021
  • (2021) Declarative data serving. Proceedings of the VLDB Endowment 14:11 (2555-2562). DOI: 10.14778/3476249.3476302. Online publication date: 27-Oct-2021
