research-article

Optimus: a dynamic rewriting framework for data-parallel execution plans

Authors:

Yuan YuAuthors Info & Claims

EuroSys '13: Proceedings of the 8th ACM European Conference on Computer Systems

Pages 15 - 28

https://doi.org/10.1145/2465351.2465354

Published: 15 April 2013 Publication History

Abstract

In distributed data-parallel computing, a user program is compiled into an execution plan graph (EPG), typically a directed acyclic graph. This EPG is the core data structure used by modern distributed execution engines for task distribution, job management, and fault tolerance. Once submitted for execution, the EPG remains largely unchanged at runtime except for some limited modifications. This makes it difficult to employ dynamic optimization techniques that could substantially improve the distributed execution based on runtime information.

This paper presents Optimus, a framework for dynamically rewriting an EPG at runtime. Optimus extends dynamic rewrite mechanisms present in systems such as Dryad and CIEL by integrating rewrite policy with a high-level data-parallel language, in this case DryadLINQ. This integration enables optimizations that require knowledge of the semantics of the computation, such as language customizations for domain-specific computations including matrix algebra. We describe the design and implementation of Optimus, outline its interfaces, and detail a number of rewriting techniques that address problems arising in distributed execution including data skew, dynamic data re-partitioning, handling unbounded iterative computations, and protecting important intermediate data for fault tolerance. We evaluate Optimus with real applications and data and show significant performance gains compared to manual optimization or customized systems. We demonstrate the versatility of dynamic EPG rewriting for data-parallel computing, and argue that it is an essential feature of any general-purpose distributed dataflow execution engine.

References

[1]

P. K. Agarwal, G. Cormode, Z. Huang, J. M. Phillips, Z. Wei, and K. Yi. Mergeable summaries. In PODS, 2012.

Digital Library

[2]

S. Agarwal, S. Kandula, N. Bruno, M.-C. Wu, I. Stoica, and J. Zhou. Re-optimizing data parallel computing. In NSDI '12, April 2012.

Digital Library

[3]

G. Ananthanarayanan, S. Kandula, A. Greenberg, I. Stoica, Y. Lu, B. Saha, and E. Harris. Reining in the outliers in mapreduce clusters using mantri. In OSDI'10.

Digital Library

[4]

R. Avnur and J. M. Hellerstein. Eddies: Continuously adaptive query processing. In ACM SIGMOD, 2000.

Digital Library

[5]

S. Babu, P. Bizarro, and D. DeWitt. Proactive re-optimization with Rio. In ACM SIGMOD, 2005.

Digital Library

[6]

J. Bennett and S. Lanning. The Netflix prize. In ACM SIGKDD 2007.

[7]

K. Beyer, P. J. Haas, B. Reinwald, Y. Sismanis, and R. Gemulla. On synopses for distinct-value estimation under multiset operations. In SIGMOD 2007.

Digital Library

[8]

Y. Bu, B. Howe, M. Balazinska, and M. Ernst. Haloop: Efficient iterative data processing on large clusters. In VLDB, 2010.

Digital Library

[9]

C. Chambers, A. Raniwala, F. Perry, S. Adams, R. Henry, R. Bradshaw, and N. Weizenbaum. FlumeJava: easy, efficient data-parallel pipelines. In PLDI'10.

Digital Library

[10]

S. Chaudhuri, R. Motwani, and V. Narasayya. On random sampling over joins. In SIGMOD 1999.

Digital Library

[11]

S. Chaudhuri, R. Motwani, and V. R. Narasayya. Random sampling for histogram construction: How much is enough? In SIGMOD 1998.

Digital Library

[12]

G. Cormode and M. Hadjieleftheriou. Finding frequent items in data streams. In VLDB 2008.

Digital Library

[13]

J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107--113, 2008.

Digital Library

[14]

A. Deshpande, Z. Ives, and V. Raman. Adaptive query processing. Foundations and Trends in Databases, 1(1):1--140, 2007.

Digital Library

[15]

D. J. DeWitt, J. F. Naughton, D. A. Schneider, and S. Seshadri. Practical skew handling in parallel joins. In VLDB 1992.

Digital Library

[16]

E. Gonina, A. Kannan, J. Shafer, and M. Budiu. Parallelizing large-scale data processing applications with data skew: a case study in product-offer matching. In Intl. Workshop on MapReduce and its Applications (MAPREDUCE), 2011.

Digital Library

[17]

G. Graefe. Volcano - an extensible and parallel query evaluation system. IEEE Trans. Knowl. Data Eng., 6(1):120--135, 1994.

Digital Library

[18]

P. Gunda, L. Ravindranath, C. A. Thekkath, Y. Yu, and L. Zhuang. Nectar: Automatic Management of Data and Computation in Datacenters. In OSDI'10.

Digital Library

[19]

The Hadoop project. http://hadoop.apache.org/.

[20]

Hadoop NextGen MapReduce (YARN). http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html.

[21]

B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In NSDI '11, March 2011.

Digital Library

[22]

M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. In EuroSys 2007.

Digital Library

[23]

N. Kabra and D. J. DeWitt. Efficient mid-query re-optimization of sub-optimal query execution plans. In ACM SIGMOD, 1998.

Digital Library

[24]

A. Kannan, I. E. Givoni, R. Agrawal, and A. Fuxman. Matching unstructured product offers to structured product specifications. In ACM SIGKDD 2011.

Digital Library

[25]

Q. Ke, V. Prabhakaran, Y. Xie, Y. Yu, J. Wu, and J. Yang. Optimizing data partitioning for data-parallel computing. In HotOS '11, 2011.

Digital Library

[26]

S. Y. Ko, I. Hoque, B. Cho, and I. Gupta. On availability of intermediate data in cloud computations. In HotOS '09, 2009.

Digital Library

[27]

Y. Kwon, M. Balazinska, B. Howe, and J. Rolia. SkewTune: Mitigating skew in MapReduce applications. In ACM SIGMOD, 2012.

Digital Library

[28]

Mahout project. http://mahout.apache.org/.

[29]

V. Markl, V. Raman, D. E. Simmen, G. M. Lohman, and H. Pirahesh. Robust query processing through progressive optimization. In ACM SIGMOD, pages 659--670, 2004.

Digital Library

[30]

P. Mishra and M. H. Eich. Join processing in relational databases. ACM Comput. Surv., 24(1):63--113, 1992.

Digital Library

[31]

D. G. Murray, M. Schwarzkopf, C. Smowton, S. Smith, A. Madhavapeddy, and S. Hand. Ciel: a universal execution engine for distributed data-flow computing. In NSDI 2011.

Digital Library

[32]

A. Okcan and M. Riedewald. Processing theta-joins using mapreduce. In ACM SIGMOD, pages 949--960, 2011.

Digital Library

[33]

C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: a not-so-foreign language for data processing. In SIGMOD 2008.

Digital Library

[34]

A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt, S. Madden, and M. Stonebraker. A comparison of approaches to large-scale data analysis. In ACM SIGMOD, 2009.

Digital Library

[35]

R. Power and J. Li. Piccolo: Building fast, distributed programs with partitioned tables. In OSDI, 2010.

Digital Library

[36]

Z. Qian, X. Chen, N. Kang, M. Chen, Y. Yu, T. Moscibroda, and Z. Zhang. MadLINQ: Large-scale distributed matrix computation for the cloud. In EuroSys 2012.

Digital Library

[37]

A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy. Hive: a warehousing solution over a Map-Reduce framework. Proc. VLDB Endow., 2(2), 2009.

Digital Library

[38]

Y. Xie, F. Yu, and M. Abadi. De-anonymizing the internet using unreliable IDs. In SIGCOMM 2009, August 2009.

Digital Library

[39]

Y. Yu, P. K. Gunda, and M. Isard. Distributed aggregation for data-parallel computing: interfaces and implementations. In SOSP, 2009.

Digital Library

[40]

Y. Yu, M. Isard, D. Fetterly, M. Budiu, Ú. Erlingsson, P. K. Gunda, and J. Currey. DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. In OSDI, 2008.

Digital Library

[41]

M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI 2012.

Digital Library

[42]

J. Zhang, R. C. Hucheng Zhou, X. Fan, Z. Guo, H. Lin, J. Y. Li, W. Lin, J. Zhou, and L. Zhou. Optimizing data shuffling in data-parallel computation by understanding user-defined functions. In NSDI, 2012.

Digital Library

Cited By

Jung KMitra GGopalakrishnan SPattabiraman K(2024)ImmunoPlane: Middleware for Providing Adaptivity to Distributed Internet-of-Things Applications2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI)10.1109/IoTDI61053.2024.00006(13-24)Online publication date: 13-May-2024
https://doi.org/10.1109/IoTDI61053.2024.00006
Benkrid SBellatreche LMestoui YOrdonez C(2022)PROADAPT: Proactive framework for adaptive partitioning for big data warehousesData & Knowledge Engineering10.1016/j.datak.2022.102102142(102102)Online publication date: Nov-2022
https://doi.org/10.1016/j.datak.2022.102102
Murray DŠimša JKlimovic AIndyk I(2021)tf.dataProceedings of the VLDB Endowment10.14778/3476311.347637414:12(2945-2958)Online publication date: 1-Jul-2021
https://dl.acm.org/doi/10.14778/3476311.3476374
Show More Cited By

Index Terms

Optimus: a dynamic rewriting framework for data-parallel execution plans

Recommendations

Optimus: an efficient dynamic resource scheduler for deep learning clusters
EuroSys '18: Proceedings of the Thirteenth EuroSys Conference

Deep learning workloads are common in today's production clusters due to the proliferation of deep learning driven AI services (e.g., speech recognition, machine translation). A deep learning training job is resource-intensive and time-consuming. ...
Optimus: efficient realization of streaming applications on FPGAs
CASES '08: Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems

In this paper, we introduce Optimus: an optimizing synthesis compiler for streaming applications. Optimus compiles programs written in a high level streaming language to either software or hardware implementations. The compiler uses a hierarchical ...
Optimus: A Parallel Optimization Framework with Topology Aware PSO and Applications
SCC '12: Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis

This research presents a parallel metaheuristic optimization framework, Optimus (Optimization Methods for Universal Simulators) for integration of a desired population-based search method with a target scientific application. Optimus includes a parallel ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

EuroSys '13: Proceedings of the 8th ACM European Conference on Computer Systems

April 2013

401 pages

ISBN:9781450319942

DOI:10.1145/2465351

General Chairs:
Zdenek Hanzálek
Czech Technical University Prague
,
Hermann Härtig
Technische Universität Dresden
,
Program Chairs:
Miguel Castro
Microsoft Research Cambridge
,
M. Frans Kaashoek
MIT

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 April 2013

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Qualifiers

Research-article

Conference

EuroSys '13

Sponsor:

SIGOPS

EuroSys '13: Eighth Eurosys Conference 2013

April 15 - 17, 2013

Prague, Czech Republic

Acceptance Rates

EuroSys '13 Paper Acceptance Rate 28 of 143 submissions, 20%;

Overall Acceptance Rate 241 of 1,308 submissions, 18%

Upcoming Conference

EuroSys '25

Sponsor:
sigops

Twentieth European Conference on Computer Systems

March 30 - April 3, 2025

Rotterdam , Netherlands

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

25
Total Citations
View Citations
372
Total Downloads

Downloads (Last 12 months)15
Downloads (Last 6 weeks)5

Reflects downloads up to 04 Oct 2024

Other Metrics

View Author Metrics

Citations

Cited By

Jung KMitra GGopalakrishnan SPattabiraman K(2024)ImmunoPlane: Middleware for Providing Adaptivity to Distributed Internet-of-Things Applications2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI)10.1109/IoTDI61053.2024.00006(13-24)Online publication date: 13-May-2024
https://doi.org/10.1109/IoTDI61053.2024.00006
Benkrid SBellatreche LMestoui YOrdonez C(2022)PROADAPT: Proactive framework for adaptive partitioning for big data warehousesData & Knowledge Engineering10.1016/j.datak.2022.102102142(102102)Online publication date: Nov-2022
https://doi.org/10.1016/j.datak.2022.102102
Murray DŠimša JKlimovic AIndyk I(2021)tf.dataProceedings of the VLDB Endowment10.14778/3476311.347637414:12(2945-2958)Online publication date: 1-Jul-2021
https://dl.acm.org/doi/10.14778/3476311.3476374
Gévay GSoto JMarkl V(2021)Handling Iterations in Distributed Dataflow SystemsACM Computing Surveys10.1145/347760254:9(1-38)Online publication date: 8-Oct-2021
https://dl.acm.org/doi/10.1145/3477602
Song WYang YEo JSeo JKim JLee SLee GUm TCho HChun B(2021)Apache Nemo: A Framework for Optimizing Distributed Data ProcessingACM Transactions on Computer Systems10.1145/346814438:3-4(1-31)Online publication date: 15-Oct-2021
https://dl.acm.org/doi/10.1145/3468144
Li YInterlandi MPsallidas FWang WZaniolo C(2021)SEIZE: Runtime Inspection for Parallel Dataflow SystemsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2020.303517032:4(842-854)Online publication date: 1-Apr-2021
https://doi.org/10.1109/TPDS.2020.3035170
Meddad MMoujahdi CMikram MRziza M(2020)Optimization of Multi-user Face Identification Systems in Big Data EnvironmentsAdvances in Science, Technology and Engineering Systems Journal10.25046/aj0506915:6(762-767)Online publication date: Nov-2020
https://doi.org/10.25046/aj050691
Li YInterlandi MPsallidas FWang WZaniolo C(2020)SEIZE User Desired Moments: Runtime Inspection for Parallel Dataflow Systems2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS)10.1109/ICDCS47774.2020.00147(1199-1200)Online publication date: Nov-2020
https://doi.org/10.1109/ICDCS47774.2020.00147
Yang YEo JKim GKim JLee SSeo JSong WChun BDan TDahlia M(2019)Apache nemoProceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference10.5555/3358807.3358824(177-190)Online publication date: 10-Jul-2019
https://dl.acm.org/doi/10.5555/3358807.3358824
Li RGuo PHu BHu W(2019)Libra and the Art of Task Sizing in Big-Data Analytic SystemsProceedings of the ACM Symposium on Cloud Computing10.1145/3357223.3362720(364-376)Online publication date: 20-Nov-2019
https://dl.acm.org/doi/10.1145/3357223.3362720
Show More Cited By

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents