DOI: 10.1145/2903150.2903162
research-article

Towards co-designed optimizations in parallel frameworks: a MapReduce case study

Published: 16 May 2016

Abstract

The explosion of Big Data was followed by the proliferation of numerous complex parallel software stacks whose aim is to tackle the challenges of the data deluge. A drawback of such a multi-layered hierarchical deployment is the inability to maintain and delegate vital semantic information between layers in the stack. Software abstractions increase the semantic distance between an application and its generated code. However, parallel software frameworks contain inherent semantic information that general-purpose compilers are not designed to exploit.
This paper presents a case study demonstrating how the specific semantic information of the MapReduce paradigm can be exploited on multicore architectures. MR4J has been implemented in Java and evaluated against hand-optimized C and C++ equivalents. The initial observed results led to the design of a semantically aware optimizer that runs automatically without requiring modification to application code.
The optimizer speeds up the execution of MR4J by up to 2.0x. The introduced optimization not only improves the performance of the code generated for the map phase but also reduces the pressure on the garbage collector. This demonstrates how semantic information can be harnessed without sacrificing sound software engineering practices when using parallel software frameworks.
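
To make the garbage-collection point concrete, the following is a minimal sketch, not MR4J's API and not the paper's optimizer. It assumes a word-count job and contrasts a map phase that buffers every emitted value per key with one that folds each value into a per-key accumulator as soon as it is emitted. The class and method names (MapReduceSketch, wordCountBuffered, wordCountCombined) are hypothetical, and the code is sequential for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names are not taken from MR4J.
public class MapReduceSketch {

    // Buffered form: the map phase stores every emitted value in a per-key
    // list, and a separate reduce phase sums each list afterwards.
    static Map<String, Integer> wordCountBuffered(List<String> lines) {
        Map<String, List<Integer>> intermediate = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {
                if (word.isEmpty()) continue;
                // One list entry per emitted (word, 1) pair.
                intermediate.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : intermediate.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) sum += v;
            result.put(e.getKey(), sum);
        }
        return result;
    }

    // Combining form: each emitted value is folded into the key's running
    // total immediately, so no per-key list of intermediate values is kept.
    static Map<String, Integer> wordCountCombined(List<String> lines) {
        Map<String, Integer> result = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {
                if (word.isEmpty()) continue;
                result.merge(word, 1, Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b a", "b c");
        // Both forms compute the same counts; they differ in how much
        // intermediate state the map phase retains.
        System.out.println(wordCountBuffered(lines).equals(wordCountCombined(lines)));
    }
}
```

Both methods return the same result; the second simply retains less intermediate state, which is one plausible way a framework that knows the reduce function is an associative fold could lower allocation during the map phase. Whether the paper's optimizer performs this particular transformation is not stated in the abstract.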


Published In

CF '16: Proceedings of the ACM International Conference on Computing Frontiers
May 2016
487 pages
ISBN:9781450341288
DOI:10.1145/2903150
General Chairs: Gianluca Palermo, John Feo
Program Chairs: Antonino Tumeo, Hubertus Franke
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CF'16: Computing Frontiers Conference
May 16 - 19, 2016
Como, Italy

Acceptance Rates

CF '16 Paper Acceptance Rate 30 of 94 submissions, 32%;
Overall Acceptance Rate 273 of 785 submissions, 35%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 81
  • Downloads (Last 12 months): 0
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 22 Feb 2025
