research-article
Open access

Polyhedral parallelization of binary code

Published: 26 January 2012

Abstract

Many automatic software parallelization systems have been proposed in the past decades, but most of them are dedicated to source-to-source transformations. This paper shows that parallelizing executable programs is feasible, even if they require complex transformations, and in effect decouples parallelization from compilation, for example, for closed-source or legacy software, where binary code is the only available representation.
We propose an automatic parallelizer, which is able to perform advanced parallelization on binary code. It first parses the binary code and extracts high-level information. From this information, a C program is generated. This program captures only a subset of the program semantics, namely, loops and memory accesses. This C program is then parallelized using existing, state-of-the-art parallelizers, including advanced polyhedral parallelizers. The original program semantics is then re-injected, and the transformed parallel loop nests are recompiled by a standard C compiler.
We show on the PolyBench benchmark suite that our system successfully detects and parallelizes almost all the loop nests from the binary code, using a recent polyhedral loop parallelizer as a backend. The paper ends by elaborating a strategy to parallelize more complex programs, such as those containing non-linear accesses to memory, and provides a few example case-studies.
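
To make the extraction step in the abstract concrete, here is a minimal sketch in C of the kind of loop-and-memory-access skeleton that could be recovered from a binary, followed by the same loop nest after parallelization. It is an illustration under stated assumptions, not the authors' tool output: the names `kernel_sequential`, `kernel_parallel`, `kernels_agree`, the offsets `BASE_A`, `BASE_B`, `BASE_C`, and the trip count `N` are all hypothetical. The abstract says the generated program captures only loops and memory accesses; the arithmetic is kept here only so that the sketch can be run and checked.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 32 };  /* hypothetical trip count recovered from the binary */

/* Hypothetical offsets of three N x N arrays inside one flat memory block,
   standing in for the base addresses seen in the binary. */
enum { BASE_A = 0, BASE_B = N * N, BASE_C = 2 * N * N, MEM_SIZE = 3 * N * N };

/* Sequential skeleton: each access is a base plus an affine function of the
   loop counters (i, j, k), which is what a polyhedral tool can analyze. */
void kernel_sequential(double *m)
{
    assert(m != NULL);
    for (long i = 0; i < N; i++)
        for (long j = 0; j < N; j++) {
            double acc = 0.0;
            for (long k = 0; k < N; k++)
                acc += m[BASE_A + N * i + k] * m[BASE_B + N * k + j];
            m[BASE_C + N * i + j] = acc;
        }
}

/* Same nest with the outer loop marked parallel: iterations of i write
   disjoint rows of the output, so no dependence is carried by i.
   Without OpenMP support the pragma is ignored and the loop runs serially. */
void kernel_parallel(double *m)
{
    assert(m != NULL);
    #pragma omp parallel for
    for (long i = 0; i < N; i++)
        for (long j = 0; j < N; j++) {
            double acc = 0.0;
            for (long k = 0; k < N; k++)
                acc += m[BASE_A + N * i + k] * m[BASE_B + N * k + j];
            m[BASE_C + N * i + j] = acc;
        }
}

/* Runs both versions on identical inputs; returns 1 if the outputs match. */
int kernels_agree(void)
{
    static double m1[MEM_SIZE], m2[MEM_SIZE];
    for (long x = 0; x < 2 * N * N; x++)
        m1[x] = m2[x] = (double)(x % 7) - 3.0;
    kernel_sequential(m1);
    kernel_parallel(m2);
    for (long x = BASE_C; x < MEM_SIZE; x++)
        if (m1[x] != m2[x])
            return 0;
    return 1;
}
```

The parallel version can be built with an OpenMP-capable C compiler (for example with `-fopenmp` on GCC or Clang). The accumulation order over `k` is unchanged, so both versions produce identical results.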




    Information

    Published In

    ACM Transactions on Architecture and Code Optimization  Volume 8, Issue 4
    Special Issue on High-Performance Embedded Architectures and Compilers
    January 2012
    765 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/2086696
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 26 January 2012
    Accepted: 01 December 2011
    Revised: 01 October 2011
    Received: 01 July 2011
    Published in TACO Volume 8, Issue 4


    Qualifiers

    • Research-article
    • Research
    • Refereed


    Cited By

    • (2021) Enhancing Dynamic Binary Translation in Mobile Computing by Leveraging Polyhedral Optimization. Wireless Communications & Mobile Computing 2021. DOI: 10.1155/2021/6611867. Online publication date: 1-Jan-2021
    • (2020) IR-Level Dynamic Data Dependence Using Abstract Interpretation Towards Speculative Parallelization. IEEE Access 8, 99910-99921. DOI: 10.1109/ACCESS.2020.2997715. Online publication date: 2020
    • (2020) Redundancy Analysis and Elimination on Access Patterns of the Windows Applications based on I/O Log Data. IEEE Access, 1-1. DOI: 10.1109/ACCESS.2020.2964260. Online publication date: 2020
    • (2019) Runtime On-Stack Parallelization of Dependence-Free For-Loops in Binary Programs. IEEE Letters of the Computer Society 2:1, 1-4. DOI: 10.1109/LOCS.2019.2896559. Online publication date: 1-Mar-2019
    • (2018) Automatically Migrating Sequential Applications to Heterogeneous System Architecture. 2018 International Conference on High Performance Computing & Simulation (HPCS), 114-121. DOI: 10.1109/HPCS.2018.00033. Online publication date: Jul-2018
    • (2017) ExanaDBT. Proceedings of the Computing Frontiers Conference, 191-200. DOI: 10.1145/3075564.3077627. Online publication date: 15-May-2017
    • (2017) Runtime Vectorization Transformations of Binary Code. International Journal of Parallel Programming 45:6, 1536-1565. DOI: 10.1007/s10766-016-0480-z. Online publication date: 1-Dec-2017
    • (2016) Scalable Hierarchical Polyhedral Compilation. 2016 45th International Conference on Parallel Processing (ICPP), 432-441. DOI: 10.1109/ICPP.2016.56. Online publication date: Aug-2016
    • (2015) Affine Parallelization Using Dependence and Cache Analysis in a Binary Rewriter. IEEE Transactions on Parallel and Distributed Systems 26:8, 2154-2163. DOI: 10.1109/TPDS.2014.2349501. Online publication date: 1-Aug-2015
    • (2014) Recovering memory access patterns of executable programs. Science of Computer Programming 80:PB, 440-456. DOI: 10.5555/2748144.2748398. Online publication date: 1-Feb-2014
