DOI: 10.1145/1952682.1952711

Hybrid binary rewriting for memory access instrumentation

Published: 09 March 2011

Abstract

Memory access instrumentation is fundamental to many applications such as software transactional memory systems, profiling tools and race detectors. We examine the problem of efficiently instrumenting memory accesses in x86 machine code to support software transactional memory and profiling. We aim to automatically instrument all shared memory accesses in critical sections of x86 binaries, while achieving overhead close to that obtained when performing manual instrumentation at the source code level.
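To make "instrumenting a memory access" concrete, here is a minimal sketch in Python of what an instrumented critical section does: each load and store is routed through a barrier that logs or buffers the access. The `Transaction` class and its method names are hypothetical and chosen for illustration; they are not the paper's STM interface, and the sketch ignores concurrency and conflict detection.

```python
class Transaction:
    """Toy transaction log (hypothetical): records reads, buffers writes."""

    def __init__(self, memory):
        self.memory = memory    # shared "memory": address -> value
        self.read_set = set()   # addresses read inside the critical section
        self.write_buf = {}     # address -> value not yet published

    def read(self, addr):
        # Instrumented load: log the address and honour earlier buffered stores.
        self.read_set.add(addr)
        if addr in self.write_buf:
            return self.write_buf[addr]
        return self.memory[addr]

    def write(self, addr, value):
        # Instrumented store: buffer the value instead of updating memory.
        self.write_buf[addr] = value

    def commit(self):
        # Publish buffered stores; a real STM would validate read_set first.
        self.memory.update(self.write_buf)


def transfer(tx, src, dst, amount):
    # Uninstrumented original: mem[src] -= amount; mem[dst] += amount
    tx.write(src, tx.read(src) - amount)
    tx.write(dst, tx.read(dst) + amount)


memory = {0x10: 100, 0x18: 5}
tx = Transaction(memory)
transfer(tx, 0x10, 0x18, 30)
before_commit = dict(memory)   # shared memory is still unchanged here
tx.commit()
```

In this sketch the rewrite of `transfer` is done by hand, which corresponds to the manual source-level instrumentation the abstract uses as its overhead baseline; a binary rewriter would apply the equivalent transformation to machine-code loads and stores.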
The two primary options in building such an instrumentation system are static and dynamic binary rewriting: the former instruments binaries at link time before execution, while the latter instruments binaries at runtime. Static binary rewriting offers extremely low overhead but is hampered by the limits of static analysis. Dynamic binary rewriting is able to use runtime information but typically incurs higher overhead. This paper proposes an alternative: hybrid binary rewriting. Hybrid binary rewriting is built around the idea of a persistent instrumentation cache (PIC) that is associated with a binary and contains instrumented code from it. It supports two execution modes when using instrumentation: active and passive. In the active execution mode, a dynamic binary rewriting engine (Pin) intercepts execution and generates instrumented code into the PIC, which is an on-disk file. This execution mode can take full advantage of runtime information. Later runs can use the passive execution mode, in which instrumented code is executed directly out of the PIC. This allows us to attain overheads similar to those incurred with static binary rewriting.
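The active/passive split can be illustrated with a toy model in Python. Everything below is an assumption made for illustration: the "binary" is a dictionary of named regions, `instrument` stands in for the rewriting engine, and the cache is a JSON file named after a hash of the binary. The abstract does not describe the real PIC file format or how it is keyed, and this sketch does not use Pin.

```python
import hashlib
import json
import os
import tempfile

# A toy "binary": region name -> list of (opcode, operand) pairs.
BINARY = {
    "critical_section": [("load", "x"), ("add", 1), ("store", "x")],
    "cold_path": [("load", "y"), ("store", "y")],
}


def cache_path(binary, directory):
    # Associate the cache with the binary's contents (hypothetical scheme).
    blob = json.dumps(binary, sort_keys=True).encode()
    return os.path.join(directory, hashlib.sha256(blob).hexdigest()[:16] + ".pic.json")


def load_pic(path):
    # Read the on-disk cache; an absent file means nothing is cached yet.
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return {name: [tuple(ins) for ins in code]
                for name, code in json.load(f).items()}


def instrument(region):
    # Stand-in for the rewriting engine: route loads/stores through barriers.
    barrier = {"load": "tx_load", "store": "tx_store"}
    return [(barrier.get(op, op), arg) for op, arg in region]


def run_active(binary, executed, directory):
    # Active mode: instrument regions that actually execute and persist them.
    path = cache_path(binary, directory)
    pic = load_pic(path)
    for name in executed:
        if name not in pic:
            pic[name] = instrument(binary[name])
    with open(path, "w") as f:
        json.dump(pic, f)


def run_passive(binary, executed, directory):
    # Passive mode: no rewriting engine; use cached code where it exists and
    # the original region otherwise (a simplification made by this sketch).
    pic = load_pic(cache_path(binary, directory))
    return [pic.get(name, binary[name]) for name in executed]


with tempfile.TemporaryDirectory() as d:
    run_active(BINARY, ["critical_section"], d)
    trace = run_passive(BINARY, ["critical_section", "cold_path"], d)
```

The point of the sketch is that the cost of `instrument` is paid only in the active run; the passive run only reads the file, which is why its overhead can approach that of static rewriting.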
This instrumentation methodology enables a variety of static and dynamic techniques to be applied. For example, in passive mode, execution occurs directly from the original executable save for regions that require instrumentation. This has allowed us to build a low-overhead transactional memory profiler. We also demonstrate how we can use the combination of static and dynamic techniques to eliminate instrumentation for accesses to locations that are thread-private.
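As a rough illustration of the thread-private idea, the Python sketch below takes a trace of accesses observed during a profiling run and reports the code sites whose addresses were only ever touched by a single thread. The trace format and the `private_sites` helper are hypothetical. The abstract does not specify the actual analysis, and a single observed run is evidence rather than proof of privacy, which is presumably why the paper combines it with static techniques.

```python
from collections import defaultdict


def private_sites(trace):
    """Return sites whose observed addresses were each used by one thread.

    trace: iterable of (thread_id, site, address) tuples (hypothetical format).
    """
    threads_by_addr = defaultdict(set)
    addrs_by_site = defaultdict(set)
    for tid, site, addr in trace:
        threads_by_addr[addr].add(tid)
        addrs_by_site[site].add(addr)
    # A site qualifies only if every address it touched had a single accessor.
    return {
        site
        for site, addrs in addrs_by_site.items()
        if all(len(threads_by_addr[a]) == 1 for a in addrs)
    }


observed = [
    (1, "push_local", 0x7000),      # each thread touches its own stack slot
    (2, "push_local", 0x8000),
    (1, "update_counter", 0x1000),  # both threads touch the same address
    (2, "update_counter", 0x1000),
]
candidates = private_sites(observed)
```

Here `push_local` is a candidate for dropping its barrier, while `update_counter` must stay instrumented because two threads reached the same address.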

    Published In

    VEE '11: Proceedings of the 7th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
    March 2011
    250 pages
    ISBN: 9781450306874
    DOI: 10.1145/1952682
    • ACM SIGPLAN Notices, Volume 46, Issue 7
      VEE '11
      July 2011
      231 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2007477

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. binary rewriting
    2. transactional memory

    Qualifiers

    • Research-article

    Conference

    VEE '11

    Acceptance Rates

    Overall Acceptance Rate 80 of 235 submissions, 34%

    Cited By

    • (2024) Towards Realistic Results for Instrumentation-Based Profilers for JIT-Compiled Systems. Proceedings of the 21st ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes, pages 82-89. DOI: 10.1145/3679007.3685058. Online publication date: 13-Sep-2024
    • (2023) Efficient Dynamic Binary Translation with Accumulative Persistent Code Caching. 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS), pages 458-466. DOI: 10.1109/ICPADS56603.2022.00066. Online publication date: Jan-2023
    • (2014) Computation Integrity Measurement Based on Branch Transfer. Proceedings of the 2014 IEEE 13th International Conference on Trust, Security and Privacy in Computing and Communications, pages 590-597. DOI: 10.1109/TrustCom.2014.75. Online publication date: 24-Sep-2014
    • (2012) A lightweight hybrid hardware/software approach for object-relative memory profiling. Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, pages 46-57. DOI: 10.1109/ISPASS.2012.6189205. Online publication date: 1-Apr-2012
    • (2023) Informed Memory Access Monitoring. Performance Analysis of Parallel Applications for HPC, pages 73-97. DOI: 10.1007/978-981-99-4366-1_4. Online publication date: 19-Jun-2023
    • (2018) NUMAPROF, A NUMA Memory Profiler. Euro-Par 2018: Parallel Processing Workshops, pages 159-170. DOI: 10.1007/978-3-030-10549-5_13. Online publication date: 27-Aug-2018
    • (2015) An Instruction-Set Randomization Using Length-Preserving Permutation. Proceedings of the 2015 IEEE Trustcom/BigDataSE/ISPA - Volume 01, pages 376-383. DOI: 10.1109/Trustcom.2015.397. Online publication date: 20-Aug-2015
