DOI: 10.1109/CGO51591.2021.9370313
research-article

Cinnamon: a domain-specific language for binary profiling and monitoring

Published: 17 September 2021

Abstract

Binary instrumentation and rewriting frameworks provide a powerful way of implementing custom analysis and transformation techniques for applications ranging from performance profiling to security monitoring. However, using these frameworks to write even simple analyses and transformations is non-trivial. Developers often need to write framework-specific boilerplate code and work with low-level and complex programming details. This not only results in hundreds (or thousands) of lines of code, but also leaves significant room for error.
To address this, we introduce Cinnamon, a domain-specific language designed to write programs for binary profiling and monitoring. Cinnamon's abstractions allow the programmer to focus on implementing their technique in a platform-independent way, without worrying about complex lower-level details. Programmers can use these abstractions to perform analysis and instrumentation at different locations and granularity levels in the binary. The flexibility of Cinnamon also enables its programs to be mapped to static, dynamic or hybrid analysis and instrumentation approaches. As a proof of concept, we target Cinnamon to three different binary frameworks by implementing a custom Cinnamon-to-C/C++ compiler and integrating the generated code within these frameworks. We further demonstrate the ability of Cinnamon to express a range of profiling and monitoring tools through different use cases.

Published In

CGO '21: Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization
February 2021
395 pages
ISBN: 9781728186139
General Chair: Jae W. Lee

In-Cooperation

IEEE CS

Publisher

IEEE Press

Author Tags

1. binary analysis and instrumentation
2. domain-specific language
3. profiling

Qualifiers

Research-article

Conference

CGO '21: 19th ACM/IEEE International Symposium on Code Generation and Optimization
February 27 - March 3, 2021
Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate: 312 of 1,061 submissions, 29%
