research-article
Open access

Neural reverse engineering of stripped binaries using augmented control flow graphs

Published: 13 November 2020

Abstract

We address the problem of reverse engineering stripped executables, which contain no debug information. This problem is challenging because stripped executables carry little syntactic information and because compiler optimizations give rise to diverse assembly code patterns. We present a novel approach for predicting procedure names in stripped executables. Our approach combines static analysis with neural models. The main idea is to use static analysis to obtain augmented representations of call sites, encode the structure of these call sites using the control-flow graph (CFG), and finally generate a target name while attending to these call sites. We use our representation to drive graph-based, LSTM-based, and Transformer-based architectures. Our evaluation shows that our models produce predictions that are difficult and time-consuming for humans to make, while improving on existing methods by 28% and on state-of-the-art neural textual models that do not use any static analysis by 100%. Code and data for this evaluation are available at https://github.com/tech-srl/Nero.
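To make the idea of call sites ordered by the CFG more concrete, the following is a minimal Python sketch. It is not the authors' implementation: the toy graph, the `call_site_paths` helper, and the argument labels such as `CONST` and `RET:socket` are hypothetical names chosen for illustration only. It assumes the static analysis has already produced, for each basic block, a list of call sites with abstracted arguments, and it shows only how such call sites could be read off along acyclic CFG paths before being handed to a neural encoder.

```python
from typing import Dict, List, Set, Tuple

# A call site: callee name plus abstracted argument kinds.
# The labels ("CONST", "RET:socket", ...) are hypothetical placeholders.
CallSite = Tuple[str, Tuple[str, ...]]


def call_site_paths(
    cfg: Dict[str, List[str]],
    calls: Dict[str, List[CallSite]],
    entry: str,
) -> List[List[CallSite]]:
    """Return the call-site sequence along every acyclic CFG path from entry."""
    results: List[List[CallSite]] = []

    def walk(block: str, visited: Set[str], seq: List[CallSite]) -> None:
        # Append the call sites found in this basic block, in order.
        seq = seq + calls.get(block, [])
        # Skip successors already on the current path (back edges).
        successors = [s for s in cfg.get(block, []) if s not in visited]
        if not successors:
            # No unvisited successor: the path ends here.
            results.append(seq)
            return
        for succ in successors:
            walk(succ, visited | {succ}, seq)

    walk(entry, {entry}, [])
    return results


# Toy procedure: open a socket, then either connect or report an error.
cfg = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
calls = {
    "entry": [("socket", ("CONST", "CONST", "CONST"))],
    "then": [("connect", ("RET:socket", "STACK", "CONST"))],
    "else": [("perror", ("GLOBAL",))],
    "exit": [("close", ("RET:socket",))],
}

paths = call_site_paths(cfg, calls, "entry")
callee_names = [[callee for callee, _ in path] for path in paths]
```

In this toy setting, each path yields an ordered sequence of augmented call sites. A sequence model or graph model could consume such sequences and attend to them while generating the subtokens of a procedure name. The encoder and decoder themselves are outside the scope of this sketch.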

Supplementary Material

Auxiliary Presentation Video (oopsla20main-p526-p-video.mp4)
This is a presentation video of our talk at OOPSLA'20.

Information & Contributors

Information

Published In

Proceedings of the ACM on Programming Languages, Volume 4, Issue OOPSLA
November 2020
3108 pages
EISSN: 2475-1421
DOI: 10.1145/3436718
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Neural Reverse Engineering
  2. Static Binary Analysis

Funding Sources

  • Israel Ministry of Science and Technology
