DOI: 10.1145/3474369.3486865
research-article
Open access

StackBERT: Machine Learning Assisted Static Stack Frame Size Recovery on Stripped and Optimized Binaries

Published: 15 November 2021

Abstract

The call stack represents one of the core abstractions that compiler-generated programs leverage to organize binary execution at runtime. For many use cases, reasoning about stack accesses of binary functions is crucial: security-sensitive applications may require patching even after deployment, and binary instrumentation, rewriting, and lifting all necessitate detailed knowledge about the function frame layout of the affected program. As no comprehensive solution to the stack symbolization problem exists to date, existing approaches have to resort to workarounds like emulated stack environments, resulting in increased runtime overheads.
In this paper, we present StackBERT, a framework to statically reason about and reliably recover stack frame information of binary functions in stripped and highly optimized programs. The core idea behind our approach is to formulate binary analysis as a self-supervised learning problem by automatically generating ground truth data from a large corpus of open-source programs. We train a state-of-the-art Transformer model with self-attention and finetune it for stack frame size prediction. We show that our finetuned model yields highly accurate estimates of a binary function's stack size from its function body alone across different instruction-set architectures, compiler toolchains, and optimization levels. We successfully verify the static estimates against runtime data through dynamic executions of standard benchmarks and additional studies, demonstrating that StackBERT's predictions generalize to 93.44% of stripped and highly optimized test binaries not seen during training. We envision these results to be useful for improving binary rewriting and lifting approaches in the future.
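
To make the prediction target concrete, the following is a minimal sketch of a naive prologue heuristic for x86-64. It is not StackBERT's method, which relies on a learned Transformer model. The function name `naive_frame_size`, the Intel-syntax text input, and the set of instructions treated as part of the prologue are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for Intel-syntax x86-64 disassembly lines (an assumption).
_PUSH = re.compile(r"^\s*push\s+\w+\s*$", re.IGNORECASE)
_SET_FP = re.compile(r"^\s*mov\s+rbp\s*,\s*rsp\s*$", re.IGNORECASE)
_SUB_RSP = re.compile(r"^\s*sub\s+rsp\s*,\s*(0x[0-9a-f]+|\d+)\s*$", re.IGNORECASE)


def naive_frame_size(instructions, word_size=8):
    """Sum the bytes reserved by a function prologue (illustrative heuristic only)."""
    size = 0
    for ins in instructions:
        if _PUSH.match(ins):
            # Each register push moves the stack pointer by one machine word.
            size += word_size
        elif _SET_FP.match(ins):
            # Setting up the frame pointer does not change the stack pointer.
            continue
        else:
            m = _SUB_RSP.match(ins)
            if m is None:
                # First instruction outside the assumed prologue: stop scanning.
                break
            tok = m.group(1)
            size += int(tok, 16) if tok.lower().startswith("0x") else int(tok)
    return size


sample = [
    "push rbp",
    "mov rbp, rsp",
    "push rbx",
    "sub rsp, 0x28",
    "mov rax, rdi",
    "sub rsp, 8",   # later adjustment, ignored by this heuristic
    "ret",
]
print(naive_frame_size(sample))  # prints 56
```

On the sample, the heuristic counts two 8-byte pushes plus the 0x28-byte adjustment. Because it stops at the first instruction outside the assumed prologue, it ignores any stack-pointer adjustment that appears later in the function body, which is one way a hand-written rule of this kind can fall short.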

Supplementary Material

MP4 File (AISec21-fp21.mp4)
In this talk, we present our work StackBERT - a framework to statically reason about and reliably recover stack frame information of binary functions in stripped and optimized binaries. We observe that the function call stack is a critical abstraction to reason about for binary lifting engines. To aid this task, we focus on solving the problem of statically predicting the function stack frame size given its raw disassembly. StackBERT formulates this as a supervised learning problem by automatically generating ground truth data from a large corpus of open-source programs. We train a state-of-the-art Transformer model and finetune it for stack-frame size prediction. We show that our finetuned model yields highly accurate estimates of a binary function's stack size from its function body alone across different instruction-set architectures, compiler toolchains, and optimization levels. We demonstrate that StackBERT's predictions generalize to 93.44% of test binaries not seen during training.

Published In

AISec '21: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security
November 2021
210 pages
ISBN: 9781450386579
DOI: 10.1145/3474369
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. binary lifting
  2. machine learning
  3. recompilation
  4. stack symbolization

Qualifiers

  • Research-article

Conference

CCS '21
Acceptance Rates

Overall Acceptance Rate 94 of 231 submissions, 41%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 585
  • Downloads (Last 12 months): 206
  • Downloads (Last 6 weeks): 29
Reflects downloads up to 09 Nov 2024
