Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article
Open access

WYSINWYX: What you see is not what you eXecute

Published: 13 August 2010 Publication History

Abstract

Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically allocated memory objects of a stripped executable, and to track the flow of values through them. The article presents the algorithms that we developed, explains how they are used to recover Intermediate Representations (IRs) from executables that are similar to the IRs that would be available if one started from source code, and describes their application in the context of program understanding and automated bug hunting.
Unlike algorithms for analyzing executables that existed prior to our work, the ones presented in this article provide useful information about memory accesses, even in the absence of debugging information. The ideas described in the article are incorporated in a tool for analyzing Intel x86 executables, called CodeSurfer/x86. CodeSurfer/x86 builds a system dependence graph for the program, and provides a GUI for exploring the graph by (i) navigating its edges, and (ii) invoking operations, such as forward slicing, backward slicing, and chopping, to discover how parts of the program can impact other parts.
To assess the usefulness of the IRs recovered by CodeSurfer/x86 in the context of automated bug hunting, we built a tool on top of CodeSurfer/x86, called Device-Driver Analyzer for x86 (DDA/x86), which analyzes device-driver executables for bugs. Without the benefit of either source code or symbol-table/debugging information, DDA/x86 was able to find known bugs (that had been discovered previously by source-code analysis tools), along with useful error traces, while having a low false-positive rate. DDA/x86 is the first known application of program analysis/verification techniques to industrial executables.

References

[1]
Aigner, G. and Hölzle, U. 1996. Eliminating virtual function calls in C++ programs. In Proceedings of the European Conference on Object-Oriented Programming.
[2]
aiT. aiT worst-case execution time analyzer. http://www.AbsInt.com/aiT.
[3]
Amme, W., Braun, P., Zehendner, E., and Thomasset, F. 2000. Data dependence analysis of assembly code. Int. J. Parall. Program. 28, 5, 431--467.
[4]
Backes, W. 2004. Programmanalyse des xrtl zwischencodes. Ph.D. thesis, Universitaet des Saarlandes. (In German.)
[5]
Bala, V., Duesterwald, E., and Banerjia, S. 2000. Dynamo: A transparent runtime optimization system. In Proceedings of the Conference on Programming Language Design and Implementation.
[6]
Balakrishnan, G. 2007. WYSINWYX: What You See Is Not What You eXecute. Ph.D. thesis, Tech. rep. 1603. Computer Science Department, University of Wisconsin, Madison, WI.
[7]
Balakrishnan, G., Gruian, R., Reps, T., and Teitelbaum, T. 2005a. Codesurfer/x86 -- A platform for analyzing x86 executables, (tool demonstration paper). In Proceedings of the International Conference on Compiler Construction.
[8]
Balakrishnan, G. and Reps, T. 2004. Analyzing memory accesses in x86 executables. In Proceedings of the International Conference on Compiler Construction. 5--23.
[9]
Balakrishnan, G. and Reps, T. 2006. Recency-Abstraction for heap-allocated storage. In Proceedings of the Static Analysis Symposium.
[10]
Balakrishnan, G. and Reps, T. 2007. DIVINE: DIscovering Variables IN Executables. In Proceedings of the Conference on Verification, Model Checking, and Abstract Interpretation.
[11]
Balakrishnan, G. and Reps, T. 2008. Analyzing stripped device-driver executables. In Proceedings of the Conference on Tools and Algorithms for the Construction and Analysis of Systems.
[12]
Balakrishnan, G., Reps, T., Kidd, N., Lal, A., Lim, J., Melski, D., Gruian, R., Yong, S., Chen, C.-H., and Teitelbaum, T. 2005b. Model checking x86 executables with CodeSurfer/x86 and WPDS++. In Proceedings of the Conference on Computer Aided Verification.
[13]
Balakrishnan, G., Reps, T., Melski, D., and Teitelbaum, T. 2007. WYSINWYX: What you see is not what you execute. In Proceedings of the IFIP Working Conference on Verified Software: Theories, Tools, Experiments.
[14]
Ball, T., Bounimova, E., Cook, B., Levin, V., Lichtenberg, J., McGarvey, C., Ondrusek, B., Rajamani, S., and Ustuner, A. 2006. Thorough static analysis of device drivers. In Proceedings of the European Conference on Computer Systems.
[15]
Ball, T. and Rajamani, S. 2000. Bebop: A symbolic model checker for Boolean programs. In Proceedings of the Spin Workshop. Lecture Notes in Computer Science, vol. 1885. Springer, 113--130.
[16]
Ball, T. and Rajamani, S. 2001. The SLAM toolkit. In Proceedings of the Conference on Computer-Aided Verification.
[17]
Bergeron, J., Debbabi, M., Desharnais, J., Erhioui, M., Lavoie, Y., and Tawbi, N. 2001. Static detection of malicious code in executable programs. Int. J. Req. Engin.
[18]
Bergeron, J., Debbabi, M., Erhioui, M., and Ktari, B. 1999. Static analysis of binary code to isolate malicious behaviors. In Proceedings of the International Workshops on Enabling Technologies: Infrastructures for Collaborative Enterprises.
[19]
Blanchet, B., Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D., and Rival, X. 2003. A static analyzer for large safety-critical software. In Proceedings of the Conference on Programming Language Design and Implementation. 196--207.
[20]
Bouajjani, A., Esparza, J., and Maler, O. 1997. Reachability analysis of pushdown automata: Application to model checking. In Proceedings of the International Conference on Concurrency Theory (CONCUR'97).
[21]
Bourdoncle, F. 1993. Efficient chaotic iteration strategies with widenings. In Proceedings of the International Conference on Formal Methods in Programming and Their Application. Lecture Notes in Computer Science. Springer.
[22]
Brown, R., Khazan, R., and Zhivich, M. 2007. AWE: Improving software analysis through modular integration of static and dynamic analyses. In Workshop on Program Analysis for Software Tools and Engineering (PASTE'07).
[23]
Brumley, D. and Newsome, J. 2006. Alias analysis for assembly. Tech. rep. CMU-CS-06-180, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
[24]
Bryant, R. 1986. Graph-Based algorithms for Boolean function manipulation. IEEE Trans. Comput. C-35, 6, 677--691.
[25]
Burch, J., Clarke, E., McMillan, K., Dill, D., and Hwang, L. 1990. Symbolic model checking: 1020 states and beyond. In Proceedings of the Symposium on Logic in Computer Science. 428--439.
[26]
Bush, W., Pincus, J., and Sielaff, D. 2000. A static analyzer for finding dynamic programming errors. Softw. Pract. Exper. 30, 775--802.
[27]
Cai, H., Shao, Z., and Vaynberg, A. 2007. Certified self-modifying code. In Proceedings of the Conference on Programming Language Design and Implementation. ACM Press, New York, 66--77.
[28]
Chandra, S. and Reps, T. 1999. Physical type checking for C. In Proceedings of the Conference on Program Analysis for Software Tools and Engineering. 66--75.
[29]
Chang, B.-Y., Harren, M., and Necula, G. 2006. Analysis of low-level code using cooperating decompilers. In Proceedings of the Static Analysis Symposium.
[30]
Chen, H. and Wagner, D. 2002. MOPS: An infrastructure for examining security properties of software. In Proceedings of the Conference on Computer and Communications Security. 235--244.
[31]
Chou, A., Yang, J., Chelf, B., Hallem, S., and Engler, D. 2001. An empirical study of operating systems errors. In Proceedings of the Symposium on Operating Systems Principles.
[32]
Christodorescu, M., Goh, W.-H., and Kidd, N. 2005. String analysis for x86 binaries. In Proceedings of the Conference on Program Analysis for Software Tools and Engineering.
[33]
Cifuentes, C. and Fraboulet, A. 1997a. Interprocedural data flow recovery of high-level language code from assembly. Tech. rep. 421, University of Queensland.
[34]
Cifuentes, C. and Fraboulet, A. 1997b. Intraprocedural static slicing of binary executables. In Proceedings of the International Conference on Software Maintenance. 188--195.
[35]
Cifuentes, C., Simon, D., and Fraboulet, A. 1998. Assembly to high-level language translation. In Proceedings of the International Conference on Software Maintenance. 228--237.
[36]
Clarke, E., Grumberg, O., Jha, S., Lu, Y., and Veith, H. 2000. Counterexample-Guided abstraction refinement. In Proceedings of the Conference on Computer Aided Verification. 154--169.
[37]
CodeSonar. CodeSonar, GrammaTech, Inc. www.grammatech.com/products/codesonar.
[38]
CodeSurfer. CodeSurfer, GrammaTech, Inc. www.grammatech.com/products/codesurfer.
[39]
Cooper, K. and Kennedy, K. 1988. Interprocedural side-effect analysis in linear time. In Proceedings of the Conference on Program Language Design and Implementation. 57--66.
[40]
Corbett, J., Dwyer, M., Hatcliff, J., Laubach, S., Pasareanu, C., Robby, and Zheng, H. 2000. Bandera: Extracting finite-state models from Java source code. In Proceedings of the International Conference on Software Engineering. 439--448.
[41]
Cousot, P. and Cousot, R. 1977. Abstract interpretation: A unified lattice model for static analysis of programs by construction of approximation of fixed points. In Proceedings of the Symposium on Principles of Programming Language. 238--252.
[42]
Cousot, P. and Cousot, R. 1979. Systematic design of program analysis frameworks. In Proceedings of the Symposium on Principles of Programming Language. 269--282.
[43]
Cousot, P. and Halbwachs, N. 1978. Automatic discovery of linear constraints among variables of a program. In Proceedings of the Symposium on Principles of Programming Language. 84--96.
[44]
Cova, M., Felmetsger, V., Banks, G., and Vigna, G. 2006. Static detection of vulnerabilities in x86 executables. In Proceedings of the Annual Computer Security Applications Conference.
[45]
Coverity. Coverity Prevent. www.coverity.com/html/coverity-prevent-static-analysis.html.
[46]
Das, M., Lerner, S., and Seigle, M. 2002. ESP: Path-Sensitive program verification in polynomial time. In Proceedings of the Conference on Programming Language Design and Implementation. ACM Press, New York, 57--68.
[47]
De Sutter, B., De Bus, B., De Bosschere, K., Keyngnaert, P., and Demoen, B. 2000. On the static analysis of indirect control transfers in binaries. In Proceedings of the International Confernece on Parallel and Distributed Processing Techniques and Applications.
[48]
Debray, S., Linn, C., Andrews, G., and Schwarz, B. 2004. Stack analysis of x86 executables. www.cs.arizona.edu/~debray/Publications/stack-analysis.pdf.
[49]
Debray, S., Muth, R., and Weippert, M. 1998. Alias analysis of executable code. In Proceedings of the Symposium on Principles of Programming Language. 12--24.
[50]
Dhurjati, D., Das, M., and Yang, Y. 2006. Path-Sensitive dataflow analysis with iterative refinement. In Proceedings of the Static Analysis Symposium. 425--442.
[51]
DMCA §1201. §1201. Circumvention of Copyright Protection Systems. www.copyright.gov/title17/92chap12.html#1201.
[52]
Dor, N., Rodeh, M., and Sagiv, M. 2003. CSSV: Towards a realistic tool for statically detecting all buffer overflows in C. In Proceedings of the Conference on Program Language Design and Implementation. 155--167.
[53]
Eidorff, P., Henglein, F., Mossin, C., Niss, H., Sørensen, M., and Tofte, M. 1999. Anno Domini: From type theory to year 2000 conversion tool. In Proceedings of the Symposium on Principles of Programming Language. 1--14.
[54]
Emmerik, M. V. 2007. Static single assignment for decompilation. Ph.D. thesis, School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Australia.
[55]
Engler, D., Chelf, B., Chou, A., and Hallem, S. 2000. Checking system rules using system-specific, programmer-written compiler extensions. In Proceedings of the Conference on Operating System Design and Implementation. 1--16.
[56]
Feng, X., Shao, Z., Vaynberg, A., Xiang, S., and Ni, Z. 2006. Modular verification of as - sembly code with stack-based control abstractions. In Proceedings of the Conference on Program Language Design and Implementation. 401--414.
[57]
Ferdinand, C. 2009. Personal communication.
[58]
Ferdinand, C., Heckmann, R., Langenbach, M., Martin, F., Schmidt, M., Theiling, H., Thesing, S., and Wilhelm, R. 2001. Reliable and precise WCET determination for a real-life processor. In Proceedings of the 1st International Workshop on Embedded Software (EMSOFT'01). 469--485.
[59]
Finkel, A., Willems, B., and Wolper, P. 1997. A direct symbolic approach to model checking pushdown systems. Elec. Not. Theor. Comput. Sci. 9.
[60]
Fischer, J., Jhala, R., and Majumdar, R. 2005. Joining dataflow with predicates. In Proceedings of the Conference on Foundations of Software Engineering.
[61]
Gopan, D. and Reps, T. 2006. Lookahead widening. In Proceedings of the Conference on Computer Aided Verification.
[62]
Gopan, D. and Reps, T. 2007. Low-Level library analysis and summarization. In Proceedings of the Conference on Computer Aided Verification.
[63]
Graf, S. and Saïdi, H. 1997. Construction of abstract state graphs with PVS. In Proceedings of the Conference on Computer Aided Verification. Lecture Notes in Computer Science, vol. 1254. Springer, 72--83.
[64]
Grewe, D. 2008. Static congruence analysis of binaries. Bachelors thesis, University des Saarlandes.
[65]
Guo, B., Bridges, M., Triantafyllis, S., Ottoni, G., Raman, E., and August, D. 2005. Practical and accurate low-level pointer analysis. In Proceedings of the 3rd International Symposium on Code Generation and Optimization. 291--302.
[66]
Halbwachs, N., Proy, Y.-E., and Roumanoff, P. 1997. Verification of real-time systems using linear relation analysis. Formal Methods Syst. Des. 11, 2, 157--185.
[67]
Havelund, K. and Pressburger, T. 2000. Model checking Java programs using Java PathFinder. Softw. Tools Tech. Transfer 2, 4.
[68]
Henzinger, T., Jhala, R., Majumdar, R., and McMillan, K. L. 2004. Abstractions from proofs. In Proceedings of the Symposium on Principles of Programming Language. 232--244.
[69]
Henzinger, T., Jhala, R., Majumdar, R., and Sutre, G. 2002. Lazy abstraction. In Proceedings of the Symposium on Principles of Programming Language. 58--70.
[70]
Hind, M. 2001. Pointer analysis: Haven't we solved this problem yet? In Proceedings of the Conference on Program Analysis for Software Tools and Engineering.
[71]
Holley, L. and Rosen, B. 1981. Qualified data flow problems. Trans. Softw. Engin. 7, 1, 60--78.
[72]
Howard, M. 2002. Some bad news and some good news. MSDN, Microsoft Corp., msdn2.microsoft.com/en-us/library/ms972826.aspx.
[73]
IDAPro. IDAPro disassembler. www.hex-rays.com/idapro/.
[74]
Kiriansky, V., Bruening, D., and Amarasinghe, S. 2002. Secure execution via program shepherding. In Proceedings of the USENIX Security Symposium.
[75]
Kiss, A., J. Jász, G. L., and Gyimóthy, T. 2003. Interprocedural static slicing of binary executables. In Proceedings of the International Workshop on Source Code Analysis and Manipulation.
[76]
Kruegel, C., Kirda, E., Mutz, D., Robertson, W., and Vigna, G. 2005. Automating mimicry attacks using static binary analysis. In Proceedings of the USENIX Security Symposium.
[77]
Kurshan, R. 1994. Computer-Aided Verification of Coordinating Processes. Princeton University Press.
[78]
Lal, A., Reps, T., and Balakrishnan, G. 2005. Extended weighted pushdown systems. In Proceedings of the Conference on Computer Aided Verification.
[79]
Larus, J. and Schnarr, E. 1995. EEL: Machine-Independent executable editing. In Proceedings of the Conference on Programming Language Design and Implementation. 291--300.
[80]
Lim, J. and Reps, T. 2008. A system for generating static analyzers for machine instructions. In Proceedings of the International Conference on Compiler Construction.
[81]
Lim, J., Reps, T., and Liblit, B. 2006. Extracting output formats from executables. In Proceedings of the Working Conference on Reverse Engineering.
[82]
Manevich, R., Ramalingam, G., Field, J., Goyal, D., and Sagiv, M. 2002. Compactly representing first-order structures for static analysis. In Proceedings of the Static Analysis Symposium. 196--212.
[83]
Mauborgne, L. and Rival, X. 2005. Trace partitioning in abstract interpretation based static analyzers. In Proceedings of the European Symposium on Programming.
[84]
Miné, A. 2006. Field-Sensitive value analysis of embedded C programs with union types and pointer arithmetics. In Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems.
[85]
Müller-Olm, M. and Seidl, H. 2005. Analysis of modular arithmetic. In Proceedings of the European Symposium on Programming.
[86]
Mycroft, A. 1999. Type-Based decompilation. In Proceedings of the European Symposium on Programming.
[87]
Myers, E. 1984. Efficient applicative data types. In Proceedings of the Symposium on Principles of Programming Languages.
[88]
Ni, Z. and Shao, Z. 2006. Certified assembly programming with embedded code pointers. In Proceedings of the Symposium on Principles of Programming Languages. 320--333.
[89]
Nita, M., Grossman, D., and Chambers, C. 2008. A theory of platform-dependent low-level software. In Proceedings of the Symposium on Principles of Programming Languages.
[90]
O'Callahan, R. and Jackson, D. 1997. Lackwit: A program understanding tool based on type inference. In Proceedings of the International Conference on Software Engineering.
[91]
Oney, W. 2003. Programming the Microsoft Windows Driver Model. Microsoft Press.
[92]
Pande, H. and Ryder, B. 1996. Data-Flow-Based virtual function resolution. In Proceedings of the Static Analysis Symposium. 238--254.
[93]
Phoenix. Phoenix software optimization and analysis framework. http://connect.microsoft.com/phoenix.
[94]
Pioli, A. and Hind, M. 1999. Combining interprocedural pointer analysis and conditional constant propagation. Tech. rep. RC 21532(96749), IBM T.J. Watson Research Center.
[95]
PREfast. 2004. PREfast with driver-specific rules. www.microsoft.com/whdc/archive/PREfast-drv.mspx.
[96]
Pugh, W. 1988. Incremental computation and the incremental evaluation of functional programs. Ph.D. thesis, Cornell University.
[97]
Ramalingam, G., Field, J., and Tip, F. 1999. Aggregate structure identification and its application to program analysis. In Proceedings of the Symposium on Principles of Programming Languages.
[98]
Regehr, J., Reid, A., and Webb, K. 2005. Eliminating stack overflow by abstract interpretation. ACM Trans. Embed. Comput. Syst. 4, 4, 751--778.
[99]
Reps, T. and Balakrishnan, G. 2008. Improved memory-access analysis for x86 executables. In Proceedings of the International Conference on Compiler Construction.
[100]
Reps, T., Balakrishnan, G., and Lim, J. 2006. Intermediate-Representation recovery from low-level code. In Proceedings of the Conference on Part. Evaluation and Semantics-Based Program Manipulation.
[101]
Reps, T., Balakrishnan, G., Lim, J., and Teitelbaum, T. 2005. A next-generation platform for analyzing executables. In Proceedings of the Asian Symposium on Programming Languages and Systems.
[102]
Reps, T., Teitelbaum, T., and Demers, A. 1983. Incremental context-dependent analysis for language-based editors. Trans. Program. Lang. Syst. 5, 3, 449--477.
[103]
Rival, X. 2003. Abstract interpretation based certification of assembly code. In Proceedings of the Conference on Verification, Model Checking, and Abstract Interpretation.
[104]
Rugina, R. and Rinard, M. 2005. Symbolic bounds analysis of pointers, array indices, and accessed memory regions. Trans. Program. Lang. Syst. 27, 2, 185--235.
[105]
Sharir, M. and Pnueli, A. 1981. Two approaches to interprocedural data flow analysis. In Program Flow Analysis: Theory and Applications. S. Muchnick and N. Jones, Eds. Prentice-Hall, Englewood Cliffs, NJ, Chapter 7, 189--234.
[106]
Siff, M. and Reps, T. 1996. Program generalization for software reuse: From C to C++. In Proceedings of the Conference on Foundations of Software Engineering.
[107]
Srivastava, A., Edwards, A., and Vo, H. 2001. Vulcan: Binary transformation in a distributed environment. MSR-TR-2001-50, Microsoft Research.
[108]
Srivastava, A. and Eustace, A. 1994. ATOM - A system for building customized program analysis tools. In Proceedings of the Conference on Programming Language Design and Implementation.
[109]
Swift, M., Annamalai, M., Bershad, B., and Levy, H. 2004. Recovering device drivers. In Proceedings of the Symposium on Operating Systems Design and Implementation.
[110]
Swift, M., Bershad, B., and Levy, H. 2005. Improving the reliability of commodity operating systems. ACM Trans. Comput. Syst. 23, 1.
[111]
van Deursen, A. and Moonen, L. 1998. Type inference for COBOL systems. In Proceedings of the Working Conference on Reverse Engineering.
[112]
van Deursen, A. and Moonen, L. 1998. Type inference for COBOL systems. In Procedings of the Working Conference on Reverse Engineering.
[113]
Wagner, D., Foster, J., Brewer, E., and Aiken, A. 2000. A first step towards automated detection of buffer overrun vulnerabilities. In Proceedings of the Conference on Network and Distributed Systems Security.
[114]
Wall, D. 1992. Systems for late code modification. In Code Generation—Concepts, Tools, Techniques, R. Giegerich and S. Graham, Eds. Springer.
[115]
WHDC. 2007. C++ for kernel mode drivers: Pros and cons. www.microsoft.com/whdc/driver/kernel/KMcode.mspx.
[116]
WHQL. 2004. Defrauding the WHQL driver certification process. blogs.msdn.com/oldnewthing/archive/2004/03/05/84469.aspx.
[117]
Wikipedia: Enforceability. Software license agreement: Enforceability (in the United States). en.wikipedia.org/wiki/Software license agreement#Enforceability, Sept. 19, 2009.
[118]
Wikipedia: shrink-wrap and click-wrap licenses. Software license agreement: Shrink-wrap and click-wrap licenses. en.wikipedia.org/wiki/Software license agreement#Shrink-wrap and click-wrap licenses, Sept. 19, 2009.
[119]
Wilhelm, R., Engblom, J., Ermedahl, A., Holsti, N., Thesing, S., Whalley, D., Bernat, G., Ferdinand, C., Heckmann, R., Mitra, T., Mueller, F., Puaut, I., Puschner, P., Staschu- lat, J., and Stenström, P. 2008. The worst-case execution-time problem—Overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7, 3, 1--53.
[120]
Wilson, R. and Lam, M. 1995. Efficient context-sensitive pointer analysis for C programs. In Proceedings of the Conference on Programming Language Design and Implementation. 1--12.
[121]
Windows DDK. 2003. Windows Server 2003 DDK. www.microsoft.com/whdc/devtools/ddk.
[122]
Xu, Z., Miller, B., and Reps, T. 2000. Safety checking of machine code. In Proceedings of the Conference on Programming Language Design and Implementation.
[123]
Xu, Z., Miller, B., and Reps, T. 2001. Typestate checking of machine code. In Proceedings of the European Symposium on Programming.
[124]
Zhang, J., Zhao, R., and Pang, J. 2007. Parameter and return-value analysis of binary executables. In Proceedings of the Computer Software and Applications Conference.

Cited By

View all
  • (2025)Identifying runtime libraries in statically linked linux binariesFuture Generation Computer Systems10.1016/j.future.2024.107602164(107602)Online publication date: Mar-2025
  • (2025)Lift-Offline: Instruction Lifter GeneratorsStatic Analysis10.1007/978-3-031-74776-2_4(86-119)Online publication date: 21-Jan-2025
  • (2024)Binsweep: Reliably Restricting Untrusted Instruction Streams with Static Binary Analysis and Control-Flow IntegrityProceedings of the 2024 on Cloud Computing Security Workshop10.1145/3689938.3694778(53-65)Online publication date: 19-Nov-2024
  • Show More Cited By

Recommendations

Reviews

Ramesh S

Source code analysis is a very well-known and well-developed area that deals with analyzing programs written in high-level languages to detect the presence of defects and to extract useful information about the program. Balakrishnan and Reps discuss the limitations of source code analysis when it comes to looking for various security vulnerabilities that often occur or are revealed only at the machine code level; also, the source code is not always available for analysis, especially for third-party-developed software. These considerations have motivated the authors to look at the static analysis of machine code with little additional information. Machine-level static analysis is very challenging, and involves extracting simple information such as the variables used by the program, and the control flow and data flow dependency, which deals with memory addresses. The analysis engine has to identify the correct chunks of address spaces as variables and perform the analysis. The large body of work in this area carried out by Reps and some of his students over the last several years is presented in this paper. As a result, it is voluminous and rich in content. It should be very useful to researchers interested in static analysis as applied to machine code. A number of algorithms described in the paper include the extraction of intermediate representation (IR) of the code, which is very similar to what is obtained from a high-level language. Value-set analysis is an important and novel technique proposed by the authors that would be very useful in performing machine code analysis. The paper is very extensive and describes the various algorithms used for the analysis clearly. It is very well written and can serve as an important resource for people interested in the static analysis of machine code. It contains a few examples and some details of the implementation of algorithms that have been used in the extension of their tool, CodeSurfer, for executables. Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Programming Languages and Systems
ACM Transactions on Programming Languages and Systems  Volume 32, Issue 6
August 2010
215 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/1749608
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 August 2010
Accepted: 01 December 2009
Revised: 01 September 2009
Received: 01 February 2009
Published in TOPLAS Volume 32, Issue 6

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Abstract interpretation
  2. context-sensitive analysis
  3. data structure recovery
  4. interprocedural dataflow analysis
  5. pointer analysis
  6. reverse engineering
  7. static analysis

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)959
  • Downloads (Last 6 weeks)90
Reflects downloads up to 26 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Identifying runtime libraries in statically linked linux binariesFuture Generation Computer Systems10.1016/j.future.2024.107602164(107602)Online publication date: Mar-2025
  • (2025)Lift-Offline: Instruction Lifter GeneratorsStatic Analysis10.1007/978-3-031-74776-2_4(86-119)Online publication date: 21-Jan-2025
  • (2024)Binsweep: Reliably Restricting Untrusted Instruction Streams with Static Binary Analysis and Control-Flow IntegrityProceedings of the 2024 on Cloud Computing Security Workshop10.1145/3689938.3694778(53-65)Online publication date: 19-Nov-2024
  • (2024)VarLifter: Recovering Variables and Types from Bytecode of Solidity Smart ContractsProceedings of the ACM on Programming Languages10.1145/36897118:OOPSLA2(1-29)Online publication date: 8-Oct-2024
  • (2024)BinVuGAL: Binary vulnerability detection method based on graph neural network combined with assembly language modelProceedings of the 2024 3rd International Conference on Cryptography, Network Security and Communication Technology10.1145/3673277.3673305(159-163)Online publication date: 19-Jan-2024
  • (2024)When Compiler Optimizations Meet Symbolic Execution: An Empirical StudyProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security10.1145/3658644.3670372(4212-4225)Online publication date: 2-Dec-2024
  • (2024)Compiling with Abstract InterpretationProceedings of the ACM on Programming Languages10.1145/36563928:PLDI(368-393)Online publication date: 20-Jun-2024
  • (2024)NativeSummary: Summarizing Native Binary Code for Inter-language Static Analysis of Android AppsProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680335(971-982)Online publication date: 11-Sep-2024
  • (2024)Define-Use Guided Path Exploration for Better Forced ExecutionProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652128(287-299)Online publication date: 11-Sep-2024
  • (2024)Polynima: Practical Hybrid Recompilation for Multithreaded BinariesProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650065(1126-1141)Online publication date: 22-Apr-2024
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media