Binary analysis is a critical discipline within cybersecurity and software development. It examines compiled executable files, known as binaries, to understand their behavior, vulnerabilities, and functionality. Dynamic binary analysis is a cornerstone of this field: it observes and manipulates binaries as they run, giving insight into program execution paths, memory state, system calls, and interactions with external resources. This method is indispensable for identifying vulnerabilities such as buffer overflows, race conditions, and memory leaks, as well as for reverse engineering proprietary software, improving software performance, and enhancing overall system security.
In this dissertation, we propose to improve the performance of dynamic binary analysis by adding elasticity. Our key insight is that the amount of runtime analysis should scale with the proportion of program state under examination. Simply put, runtime analysis should be performed only when necessary, so that the overhead stays minimal.
First, we apply the idea of elastic analysis to whole-system dynamic taint analysis. Compared to process-level taint analysis, the main barrier to applying whole-system dynamic taint analysis in practice is its large slowdown, which can reach 30 times. Elastic whole-system taint analysis strives to perform taint analysis as infrequently as possible while maintaining precision and accuracy. Our evaluation shows that the prototype, DECAF++, achieves a speedup of over 202\% on various benchmarks.
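The elastic idea can be illustrated with a toy sketch. This is not DECAF++'s implementation. The instruction format, the register names, and the function `run_elastic` are hypothetical and chosen only for illustration. The sketch assumes a tiny register machine in which an `input` instruction is the only taint source, and it propagates taint only while at least one register is tainted.

```python
def run_elastic(program):
    """Run a toy register program, propagating taint only while taint is live.

    Each instruction is a tuple (op, dst, *srcs). This format is hypothetical.
    Returns the set of tainted registers and the number of instructions that
    were executed with taint propagation enabled.
    """
    tainted = set()      # registers currently holding tainted data
    tracked_steps = 0    # instructions that paid the propagation cost

    for op, dst, *srcs in program:
        if op == "input":
            # Taint source: enter (or stay in) tracking mode.
            tainted.add(dst)
            continue
        if not tainted:
            # Fast mode: nothing is tainted, so propagation is skipped.
            continue
        # Tracking mode: dst becomes tainted iff any source is tainted.
        tracked_steps += 1
        if any(s in tainted for s in srcs):
            tainted.add(dst)
        else:
            tainted.discard(dst)
    return tainted, tracked_steps


program = [
    ("const", "r0"),            # no taint yet: fast mode
    ("mov", "r1", "r0"),        # fast mode
    ("input", "r2"),            # taint introduced
    ("add", "r3", "r2", "r1"),  # tracked: r3 becomes tainted
    ("const", "r2"),            # tracked: r2 overwritten, taint cleared
    ("const", "r3"),            # tracked: r3 cleared, no taint remains
    ("mov", "r4", "r1"),        # fast mode again
]
tainted, tracked_steps = run_elastic(program)
```

In this example only three of the seven instructions run with propagation enabled. A real system must also decide when it is safe to leave tracking mode and must handle memory, not only registers.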
Second, we present SymFit, a binary concolic execution engine that benefits from elastic symbolic tracing and efficient symbolic expression management. Evaluation on real-world benchmarks shows a 10-times speedup over existing work. With this increased efficiency, we demonstrate a unique application of concolic execution: employing symbolic expressions for crash seed deduplication.
Third, we introduce SpecTaint, a Spectre-gadget detection tool that leverages elastic taint analysis to track Spectre data-flow patterns at runtime. The evaluation results indicate that it outperforms existing methods in detection precision and recall, and that it detects Spectre gadgets in real-world applications such as Caffe and Brotli.
Lastly, we present LogicMem, a novel method that enables virtual machine introspection (VMI) and memory forensics when the kernel profile is not available.