DOI: 10.1145/3052973.3052995
Research article

Extracting Conditional Formulas for Cross-Platform Bug Search

Published: 02 April 2017

    Abstract

    With the recent increase in security breaches in embedded systems and IoT devices, it has become increasingly important to search for vulnerabilities directly in binary executables in a cross-platform setting. However, very little has been explored in this domain. Existing efforts are prone to producing many false positives, and their results do not provide explainable evidence that human analysts could use to eliminate these false positives. In this paper, we propose to extract conditional formulas as higher-level semantic features from the raw binary code to conduct the code search. A conditional formula explicitly captures two cardinal factors of a bug: 1) erroneous data dependencies and 2) missing or invalid condition checks. As a result, binary code search on conditional formulas achieves significantly higher accuracy and provides meaningful evidence for human analysts to further examine the search results. We have implemented a prototype, XMATCH, and evaluated it using well-known software, including OpenSSL and BusyBox. Experimental results show that XMATCH outperforms existing bug search techniques in terms of accuracy. Moreover, in an evaluation on 5 recent vulnerabilities, XMATCH provides clear evidence for human analysts to determine whether a matched candidate is indeed vulnerable or has been patched.
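
To make the idea of a conditional formula concrete, the following toy sketch models one as an action (a data dependency) guarded by a set of condition checks, and reports which checks a candidate function lacks relative to a reference. It is only an illustration under stated assumptions: the class `ConditionalFormula`, the helper `missing_checks`, and the string encoding of actions and conditions are all hypothetical and are not taken from XMATCH, whose actual representation and matching procedure are not described in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalFormula:
    """Toy model (hypothetical): an action guarded by a set of conditions."""
    action: str            # the data dependency, e.g. a call on tainted data
    conditions: frozenset  # the checks that must hold before the action runs


def missing_checks(reference: ConditionalFormula,
                   candidate: ConditionalFormula) -> frozenset:
    """Return conditions that guard the action in `reference` but not in `candidate`."""
    if reference.action != candidate.action:
        raise ValueError("formulas describe different actions")
    return reference.conditions - candidate.conditions


# Assumed example data: a patched function checks the length, the candidate does not.
patched = ConditionalFormula(
    action="memcpy(dst, src, len)",
    conditions=frozenset({"len <= sizeof(dst)", "src != NULL"}),
)
candidate = ConditionalFormula(
    action="memcpy(dst, src, len)",
    conditions=frozenset({"src != NULL"}),
)

print(sorted(missing_checks(patched, candidate)))
```

In this toy setting, a non-empty result is the kind of evidence an analyst could inspect: the candidate performs the same action as the reference but without one of its guarding checks.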

    Information & Contributors

    Information

    Published In

    ASIA CCS '17: Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security
    April 2017
    952 pages
    ISBN:9781450349444
    DOI:10.1145/3052973
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. binary analysis
    2. firmware security
    3. vulnerability search

    Qualifiers

    • Research-article

    Funding Sources

    • DARPA Grant
    • National Science Foundation Grant
    • Air Force Research Lab Grant

    Conference

    ASIA CCS '17

    Acceptance Rates

    ASIA CCS '17 paper acceptance rate: 67 of 359 submissions (19%)
    Overall acceptance rate: 418 of 2,322 submissions (18%)

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 39
    • Downloads (last 6 weeks): 4
    Reflects downloads up to 26 Jul 2024

    Cited By

    • (2024) A Survey of Binary Code Similarity Detection Techniques. Electronics 13:9 (1715). DOI: 10.3390/electronics13091715. Online publication date: 29-Apr-2024
    • (2024) Semantic aware-based instruction embedding for binary code similarity detection. PLOS ONE 19:6 (e0305299). DOI: 10.1371/journal.pone.0305299. Online publication date: 11-Jun-2024
    • (2024) HAformer: Semantic fusion of hex machine code and assembly code for cross-architecture binary vulnerability detection. Computers & Security (104029). DOI: 10.1016/j.cose.2024.104029. Online publication date: Jul-2024
    • (2024) Optir-SBERT: Cross-Architecture Binary Code Similarity Detection Based on Optimized LLVM IR. Digital Forensics and Cyber Crime (95-113). DOI: 10.1007/978-3-031-56583-0_7. Online publication date: 3-Apr-2024
    • (2023) IoTSim: Internet of Things-Oriented Binary Code Similarity Detection with Multiple Block Relations. Sensors 23:18 (7789). DOI: 10.3390/s23187789. Online publication date: 11-Sep-2023
    • (2023) Codeformer: A GNN-Nested Transformer Model for Binary Code Similarity Detection. Electronics 12:7 (1722). DOI: 10.3390/electronics12071722. Online publication date: 4-Apr-2023
    • (2023) BlockMatch: A Fine-Grained Binary Code Similarity Detection Approach Using Contrastive Learning for Basic Block Matching. Applied Sciences 13:23 (12751). DOI: 10.3390/app132312751. Online publication date: 28-Nov-2023
    • (2023) LibAM: An Area Matching Framework for Detecting Third-Party Libraries in Binaries. ACM Transactions on Software Engineering and Methodology 33:2 (1-35). DOI: 10.1145/3625294. Online publication date: 23-Dec-2023
    • (2023) Discovering Causes of Traffic Congestion via Deep Transfer Clustering. ACM Transactions on Intelligent Systems and Technology 14:5 (1-24). DOI: 10.1145/3604810. Online publication date: 11-Aug-2023
    • (2023) Causal Feature Selection in the Presence of Sample Selection Bias. ACM Transactions on Intelligent Systems and Technology 14:5 (1-18). DOI: 10.1145/3604809. Online publication date: 11-Aug-2023
