A Framework to Quantify the Quality of Source Code Obfuscation

Jin, Hongjoo; Lee, Jiwon; Yang, Sumin; Kim, Kijoong; Lee, Dong Hoon

doi:10.3390/app14125056

by Hongjoo Jin, Jiwon Lee, Sumin Yang, Kijoong Kim and Dong Hoon Lee *

School of Cybersecurity, Korea University, Seoul 02841, Republic of Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(12), 5056; https://doi.org/10.3390/app14125056

Submission received: 8 May 2024 / Revised: 2 June 2024 / Accepted: 8 June 2024 / Published: 10 June 2024

(This article belongs to the Special Issue Cyber Security and Software Engineering)


Abstract:

Malicious reverse engineering of software is a valuable technique for attackers seeking to infringe upon and steal intellectual property. Obfuscation techniques are useful tools for safeguarding software against such attackers: applying them to source code can hinder malicious attackers from reverse engineering a program. However, the ambiguity surrounding the protective efficacy of these source code obfuscation tools and techniques makes it difficult for users to evaluate and compare the degrees of protection provided. This paper addresses these issues and presents a methodology to quantify the effect of source code obfuscation. Our proposed method is based on three main types of data: (1) the control flow graph, (2) the program path, and (3) the performance overhead added to the process, all of which are derived from a program analysis conducted by human experts and automated tools. For the first time, we have implemented a tool that can quantitatively evaluate the quality of obfuscation techniques. To validate the effectiveness of the implemented framework, we conducted experiments using four widely recognized commercial and open-source obfuscation tools. Our experimental findings, based on quantitative values related to obfuscation techniques, demonstrate that our proposed framework effectively assesses obfuscation quality.

Keywords: source code obfuscation; obfuscation measure; quantifying obfuscation quality

1. Introduction

Programs designed to execute specific functions within a system are distributed as source code or compiled binaries, often containing sensitive data such as algorithms and encryption keys [1]. Attackers use various tools to extract these assets [2]. Obfuscation techniques protect programs by complicating reverse engineering, thus deterring attacks [3,4]. These techniques, applied at the code level, conceal important elements while maintaining functionality [5]. Compared to binary obfuscation, source code obfuscation generally has lower performance overhead and is more efficient. Various commercial [6,7] and open-source [8,9] tools, as well as researcher-developed methods [10,11,12,13,14], provide different obfuscation techniques. However, there is a lack of comprehensive analysis and proven quantitative measurement indicators for evaluating the effectiveness of these tools [15,16].

Deobfuscation, which experts such as hackers and analysts perform, is the most direct way to evaluate obfuscation techniques [15]. However, such evaluation depends on human expertise and entails significant cost. Furthermore, because the results depend on the characteristics of individual experts, they are largely qualitative and therefore difficult to apply universally. Consequently, numerous studies on obfuscation have turned to quantitative measures when evaluating their techniques. These studies primarily use metrics such as McCabe cyclomatic complexity [13,14,17,18,19], Lines of Code (LoC) [13,20,21,22], and runtime overhead [10,23,24,25,26,27,28]. Nonetheless, assessing obfuscation quality with only one or a few indicators has limitations. A program, for instance, might still be relatively easy to analyze even with a high McCabe cyclomatic complexity if the number of nodes is small and each node contains only a few instructions. Likewise, a program whose LoC count is high because of extensive dummy code or newline characters can still be easily analyzed by tools or simplified through code optimization. As highlighted in Ebad’s review paper [15], numerous indicators can help quantify obfuscation quality, and a more comprehensive evaluation can be achieved by combining and quantifying several of them. Consequently, a technique is needed that establishes a set of indicators for quantifying obfuscation quality, together with quantification experiments that demonstrate the effectiveness of the chosen indicators.

In this paper, we define and employ specific metrics to quantify obfuscation quality. Aligning with prior studies [5], we evaluate three principal categories: Potency, which reflects the difficulty of direct analysis for humans or analysts; Resilience, which indicates the challenge posed by a tool-based analysis; and Cost, which represents the performance impact of obfuscation on the target program. To assess each category, we identified challenges that program analysts or malicious attackers could face while analyzing obfuscated source code. Our approach involves defining detailed measurement indicators based on the difficulty of analyzing a program. These indicators are then used to obtain quantitative values.

We developed a framework for measuring obfuscation quality that quantifies the values of the measurement indicators we define. We employed CFG Analyzer and Binary Analyzer to measure the indicators related to Potency and Cost, respectively, and used the KLEE symbolic executor [29], the Clang static analysis tool [30], and LLVM opt to measure Resilience. We integrated the measurement into the compilation process of the LLVM compiler [31] so that the 12 indicators are measured at the intermediate stages of binary generation, including IR code, bitcode, and assembly code. Users simply input obfuscated C/C++ code and receive the compiled binary together with a report of the quantitative values.

To evaluate the proposed framework, we used both commercial and open-source obfuscation tools, including Stunnix C/C++ Obfuscator [6], Semantic Designs C-GCC4 Obfuscator [7], Tigress Obfuscator [8], and Obfuscator LLVM [9]. We created a baseline dataset from the NIST Juliet Test Suite [32] and obfuscation benchmarks [33]. We produced reports with quantitative values by applying obfuscation to this dataset and using our framework. The results varied depending on the obfuscation technique used. More potent techniques resulted in greater changes in quantitative values. Thus, our framework can help determine the relative quality of obfuscation by comparing these values against a baseline.

The contributions of this paper are as follows:

We are the first to implement a framework that can quantitatively evaluate the quality of obfuscation techniques.
We define the challenges associated with analyzing and executing obfuscated programs to quantify Potency, Resilience, and Cost and establish 12 measurement indicators based on these definitions.
Our proposed framework for quantifying obfuscation quality integrates the techniques into the LLVM compiler so that all quantification values can be measured during compilation.
We conducted extensive quantitative measurement experiments using well-known obfuscation tools and successfully demonstrated our framework’s effectiveness in this paper.

2. Background

2.1. Source Code Obfuscation

Source code is critical for delivering software functionality. Malicious reverse engineering can lead to intellectual property violations by allowing attackers to extract sensitive information from the source code. To mitigate this risk, various obfuscation techniques have been developed. Based on their application scope and methodology, these techniques are generally categorized into layout [34,35], data [36,37,38,39,40,41], and control flow obfuscation [42,43,44].

2.1.1. Layout Obfuscation

Layout obfuscation alters the physical appearance of the source code while retaining its functional behavior. This method includes techniques such as:

Scramble Identifiers [35,45]: mangles symbols like function and variable names.
Change Formatting [46]: changes the format by deleting or adding whitespace and newline characters.
Remove Comments [46]: deletes programmer comments.

These techniques make the source code challenging to read and understand, especially for reverse engineers or malicious analysts, while maintaining its original functionality to protect intellectual property.

2.1.2. Data Obfuscation

Data obfuscation conceals how data are stored and processed within the source code. Key techniques include:

Data Encoding [41,47,48,49,50]: transforms strings and values to obscure recognition.
Instruction Substitution [39,51,52]: complicates instruction calculation expressions.
Mixed Boolean Arithmetic [37,40,53,54]: uses formulas combining Boolean algebra and arithmetic operations.

These techniques hinder attackers’ understanding and modification of data processing mechanisms by reorganizing arrays and objects or employing intricate encoding schemes.
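As a minimal sketch of the Mixed Boolean Arithmetic and Instruction Substitution ideas listed above, the following Python functions rewrite an addition and a subtraction into equivalent but less readable forms. Python is used only for illustration (the tools discussed in this paper operate on C/C++), the function names are hypothetical, and the two identities are standard textbook rewrites rather than the specific expressions any particular obfuscator emits.

```python
def add_plain(x: int, y: int) -> int:
    """Readable original: a single addition."""
    return x + y


def add_mba(x: int, y: int) -> int:
    """Mixed Boolean Arithmetic form of x + y.

    XOR adds the bits without carries, AND selects the carry bits,
    and multiplying by 2 shifts the carries into place.
    """
    return (x ^ y) + 2 * (x & y)


def sub_substituted(x: int, y: int) -> int:
    """Instruction substitution for x - y using two's-complement negation."""
    return x + (~y) + 1


if __name__ == "__main__":
    for x, y in [(3, 5), (-7, 12), (0, 0), (1 << 40, 12345)]:
        assert add_mba(x, y) == add_plain(x, y)
        assert sub_substituted(x, y) == x - y
    print("rewritten expressions match the originals")
```

Both rewritten functions compute the same results as the originals, which is the property an obfuscator must preserve; only the expression an analyst has to read becomes more complicated.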

2.1.3. Control Flow Obfuscation

Control flow obfuscation complicates a program’s logical flow while preserving its functionality. Techniques include:

Bogus Control Flows [28]: inserts dummy code affecting the control flow graph.
Opaque Predicates [42]: creates conditional statements that insert garbage code.
Control Flow Flattening [55]: transforms the program structure into a single, complex switch-case statement.

These methods obscure the execution path, making it difficult for observers to trace and comprehend the program’s logic, thereby protecting intellectual property and defending against security vulnerabilities.
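The following sketch illustrates, under simplifying assumptions, what an opaque predicate and Control Flow Flattening look like. It is written in Python purely for illustration (the tools in this paper transform C/C++ or LLVM IR), and all function names are hypothetical. The opaque predicate relies on the fact that the product of two consecutive integers is always even; the flattened function replaces a structured loop with a dispatcher over a state variable.

```python
def abs_with_opaque_predicate(x: int) -> int:
    # x * (x + 1) is a product of two consecutive integers, so it is always even.
    # The condition is therefore always true, but that is not obvious at a glance.
    if (x * (x + 1)) % 2 == 0:
        return x if x >= 0 else -x
    # Never executed: bogus code that only adds nodes and edges to the CFG.
    return x * 31 + 7


def sum_to_n(n: int) -> int:
    """Original structured version: sum of 1..n."""
    total = 0
    i = 1
    while i <= n:
        total += i
        i += 1
    return total


def sum_to_n_flat(n: int) -> int:
    """Flattened version: every basic block becomes a case of one dispatcher."""
    state = 0
    total = 0
    i = 0
    while True:
        if state == 0:      # entry block
            total, i = 0, 1
            state = 1
        elif state == 1:    # loop condition
            state = 2 if i <= n else 3
        elif state == 2:    # loop body
            total += i
            i += 1
            state = 1
        else:               # exit block
            return total
```

In the flattened version, the order in which blocks run is no longer visible from the code layout; an analyst has to track the values of `state` to recover the original loop.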

Table 1 summarizes the obfuscation techniques discussed. Our experiments employed characteristic obfuscation techniques from each category to achieve diverse obfuscation quality.

2.2. Source Code Obfuscator

A source code obfuscator is a tool that transforms readable source code into a form that is difficult to comprehend and analyze while preserving its original functionality. The main objective of an obfuscator is to safeguard intellectual property and bolster security by making reverse engineering, tampering, and unauthorized analysis more difficult. It accomplishes this by applying various obfuscation techniques, including renaming variables, modifying the code structure, scrambling the control flow, and applying encryption or encoding schemes. These alterations make the code appear complex and incomprehensible to humans, but the compiled program still executes with its original functionality. Obfuscators are widely used in software development, particularly in applications where protecting sensitive information is paramount, and they are an important part of a developer’s toolbox for guarding code against piracy, unauthorized modification, and exploitation of vulnerabilities.

Stunnix C/C++ Obfuscator and Semantic Designs C-GCC4 Obfuscator enhance source code security by making C/C++ code difficult to comprehend and analyze. Stunnix obfuscates code by shuffling its structure, renaming variables, and altering data structures, thus retaining functionality while appearing unreadable. Semantic Designs retains the code’s logic while adding complexity through excessive whitespace and newlines. Both tools protect intellectual property and prevent unauthorized access by making the source code hard to decipher.

The Tigress obfuscation tool is a robust, flexible software program crafted to modify C programs. Tigress enhances security and resilience against reverse engineering. Tigress is recognized for its comprehensive range of obfuscation strategies, encompassing control flow and data obfuscation, code virtualization, and anti-debugging techniques. It can generate multiple versions of the same functionality, significantly complicating the task for attackers to grasp the program’s logic or identify vulnerabilities. A notable feature of Tigress is its ability to create virtual machines within the code. Within a virtual machine, the original program logic is transformed into bytecode for execution, adding a complexity layer. This tool proves particularly valuable in protecting software’s intellectual property, deterring tampering, and guarding against diverse forms of exploitation. Tigress is well-suited for various uses, from academic research to industrial software development, particularly in scenarios where enhancing C code security is paramount.

Obfuscator LLVM is a notable open-source obfuscation tool that integrates with the LLVM compiler framework. This tool adds a layer of security and protection to the LLVM intermediate representation. Obfuscator LLVM is distinguishable for its ability to implement advanced obfuscation techniques at the compiler level, such as Control Flow Flattening, spurious control flow, and Instruction Substitution. These techniques thwart reverse engineering efforts by rendering the source code extremely difficult to analyze and comprehend. For instance, Control Flow Flattening obscures the execution path by transforming the program structure into a single, complex switch-case statement. Instruction Substitution enhances code complexity by swapping simple instructions with more intricate, semantically equivalent alternatives. Obfuscator LLVM is well-regarded in the open-source community for its effectiveness in safeguarding intellectual property and bolstering software security against hacking and unauthorized modifications. Its integration with LLVM renders it a versatile and potent tool for developers aiming to secure a program’s C and C++ code.

In our experiments, we utilized the four above-mentioned obfuscation tools. Specifically, we configured obfuscation options to quantify the quality of various obfuscation techniques. We applied three variation options for Tigress obfuscation to observe the increase in quantification values as additional obfuscation techniques were implemented. Table 2 provides a summary of the obfuscation tools and options used in our experiments, along with the obfuscation techniques applied (as described in Table 1).

2.3. Dataset for Evaluating Source Code Obfuscation

Assessing obfuscation quality requires a dataset that covers a broad spectrum of software characteristics and complexities. Such datasets typically comprise source code in several programming languages, ranging from simple scripts to complex, multi-module applications. This diversity ensures that the effectiveness of obfuscation techniques can be tested across a wide range of coding styles, structures, and functions. The dataset should also include examples with varying levels of initial readability and clarity, as this factor can affect the extent to which code can be obfuscated. Annotated code versions that highlight key features such as control flow, data structures, and algorithmic logic may be included as well, which facilitates a before-and-after comparison of the obfuscation process. Evaluation metrics for the dataset may include readability scores, complexity measures, and resilience against reverse engineering tools and techniques. An ideal dataset for evaluating obfuscation quality should therefore be comprehensive, diverse, and rich in metadata to enable a thorough analysis of obfuscation effectiveness.

The NIST Juliet Test Suite is an extensive compilation of test cases created by the National Institute of Standards and Technology (NIST) to assess how well different software tools detect vulnerabilities. This dataset has been meticulously curated to encompass a broad spectrum of typical coding errors that are known to result in potential security vulnerabilities; this is in line with the classifications established by the Common Weakness Enumeration (CWE) system. The NIST Juliet Test Suite (version 1.3) comprises thousands of test cases in various programming languages, including C, C++, Java, and C#.

Each test case in the Juliet Test Suite is carefully crafted to simulate a distinct vulnerability type, encompassing scenarios like buffer overflows, SQL injections, cross-site scripting, and inadequate input validation. These test cases are accompanied by comprehensive annotations and are systematically structured so that automated tools can process and analyze them efficiently. The suite is primarily used to assess the precision and efficacy of static analysis tools in identifying a wide range of software vulnerabilities.

The Obfuscation-Benchmarks dataset is a specialized compilation of source code samples and test cases explicitly crafted to assess and closely examine the efficacy of code obfuscation techniques. This dataset is well-suited for assessing how various obfuscation methods can safeguard software against reverse engineering, unauthorized alterations, and comparable security risks. It also plays a crucial role in advancing the field of software obfuscation by furnishing a structured and varied array of examples and evaluation criteria. This dataset supports the development of more secure and resilient methods for safeguarding software.

For our experiments, we utilized the Juliet Test Suite and the Obfuscation-Benchmarks dataset to confirm the quality of our experimental code. The 130 selected C/C++ source codes are standardized to have varying code characteristics but similar sizes. We established the selected dataset as a baseline and created obfuscated datasets by applying each obfuscation technique.

3. Threat Model

This section outlines our threat model, which addresses the challenges posed by sophisticated attackers with advanced knowledge of software engineering and reverse engineering tools. These attackers aim to decipher and exploit obfuscated code using static and dynamic analysis techniques and manual inspection. Our primary concern is with attackers who can perform static analysis using tools like disassemblers and decompilers, dynamic analysis through debugging and execution tracing, and symbolic execution to explore program paths and systematically identify hidden values and vulnerabilities. Additionally, these attackers can recognize common obfuscation patterns and apply deobfuscation techniques to reverse-engineer the obfuscated code. We consider several threat scenarios where an attacker might attempt to deobfuscate the code, including static analysis attacks to reverse-engineer code without execution, dynamic analysis attacks to understand runtime behavior, symbolic execution attacks to explore program paths, and pattern recognition attacks using machine learning techniques to identify and reverse common obfuscation patterns. In each scenario, the obfuscation techniques are evaluated based on their ability to increase the difficulty of analysis and reduce the feasibility of successful reverse engineering. The goal is to ensure that even if parts of the code are deobfuscated, the overall program logic and sensitive information remain secure and incomprehensible. By detailing these threat scenarios, we provide a comprehensive framework for evaluating the quality of source code obfuscation. This ensures that our approach to quantifying obfuscation quality is robust and addresses the real-world challenges posed by advanced attackers.

4. Approach

We investigated the challenges associated with analyzing obfuscated source code and identified suitable measurement indicators for this analysis. A program consists of functions, and control flow transfers among these functions occur via calls and returns. A function is composed of basic blocks, which are atomic control flow units. A basic block is a straight-line instruction sequence with no incoming branches (other than at the entry) and no outgoing branches (other than at the exit); the instructions themselves are the most fundamental units of execution. When analyzing a program, it is crucial to understand where each basic block, the smallest unit of control flow, branches. As illustrated in Figure 1, a program can be depicted as a control flow graph in which basic blocks are nodes and control flow transfers are edges. Control flow graphs are useful when analysts need to analyze source code to recover a program’s algorithm, because they provide a comprehensive view of all the basic blocks (nodes) and control flow transfers (edges) reachable within a program.

A program that takes input values follows different branches depending on those values. The sequence in which the program branches and executes for a given input is called a path. A path represents the execution order of nodes and edges in the control flow graph. In Figure 1, on the path originating from the %0 label and branching to the %7 label, the edges connecting the %8, %11, and %15 labels create a cycle. Depending on the value of the conditional expression in the branch statement, execution may traverse this cycle many times before branching to the %18 label. Furthermore, whereas the control flow graph contains every possible path, a single execution follows only the one path determined by its input, so different inputs produce numerous distinct paths. Consequently, analysts often invest considerable time and effort in tracing program paths and uncovering hidden values along a specific path, and they therefore often employ tools such as symbolic execution engines to analyze programs more efficiently.
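The distinction between a control flow graph and a path can be made concrete with a small sketch. The function below is a hypothetical example (it is not the program in Figure 1, and the block labels are invented): it has a fixed set of four basic blocks, yet the sequence of blocks executed depends on the input.

```python
from typing import List


def trace_path(n: int) -> List[str]:
    """Return the sequence of basic-block labels executed for input n."""
    path = ["entry"]
    i = 0
    while True:
        path.append("loop_cond")   # the loop condition is evaluated
        if i >= n:
            break
        path.append("loop_body")   # one more trip around the cycle
        i += 1
    path.append("exit")
    return path


if __name__ == "__main__":
    for n in (0, 1, 3):
        print(n, trace_path(n))
```

The graph itself never changes (four nodes, with a cycle between `loop_cond` and `loop_body`), but every distinct non-negative input produces a different path, which is why tracing paths by hand quickly becomes expensive.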

Drawing on definitions and terminology from prior research, we have identified three categories to measure the complexity involved in analyzing and deobfuscating programs that have applied obfuscation techniques: Potency, Resilience, and Cost. Additionally, we developed a method to assess values in each of these categories quantitatively: Potency reflects the extent to which human analysis has become more difficult; Resilience denotes the challenge posed to analysis tools; and Cost represents the impact on program performance caused by the obfuscation process. Specific measurement indicators quantify each category, derived from insights gained while analyzing and deobfuscating the program. Table 3 presents our established indicators and explains each.

4.1. Potency

To measure Potency, we focus on the factors that contribute to the complexity of the control flow graph. The control flow graph consists of nodes and edges, and each node is made up of instructions. Consequently, the complexity of the control flow graph increases with the number of nodes and edges, while the number of instructions in each node and the program length of the source code influence its size. The control flow graph depth is also considered, as it indicates the maximum number of nested (overlapping) branches. To measure Potency, our framework therefore considers the number of edges relative to the number of nodes, the number of nodes, the control flow depth, the number of instructions, and the program length.
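As a rough illustration of the graph-based Potency indicators, the sketch below counts nodes and edges of a control flow graph and derives McCabe cyclomatic complexity with the standard formula M = E - N + 2P, taking P = 1 for a single connected function. The CFG, the dictionary representation, and the function name are assumptions made for this example; they are not CFG Analyzer's actual data structures, and control flow depth is left out because its exact definition is given in Table 3 rather than here.

```python
from typing import Dict, List

# Hypothetical CFG of a function with one loop (not taken from the paper).
# Keys are basic blocks; values are their successor blocks.
CFG: Dict[str, List[str]] = {
    "entry": ["loop_cond"],
    "loop_cond": ["loop_body", "exit"],
    "loop_body": ["loop_cond"],
    "exit": [],
}


def cfg_metrics(cfg: Dict[str, List[str]]) -> Dict[str, int]:
    """Count nodes and edges and compute cyclomatic complexity (E - N + 2)."""
    nodes = len(cfg)
    edges = sum(len(successors) for successors in cfg.values())
    return {
        "nodes": nodes,
        "edges": edges,
        "cyclomatic_complexity": edges - nodes + 2,  # assumes one connected component
    }


if __name__ == "__main__":
    print(cfg_metrics(CFG))
```

For this graph there are four nodes and four edges, giving a cyclomatic complexity of 2, which matches the single decision point in the loop condition.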

4.2. Resilience

To quantify Resilience, we determine the degree to which obfuscation interferes with analysis by automated tools. Studies on deobfuscation primarily employ symbolic execution, with which attackers can effectively analyze program paths and thereby uncover the program algorithm, hidden values, and other aspects [56,57]. Furthermore, code optimization can effectively simplify code that has become more complex due to obfuscation. We therefore evaluate the difficulty that symbolic execution, static analysis tools built on symbolic execution engines, and code optimizers each encounter when analyzing and optimizing obfuscated code. In this paper, we use the time required for symbolic execution, the time needed for static analysis, the analysis coverage of the symbolic execution tool, and the optimization ratio of the code optimizer as measurement indicators to quantify Resilience.
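One reason symbolic execution time is a plausible Resilience indicator is that the number of paths a path-exploring tool may have to consider can grow exponentially with the number of branches. The sketch below demonstrates this on a hypothetical acyclic CFG made of k two-way branches in sequence. The graph builder and function names are invented for this example, and the count is only an upper bound on the work, since a real tool such as KLEE prunes infeasible paths with a constraint solver.

```python
from functools import lru_cache
from typing import Dict, List


def diamond_chain(k: int) -> Dict[str, List[str]]:
    """Build a hypothetical acyclic CFG of k two-way branches in sequence (k >= 1)."""
    cfg: Dict[str, List[str]] = {}
    for i in range(k):
        join = f"branch{i + 1}" if i + 1 < k else "exit"
        cfg[f"branch{i}"] = [f"then{i}", f"else{i}"]
        cfg[f"then{i}"] = [join]
        cfg[f"else{i}"] = [join]
    cfg["exit"] = []
    return cfg


def count_paths(cfg: Dict[str, List[str]], entry: str, exit_: str = "exit") -> int:
    """Count distinct entry-to-exit paths in an acyclic CFG."""

    @lru_cache(maxsize=None)
    def paths_from(node: str) -> int:
        if node == exit_:
            return 1
        return sum(paths_from(successor) for successor in cfg[node])

    return paths_from(entry)


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, count_paths(diamond_chain(k), "branch0"))
```

Each additional branch doubles the number of paths (2, 4, 32, and 1024 for the values of k above), so obfuscations that insert branches can increase the exploration effort far faster than they increase code size.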

4.3. Cost

To measure Cost, we assess the effect of obfuscation on program performance. The more complex the source code obfuscation, the more it can impair the program’s performance. Specifically, obfuscation techniques introduce runtime overhead, memory overhead, and increased file size. Consequently, we measure performance by compiling the obfuscated source code into binary files. To quantify Cost, we use program execution time, process memory usage, and program size as measurement indicators.
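The three Cost indicators can be approximated with standard operating-system facilities. The sketch below is not the paper's Binary Analyzer; it is a minimal stand-in that assumes a Unix-like system (the `resource` module is not available on Windows), and the function name and dictionary keys are hypothetical.

```python
import os
import resource  # Unix-only standard library module
import subprocess
import sys
import time
from typing import Dict, List


def measure_cost(argv: List[str]) -> Dict[str, float]:
    """Run a program once and report execution time, peak memory, and file size.

    argv[0] must be a path to the executable so that its size can be read.
    """
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall_time_s": elapsed,
        # Peak resident set size of the largest child waited for so far
        # (kilobytes on Linux, bytes on macOS).
        "max_rss": float(usage.ru_maxrss),
        "file_size_bytes": float(os.path.getsize(argv[0])),
    }


if __name__ == "__main__":
    # Demonstration on the Python interpreter itself; in practice argv would
    # point at the baseline and the obfuscated binaries.
    print(measure_cost([sys.executable, "-c", "pass"]))
```

A single run is noisy, so a real measurement would repeat the execution several times and compare the baseline and obfuscated binaries under the same conditions.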

To assess the quality of obfuscation, we have established a total of 12 measurement indicators across the three categories of Potency, Resilience, and Cost, and we use them to show the differences between the obfuscated and original code. Furthermore, we have developed an obfuscation quality quantification framework that accepts source code as input, measures the 12 indicators during compilation, and produces quantified results as output.

5. Framework

This paper introduces a framework that measures obfuscation quality and generates quantitative values as output. Our framework integrates the measurement code into the LLVM compiler. The quantification framework accepts a C/C++ program for evaluation as input and intervenes in compiling it into binary to measure the quantitative values of 12 indicators. As .c code is compiled into the binary, .ll code and .bc files are produced, and the proposed framework assesses the values of the measurable indicators at each stage. The values measured for the 12 indicators are stored in a .csv file format. Users of the quantification framework can easily obtain the compiled binaries and results of the obfuscation quality measurements by simply inputting the source code. Figure 2 presents an overview of the obfuscation quality quantification framework.
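The report format can be pictured as one CSV row per program with one column per indicator. The sketch below shows such a writer; the column names are hypothetical labels derived from the indicator names used in this paper, not the framework's actual .csv header, and the values in the usage example are placeholders rather than measured results.

```python
import csv
import io
from typing import Dict, TextIO

# Hypothetical column names for the 12 indicators (5 Potency, 4 Resilience, 3 Cost).
INDICATORS = [
    "cyclomatic_complexity", "cfg_size", "cfg_depth",
    "program_length", "instruction_count",
    "symbolic_execution_time", "code_coverage",
    "static_analysis_time", "code_optimization",
    "time_overhead", "space_overhead", "file_size",
]


def write_report(stream: TextIO, program: str, values: Dict[str, float]) -> None:
    """Write a header and one row of indicator values for a program."""
    missing = [name for name in INDICATORS if name not in values]
    if missing:
        raise ValueError(f"missing indicators: {missing}")
    writer = csv.DictWriter(stream, fieldnames=["program"] + INDICATORS)
    writer.writeheader()
    writer.writerow({"program": program, **{k: values[k] for k in INDICATORS}})


if __name__ == "__main__":
    buffer = io.StringIO()
    # Placeholder values only; real values come from the measurement tools.
    write_report(buffer, "example.c", {name: 0 for name in INDICATORS})
    print(buffer.getvalue())
```

Rejecting a row with missing indicators keeps baseline and obfuscated reports comparable column by column.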

We additionally developed CFG Analyzer, a tool that generates a control flow graph for a program and analyzes it to measure Potency. CFG Analyzer uses LLVM opt to produce control flow graphs and outputs them as .dot files. Each generated .dot file is analyzed to measure the nodes, edges, and depth of the control flow graph. Program length and instruction count, the remaining Potency indicators, are computed during compilation. To quantify Resilience, our framework uses the KLEE symbolic execution tool, the Clang static analyzer, and the LLVM opt optimizer. With KLEE, the framework measures the analysis time and code coverage of symbolic execution. It then runs the Clang static analyzer over the entire source code and records the analysis time, and it calculates the percentage of instructions optimized by LLVM opt for the program. Lastly, the binary created during compilation is analyzed with our Binary Analyzer to assess Cost. We developed and incorporated Binary Analyzer to measure a program’s execution time, process memory usage, and file size.
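The .dot analysis step can be sketched as a small text parser. The example below is an assumption-laden illustration and not CFG Analyzer itself: it assumes node identifiers of the form `Node0x...` and edges written as `A -> B` with an optional port suffix such as `:s0`, which resembles the CFG output of LLVM opt, and it uses a hand-written sample instead of a real generated file.

```python
import re
from typing import Tuple

# Node definitions look like:  Node0x1 [shape=record,label="..."];
NODE_RE = re.compile(r"^\s*(Node0x[0-9a-fA-F]+)\s*\[", re.MULTILINE)
# Edges look like:  Node0x1 -> Node0x2;   or   Node0x2:s0 -> Node0x3;
EDGE_RE = re.compile(r"(Node0x[0-9a-fA-F]+)(?::\w+)?\s*->\s*(Node0x[0-9a-fA-F]+)")

# Hand-written sample in the assumed format (not produced by a real tool run).
SAMPLE_DOT = """
digraph "CFG for 'f' function" {
    Node0x1 [shape=record,label="{entry}"];
    Node0x1 -> Node0x2;
    Node0x2 [shape=record,label="{cond|{<s0>T|<s1>F}}"];
    Node0x2:s0 -> Node0x3;
    Node0x2:s1 -> Node0x4;
    Node0x3 [shape=record,label="{body}"];
    Node0x3 -> Node0x2;
    Node0x4 [shape=record,label="{exit}"];
}
"""


def count_nodes_and_edges(dot_text: str) -> Tuple[int, int]:
    """Count distinct nodes and distinct (source, target) edges in DOT text."""
    nodes = set(NODE_RE.findall(dot_text))
    edges = set(EDGE_RE.findall(dot_text))
    return len(nodes), len(edges)


if __name__ == "__main__":
    print(count_nodes_and_edges(SAMPLE_DOT))
```

Because edges are stored as a set of (source, target) pairs, parallel edges between the same two blocks are counted once; a tool that needs to count them separately would keep a list instead.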

6. Experiment

We used well-known obfuscation tools to assess the effectiveness of our framework in measuring obfuscation quality. Our approach compares the original, unobfuscated source code with the source code to which obfuscation has been applied. We implemented and evaluated the framework on an Intel i9-13900K CPU (Intel, Santa Clara, CA, USA) @ 5.80 GHz with 32 cores and 64 GB of RAM, using the LLVM version 13.0.0 compiler on Ubuntu 22.04 LTS with kernel version 5.15.0. The detailed experimental process is outlined in Section 6.1.

6.1. Experimental Process

To provide a comprehensive understanding of our experimental process, this section outlines the detailed phases involved in the experiments to evaluate the quality of source code obfuscation.

6.1.1. Experimental Design

The experimental design involved the following steps:

Selection of Dataset: we used 130 C/C++ source codes from the NIST Juliet Test Suite and the Obfuscation-Benchmarks dataset. These datasets were selected to ensure a diverse range of code characteristics.
Configuration of Obfuscation Tools: four obfuscation tools were used:
- Stunnix C/C++ Obfuscator
- Semantic Designs C-GCC4 Obfuscator
- Tigress Obfuscator (with three levels of obfuscation: Level 1, Level 2, Level 3)
- Obfuscator LLVM
Each tool was configured with specific options to apply distinct obfuscation techniques.
Application of Obfuscation Techniques: the selected datasets were obfuscated using the configured tools. Each obfuscated code was then used to generate the required binaries for analysis.
Measurement of Metrics: the following metrics were measured:
- Potency: McCabe cyclomatic complexity, control flow graph size, control flow depth, program length, and instruction count.
- Resilience: symbolic execution time, code coverage, static analysis time, and code optimization.
- Cost: time overhead, space overhead, and file size.
Analysis and Comparison: the measured metrics from the obfuscated codes were compared against the baseline (non-obfuscated) codes to evaluate the impact of each obfuscation technique.

6.1.2. Detailed Protocol

Experimental Environment:
- Hardware: Intel i9-13900K @ 5.80 GHz CPU, 64 GB RAM, 32 cores
- Software: LLVM version 13.0.0 compiler, Ubuntu 22.04 LTS, Kernel version 5.15.0
Obfuscation Tool Configuration:
- Stunnix C/C++ Obfuscator: applied formatting changes and comment removal.
- Semantic Designs C-GCC4 Obfuscator: applied identifier scrambling, formatting changes, comment removal, and data encoding.
- Tigress Obfuscator:
  - Level 1: Mixed Boolean Arithmetic
  - Level 2: Mixed Boolean Arithmetic + Opaque Predicates
  - Level 3: Mixed Boolean Arithmetic + Opaque Predicates + Control Flow Flattening
- Obfuscator LLVM: applied control flow flattening, instruction substitution, and bogus control flows.
Data Collection:
- Potency Metrics: calculated using CFG Analyzer integrated with LLVM.
- Resilience Metrics: measured using the KLEE symbolic execution tool, Clang static analyzer, and LLVM opt optimizer.
- Cost Metrics: assessed using Binary Analyzer developed to measure runtime overhead, memory usage, and binary file size.
Analysis:
- The results were analyzed to identify the impact of each obfuscation technique on the selected metrics. Comparative analysis was performed to evaluate the effectiveness of different techniques.
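The comparison against the baseline described in the protocol above can be sketched as a relative-change computation per indicator. The function name, the indicator keys, and the numbers in the usage example are hypothetical placeholders chosen for illustration; they are not results from the experiments.

```python
from typing import Dict, Optional


def relative_change(baseline: Dict[str, float],
                    obfuscated: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Return (obfuscated - baseline) / baseline for every baseline indicator.

    The change is reported as None when the baseline value is zero,
    because the ratio is undefined in that case.
    """
    changes: Dict[str, Optional[float]] = {}
    for name, base in baseline.items():
        if base == 0:
            changes[name] = None
        else:
            changes[name] = (obfuscated[name] - base) / base
    return changes


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    base = {"cfg_size": 10, "instruction_count": 200, "file_size": 16000}
    obf = {"cfg_size": 25, "instruction_count": 500, "file_size": 16000}
    for name, change in relative_change(base, obf).items():
        print(f"{name}: {change:+.0%}")
```

A value of +150% means the indicator is 2.5 times its baseline value, while 0% means the obfuscation left that indicator unchanged.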

Figure 3 illustrates the experimental process for quantifying results using our framework. In the experiment, we established a baseline of 130 codes drawn from the NIST Juliet Test Suite v1.3 and Banescusebi’s obfuscation benchmark. We then generated obfuscated test code sets by applying each obfuscation tool to this baseline code set. We applied the most robust data obfuscation options of Stunnix C/C++ Obfuscator v4.9 to the baseline code set and performed layout and data obfuscation using the +PrintAsls+Obfuscate+ObfuscateLiterals options of the Semantic Designs C-GCC4 Obfuscator.

Unlike the commercial obfuscation tools above, Obfuscator LLVM v4.0.1 applies potent control flow obfuscation to LLVM intermediate representations (.ll files). We applied the Control Flow Flattening, Instruction Substitution, and Bogus Control Flows options of Obfuscator LLVM to the baseline code set. Furthermore, we decompiled the obfuscated .ll files to create a dataset comprising .c files.

Tigress Obfuscator v3.1 offers a total of 32 obfuscation options. Of these, we selected three that we considered to be the most prevalent and effective for obfuscation research. We created three variations of the test code set to examine how the obfuscation quality quantification changes as additional obfuscation options are applied. The first Tigress data set used only the Enc.Arithmetic option, a Mixed Boolean Arithmetic obfuscation technique renowned for its effectiveness among data obfuscation methods (Level 1). The second Tigress data set added the Add Opaque option to Enc.Arithmetic, thereby also applying the Opaque Predicates obfuscation technique (Level 2). The third Tigress data set added the Flatten option to the Level 2 configuration, thereby also applying the Control Flow Flattening obfuscation technique (Level 3).
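To make the three levels concrete, the following is a minimal hand-written sketch of what each technique does to a small summation routine. It is not output from Tigress, and all function names are hypothetical; Python integers are masked to 32 bits to mimic unsigned C arithmetic.

```python
MASK = 0xFFFFFFFF  # emulate 32-bit unsigned C arithmetic


def add_plain(x: int, y: int) -> int:
    """Baseline: ordinary 32-bit addition."""
    return (x + y) & MASK


def add_mba(x: int, y: int) -> int:
    """Level 1 style: x + y rewritten with the identity (x ^ y) + 2 * (x & y)."""
    return ((x ^ y) + 2 * (x & y)) & MASK


def add_mba_opaque(x: int, y: int) -> int:
    """Level 2 style: the MBA expression guarded by an opaquely true predicate."""
    # x * (x + 1) is a product of consecutive integers, so it is always even.
    if (x * (x + 1)) % 2 == 0:
        return add_mba(x, y)
    return (x - y) & MASK  # dead branch, never executed


def sum_to_plain(n: int) -> int:
    """Baseline loop: 0 + 1 + ... + (n - 1)."""
    total = 0
    for i in range(n):
        total = add_plain(total, i)
    return total


def sum_to_flattened(n: int) -> int:
    """Level 3 style: the same loop flattened into a single dispatcher."""
    total, i, state = 0, 0, 0
    while state != 3:            # state 3 is the exit block
        if state == 0:           # loop header
            state = 1 if i < n else 3
        elif state == 1:         # loop body
            total = add_mba_opaque(total, i)
            state = 2
        elif state == 2:         # increment
            i += 1
            state = 0
    return total
```

All variants compute the same result; only the shape of the expressions and of the control flow changes, which is what the Potency indicators are intended to capture.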

To assess the obfuscation quality of each tool, we evaluated six obfuscation datasets that our framework generated. The quality of obfuscation can be estimated from the magnitude of the increase or decrease in quantitative values compared to the baseline. We anticipated that the quantitative values for layout obfuscation and data obfuscation, which are typically regarded as less robust, would not deviate significantly from the baseline. Furthermore, we expected control flow obfuscation, deemed more potent, to significantly influence all 12 measurement indicators. We present the results of the 12 indicator measurements for each data set, organized into the categories of Potency, Resilience, and Cost. We display the results for Stunnix C/C++ Obfuscator and Semantic Designs C-GCC4 Obfuscator separately to minimize scale discrepancies in the graphs.
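One simple way to express the deviation from the baseline is a per-indicator percentage change. The sketch below is only an illustration of that idea: the function name, the indicator keys, and the numbers are hypothetical and are not taken from Tables 4 to 6, which report the measured values directly.

```python
def relative_change(baseline: dict, obfuscated: dict) -> dict:
    """Return the percentage change of each indicator against the baseline."""
    changes = {}
    for name, base in baseline.items():
        if base == 0:
            changes[name] = None  # percentage change is undefined for a zero baseline
            continue
        changes[name] = 100.0 * (obfuscated[name] - base) / base
    return changes


# Hypothetical values for illustration only.
baseline = {"cyclomatic_complexity": 4, "cfg_size": 10, "instruction_count": 200}
obfuscated = {"cyclomatic_complexity": 6, "cfg_size": 25, "instruction_count": 500}
changes = relative_change(baseline, obfuscated)
```

A value near zero for every indicator would match the expectation for layout obfuscation, while large positive values across indicators would match the expectation for control flow obfuscation.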

6.2. Potency Measurement Results

Figure 4 displays the outcomes of the five indicators used to quantify Potency. Count is the unit of all measurements in Figure 4. Stunnix C/C++ Obfuscator and Semantic Designs C-GCC4 Obfuscator showed no impact on the McCabe cyclomatic complexity, control flow graph size, control flow depth, and instruction count. It is notable, however, that the values of program length were affected. This occurred because Stunnix C/C++ Obfuscator removes newlines and white spaces during format alteration, while Semantic Designs C-GCC4 Obfuscator adds them. Conversely, Tigress Obfuscator and Obfuscator LLVM showed an increase in all metrics compared to the baseline. Specifically, the values increased with each option added in Tigress Obfuscator. Obfuscator LLVM showed a significant increase in program length and instruction count due to the addition of numerous bogus control flows. Table 4 details the quantification results for obfuscation Potency.
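The graph-based Potency indicators can be illustrated on a toy control flow graph. The sketch below represents a CFG as an adjacency dictionary and computes the McCabe cyclomatic complexity as edges minus nodes plus 2, the graph size as the node count, and a depth value. The depth is computed here as the longest shortest-path distance from the entry node, which is an assumed reading of the indicator and not necessarily how the CFG Analyzer computes it; both example graphs are hypothetical.

```python
from collections import deque


def cyclomatic_complexity(cfg: dict) -> int:
    """McCabe complexity: number of edges - number of nodes + 2."""
    nodes = len(cfg)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - nodes + 2


def cfg_size(cfg: dict) -> int:
    """Control flow graph size: number of nodes."""
    return len(cfg)


def control_flow_depth(cfg: dict, entry: str) -> int:
    """Assumed definition: longest shortest-path distance (in edges) from entry."""
    distance = {entry: 0}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for successor in cfg[node]:
            if successor not in distance:
                distance[successor] = distance[node] + 1
                queue.append(successor)
    return max(distance.values())


# A plain if/else, and the same code behind an opaque predicate with a junk block.
plain = {"cond": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
bogus = {
    "opaque": ["cond", "junk"],
    "junk": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}
```

On these toy graphs, inserting one opaque predicate and one junk block raises the complexity from 2 to 3, the size from 4 to 6, and the depth from 2 to 3, which is the same direction of change reported for the control flow obfuscators.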

6.3. Resilience Measurement Results

Figure 5 displays the outcomes of the four indicators used to assess Resilience. In Figure 5, symbolic execution time and analysis time are measured in milliseconds, while coverage and optimization are expressed as percentages. Neither Stunnix C/C++ Obfuscator nor Semantic Designs C-GCC4 Obfuscator influenced coverage or optimization values in our experiments. They did, however, affect symbolic execution time and analysis time, indicating that alterations to data or code structure can change the time the tools need for analysis. Table 5 details the quantification results for obfuscation Resilience.

Tigress Obfuscator and Obfuscator LLVM both demonstrated increased symbolic execution time and analysis time. Notably, symbolic execution time rose markedly as options were added in Tigress Obfuscator, while Obfuscator LLVM led to a significant increase in analysis time. This difference emerges because the Bogus Control Flows technique does not substantially impede symbolic execution but considerably increases the number of instructions that must be analyzed statically. Furthermore, because Bogus Control Flows introduce a substantial amount of redundant code that is never executed, the optimization value for Obfuscator LLVM decreased notably. Regarding coverage, the value decreased for Obfuscator LLVM, whereas it increased for Tigress Obfuscator as we added obfuscation options. This occurs because, although the total number of instructions increased with each added obfuscation option, the analysis still covered the increased volume of instructions.
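The two percentage indicators can be written as simple ratios. The sketch below assumes that coverage is the share of instructions the symbolic execution tool analyzed, and that the optimization value is the share of instructions the optimizer removed; the second reading, the function names, and the numbers are assumptions for illustration, not measurements from Table 5.

```python
def code_coverage(analyzed_instructions: int, total_instructions: int) -> float:
    """Percentage of instructions analyzed by the symbolic execution tool."""
    if total_instructions <= 0:
        raise ValueError("total_instructions must be positive")
    return 100.0 * analyzed_instructions / total_instructions


def optimization_percentage(before: int, after: int) -> float:
    """Assumed reading: percentage of instructions removed by the optimizer."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Hypothetical counts: the optimizer removes a similar absolute number of
# instructions, but the obfuscated program is three times larger.
baseline_opt = optimization_percentage(before=200, after=160)
obfuscated_opt = optimization_percentage(before=600, after=555)
```

Under these assumed numbers the optimization value drops from 20.0 to 7.5 percent, which is one way a large amount of code the optimizer cannot remove would lower the indicator.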

6.4. Cost Measurement Results

Figure 6 displays the results of the three indicators used to measure Cost. In Figure 6, the time overhead is measured in milliseconds, while space overhead and file size are measured in kilobytes. In our experiments, the Stunnix C/C++ Obfuscator and the Semantic Designs C-GCC4 Obfuscator had minimal impact on all Cost measurement indicators. Conversely, Tigress Obfuscator and Obfuscator LLVM substantially affected space overhead and file size. Specifically, the values increased gradually as Tigress Obfuscator options were added. However, none of the obfuscation tools significantly affected time overhead. We attribute this to the experiment being conducted in a multi-core environment with a high-performance CPU, the test code being less than 200 lines long, and the applied obfuscation techniques not generating sufficient load to affect program performance markedly. Table 6 details the quantification results for obfuscation Cost.
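A rough sketch of how two of the Cost indicators could be collected for a compiled binary is shown below. It is not the Binary Analyzer described in this paper, whose internals are not given here; the function name and parameters are hypothetical, and memory usage (space overhead) is deliberately left out.

```python
import os
import subprocess
import sys
import time


def measure_cost(binary_path: str, args=(), runs: int = 5) -> dict:
    """Return mean wall-clock run time (ms) and file size (KB) of a binary."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            [binary_path, *args],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        elapsed.append(time.perf_counter() - start)
    return {
        "time_ms": 1000.0 * sum(elapsed) / len(elapsed),
        "file_size_kb": os.path.getsize(binary_path) / 1024.0,
    }


if __name__ == "__main__":
    # Stand-in target: the Python interpreter itself running an empty program.
    print(measure_cost(sys.executable, ["-c", "pass"], runs=3))
```

For short test programs, process start-up dominates the wall-clock time, which is consistent with the observation that small obfuscated programs show little difference in time overhead.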

The experiments confirmed that control flow obfuscation techniques are more effective than layout obfuscation and data obfuscation, which is consistent with our expectations. Furthermore, we verified that different obfuscation techniques impact various measurement indicators, underscoring the necessity of using a range of methods, as we do in our approach, to evaluate obfuscation quality effectively.

7. Discussion

In our study, we endeavor to establish a rigorous framework for the quantitative assessment of obfuscation techniques by leveraging specific metrics derived from static and dynamic analysis. This approach addresses a crucial gap in current research: the quantification of obfuscation quality, which traditionally leans towards qualitative assessments due to the intricate nature of software comprehension and reverse engineering practices. However, several challenges and issues need to be addressed.

Challenges of Software Metrics in Evaluating Potency. Our research methodically measures selected software metrics to approximate the Potency of obfuscation techniques. Although these metrics intuitively reflect the structural and operational complexities introduced by obfuscation, their correlation with human comprehension remains inadequately explored. The primary aim of obfuscation is to thwart reverse engineering by complicating the comprehension process, making the software difficult to decode and analyze. Metrics such as McCabe’s cyclomatic complexity, code density, and control flow alteration provide a surface-level estimation of these complexities but do not directly translate into the cognitive challenges faced by human analysts. As such, while these metrics are valuable, they capture only a facet of the obfuscation’s effectiveness, neglecting the nuanced and often subjective experience of comprehending obfuscated code.

Limitations of Control Flow Graph (CFG) Analysis. A significant technical challenge in this research concerns measuring the Control Flow Graph (CFG). CFGs are pivotal in understanding program structure and flow, particularly under static analysis. However, certain obfuscation techniques specifically target and disrupt the legibility and reconstructability of CFGs. For instance, techniques involving runtime pointer dereferences effectively obscure the true flow of control, thereby eluding traditional CFG reconstruction methods. Our framework currently does not encompass strategies to counteract or measure the impact of such techniques, which limits its ability to fully assess Resilience against advanced obfuscation methods.
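The effect of indirect control transfers on static reconstruction can be shown with a loose Python analogy. The sketch below builds a naive static call graph with the standard `ast` module and records only calls made through a plain name; a call dispatched through a table indexed at run time, the analogue of a function pointer dereference in C, leaves no edge. The sample source and the function name `static_call_edges` are hypothetical, and this is a call graph rather than a full CFG.

```python
import ast

SOURCE = '''
def step_a(x):
    return x + 1

def step_b(x):
    return x * 2

def direct(x):
    return step_b(step_a(x))

def indirect(x, key):
    table = [step_a, step_b]
    return table[key % 2](x)
'''


def static_call_edges(source: str) -> set:
    """Collect (caller, callee) edges for calls made through a plain name."""
    edges = set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # table[key % 2](x) has a Subscript as its callee, so it is skipped.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return edges


edges = static_call_edges(SOURCE)
```

Here `direct` yields two edges, while `indirect` yields none even though it reaches the same two functions, so any metric computed on the recovered graph would understate the real control flow.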

Implications for Future Research. These insights underscore the need to develop more sophisticated methods for analyzing obfuscated code, particularly in enhancing our understanding of how such techniques affect human comprehension. Future research could focus on developing empirical studies that measure the time and effort required by professional reverse engineers to decipher obfuscated code. Additionally, improving CFG analysis in the presence of obfuscation techniques that alter runtime behavior could provide a more comprehensive understanding of an obfuscation’s true Potency.

Addressing Framework Shortcomings. Future iterations of our framework must incorporate methods to address these identified shortcomings. This might include integrating cognitive psychology insights to link software metrics with human comprehension difficulty or developing advanced tools capable of dynamic CFG reconstruction despite obfuscation techniques aimed at runtime processes.

In conclusion, while our framework sets a foundational stage for the quantitative assessment of software obfuscation, it also highlights critical areas for enhancement and further research. Bridging these gaps will improve the accuracy of our evaluations and contribute to the broader field of cybersecurity by developing tools and methodologies that keep pace with evolving obfuscation technologies.

8. Related Works

In software obfuscation, the McCabe metric [58], known for measuring the complexity of a program, has been widely used to assess the potency of various obfuscation techniques. In related works, researchers have leveraged this metric to provide insight into how obfuscation impacts the comprehensibility and complexity of code.

Collberg et al. [5] employed McCabe’s cyclomatic complexity metric to assess the quality of obfuscated code. By utilizing McCabe’s cyclomatic complexity metric, they suggested a methodology for gauging Potency, defined as the extent to which code is challenging for humans to comprehend. Moreover, the authors identified four key categories to assess obfuscation quality: Potency, Resilience, Cost, and Stealth, and they provided detailed descriptions for each. Ceccato et al. [59] explored how different obfuscation methods affect the cyclomatic complexity of code, as measured by the McCabe metric. The authors applied various obfuscation techniques to a set of software samples, analyzed the resulting increase in complexity, and offered a quantitative assessment of obfuscation effectiveness. Viticchié et al. [60] focused on the correlation between obfuscation and code metrics, including McCabe’s cyclomatic complexity. The authors demonstrated that higher post-obfuscation McCabe scores indicate greater difficulty in understanding and maintaining the code, thus proving the effectiveness of obfuscation in protecting software. Kumar et al. [61] presented a comparative analysis of several obfuscation techniques applied to Java bytecode. The authors used the McCabe metric to measure the increase in complexity after obfuscation, providing a detailed evaluation of each technique’s potency.

Symbolic execution attacks [33], which involve analyzing software by treating inputs as symbolic variables rather than concrete values, have been used in various studies to measure the resilience of obfuscation techniques. These works assess how well a given obfuscation technique withstands such advanced analysis methods.

Banescu et al. [33] presented a framework for evaluating the strength of different obfuscation techniques against symbolic execution attacks. The authors developed a series of tests using symbolic execution to probe obfuscated code and measure its ability to resist analysis, providing valuable insight into the effectiveness of various obfuscation strategies. Yadegari et al. [62] employed symbolic execution to analyze the resilience of several popular obfuscation methods. The authors’ findings highlighted the varying degrees of protection offered by different techniques and identified specific weaknesses attackers could exploit using symbolic execution.

9. Conclusions

In this paper, we introduced an automated measurement framework designed to quantify the quality of source code obfuscation. Our framework employs twelve distinct measurement indicators to deliver quantified values across the categories of Potency, Resilience, and Cost. Through rigorous experimentation and analysis, we have demonstrated the efficacy of our framework in providing a comprehensive assessment of various obfuscation techniques. Our work’s strengths lie in its systematic approach to quantification, its applicability to a wide range of obfuscation methods, and its potential to facilitate more objective comparisons of obfuscation quality. Our findings also indicate that, while the framework effectively captures key aspects of obfuscation quality, inherent limitations exist. These include challenges in accurately measuring certain dynamic behaviors of obfuscated code and the need for more advanced techniques to evaluate human comprehension difficulties. Furthermore, the framework’s current scope primarily focuses on static and certain dynamic analyses, which may not fully encompass all facets of obfuscation.

The added value of our work is highlighted by its potential to advance the field of software obfuscation by providing a robust tool for researchers and practitioners to assess and compare obfuscation techniques quantitatively. This can lead to better-informed decisions in selecting and implementing obfuscation methods. However, our study also underscores the need for future research to address these limitations. Prospective enhancements could involve integrating cognitive psychology insights to better link software metrics with human comprehension difficulties and developing advanced tools capable of dynamic analysis even in the presence of sophisticated obfuscation techniques. Additionally, empirical studies measuring the time and effort required by professional reverse engineers to decipher obfuscated code could provide further validation and refinement of our framework. In conclusion, while our framework sets a foundational stage for the quantitative assessment of software obfuscation, it also highlights critical areas for enhancement and further research. Bridging these gaps will improve the accuracy of our evaluations and contribute to the broader field of cybersecurity by developing tools and methodologies that keep pace with evolving obfuscation technologies.

Author Contributions

The authors confirm their contributions to the paper as follows: supervision, D.H.L.; conceptualization and methodology: H.J.; software and data curation: J.L.; design and implementation: H.J. and S.Y.; experiment and result analysis: H.J., S.Y. and K.K.; draft manuscript preparation: H.J. and D.H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIP) (No. RS-2024-00399389, Generative AI based Binary Deobfuscation Technology and Its Evaluation Metrics).

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Banescu, S.; Ochoa, M.; Pretschner, A. A framework for measuring software obfuscation resilience against automated attacks. In Proceedings of the 2015 IEEE/ACM 1st International Workshop on Software Protection, Florence, Italy, 16–24 May 2015; IEEE: New York, NY, USA, 2015; pp. 45–51.
2. Akhunzada, A.; Sookhak, M.; Anuar, N.B.; Gani, A.; Ahmed, E.; Shiraz, M.; Furnell, S.; Hayat, A.; Khan, M.K. Man-At-The-End attacks: Analysis, taxonomy, human aspects, motivation and future directions. J. Netw. Comput. Appl. 2015, 48, 44–57.
3. Collberg, C.S.; Thomborson, C. Watermarking, tamper-proofing, and obfuscation-tools for software protection. IEEE Trans. Softw. Eng. 2002, 28, 735–746.
4. Bhansali, S.; Aris, A.; Acar, A.; Oz, H.; Uluagac, A.S. A first look at code obfuscation for webassembly. In Proceedings of the 15th ACM Conference on Security and Privacy in Wireless and Mobile Networks, San Antonio, TX, USA, 16–19 May 2022; pp. 140–145.
5. Collberg, C.; Thomborson, C.; Low, D. A Taxonomy of Obfuscating Transformations; Technical Report; Department of Computer Science, The University of Auckland: Auckland, New Zealand, 1997.
6. Obfuscator, S. Protect your C/C++ Code. Available online: http://stunnix.com/prod/cxxo/ (accessed on 12 October 2023).
7. Designs, S. Source Code Obfuscator. Available online: http://www.semdesigns.com/Obfuscators/ (accessed on 12 October 2023).
8. Obfuscator, T. The Tigress C Diversifier/Obfuscator. Available online: http://tigress.cs.arizona.edu/ (accessed on 12 October 2023).
9. Junod, P.; Rinaldini, J.; Wehrli, J.; Michielin, J. Obfuscator-LLVM–software protection for the masses. In Proceedings of the 2015 IEEE/ACM 1st International Workshop on Software Protection, Florence, Italy, 16–24 May 2015; IEEE: New York, NY, USA, 2015; pp. 3–9.
10. Balachandran, V.; Emmanuel, S. Potent and stealthy control flow obfuscation by stack based self-modifying code. IEEE Trans. Inf. Forensics Secur. 2013, 8, 669–681.
11. Sultan, A.B.M.; Ghani, A.A.A.; Ali, N.M.; Admodisastro, N.I. Hybrid obfuscation technique to protect source code from prohibited software reverse engineering. IEEE Access 2020, 8, 187326–187342.
12. Ahire, P.; Abraham, J. Mechanisms for source code obfuscation in C: Novel techniques and implementation. In Proceedings of the 2020 International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, 12–14 March 2020; IEEE: New York, NY, USA, 2020; pp. 52–59.
13. Bertholon, B.; Varrette, S.; Martinez, S. Shadobf: A c-source obfuscator based on multi-objective optimisation algorithms. In Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, Cambridge, MA, USA, 20–24 May 2013; IEEE: New York, NY, USA, 2013; pp. 435–444.
14. Styugin, M.; Zolotarev, V.; Prokhorov, A.; Gorbil, R. New approach to software code diversification in interpreted languages based on the moving target technology. In Proceedings of the 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT), Baku, Azerbaijan, 12–14 October 2016; IEEE: New York, NY, USA, 2016; pp. 1–5.
15. Ebad, S.A.; Darem, A.A.; Abawajy, J.H. Measuring software obfuscation quality—A systematic literature review. IEEE Access 2021, 9, 99024–99038.
16. Hosseinzadeh, S.; Rauti, S.; Laurén, S.; Mäkelä, J.M.; Holvitie, J.; Hyrynsalmi, S.; Leppänen, V. Diversification and obfuscation techniques for software security: A systematic literature review. Inf. Softw. Technol. 2018, 104, 72–93.
17. Ceccato, M.; Capiluppi, A.; Falcarin, P.; Boldyreff, C. A large study on the effect of code obfuscation on the quality of java code. Empir. Softw. Eng. 2015, 20, 1486–1524.
18. Capiluppi, A.; Falcarin, P.; Boldyreff, C. Code defactoring: Evaluating the effectiveness of java obfuscations. In Proceedings of the 2012 19th Working Conference on Reverse Engineering, Kingston, ON, Canada, 15–18 October 2012; IEEE: New York, NY, USA, 2012; pp. 71–80.
19. Dunaev, D.; Lengyel, L. Cognitive evaluation of intermediate level obfuscator. In Proceedings of the 2014 5th IEEE Conference on Cognitive Infocommunications (CogInfoCom), Vietri sul Mare, Italy, 5–7 November 2014; IEEE: New York, NY, USA, 2014; pp. 521–525.
20. Banescu, S.; Collberg, C.; Pretschner, A. Predicting the Resilience of Obfuscated Code Against Symbolic Execution Attacks via Machine Learning. In Proceedings of the 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, Canada, 16–18 August 2017; pp. 661–678.
21. Duchêne, J.; Alata, E.; Nicomette, V.; Kaâniche, M.; Le Guernic, C. Specification-based protocol obfuscation. In Proceedings of the 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Luxembourg, 25–28 June 2018; IEEE: New York, NY, USA, 2018; pp. 478–489.
22. Omar, R.; El-Mahdy, A.; Rohou, E. Arbitrary control-flow embedding into multiple threads for obfuscation: A preliminary complexity and performance analysis. In Proceedings of the 2nd International Workshop on Security in Cloud Computing, Kyoto, Japan, 3 June 2014; pp. 51–58.
23. Han, S.; Ryu, M.; Cha, J.; Choi, B.U. HOTDOL: HTML obfuscation with text distribution to overlapping layers. In Proceedings of the 2014 IEEE International Conference on Computer and Information Technology, Xi’an, China, 11–13 September 2014; IEEE: New York, NY, USA, 2014; pp. 399–404.
24. Ibrahim, A.; Banescu, S. StIns4CS: A State Inspection Tool for C#. In Proceedings of the 2016 ACM Workshop on Software PROtection, Vienna, Austria, 28 October 2016; pp. 61–71.
25. Lackner, M.; Berlach, R.; Weiss, R.; Steger, C. Countering type confusion and buffer overflow attacks on Java smart cards by data type sensitive obfuscation. In Proceedings of the First Workshop on Cryptography and Security in Computing Systems, Vienna, Austria, 20 January 2014; pp. 19–24.
26. Liu, W.; Li, W. Unifying the method descriptor in Java obfuscation. In Proceedings of the 2016 2nd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 14–17 October 2016; IEEE: New York, NY, USA, 2016; pp. 1397–1401.
27. Ko, S.; Choi, J.; Kim, H. COAT: Code obfuscation tool to evaluate the performance of code plagiarism detection tools. In Proceedings of the 2017 International Conference on Software Security and Assurance (ICSSA), Altoona, PA, USA, 24–25 July 2017; IEEE: New York, NY, USA, 2017; pp. 32–37.
28. Li, Y.; Sha, Z.; Xiong, X.; Zhao, Y. Code Obfuscation Based on Inline Split of Control Flow Graph. In Proceedings of the 2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), Dalian, China, 28–30 June 2021; IEEE: New York, NY, USA, 2021; pp. 632–638.
29. Cadar, C.; Dunbar, D.; Engler, D.R. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In Proceedings of the OSDI, San Diego, CA, USA, 8–10 December 2008; Volume 8, pp. 209–224.
30. Kremenek, T. Finding Software Bugs with the Clang Static Analyzer; Apple Inc.: Cupertino, CA, USA, 2008; p. 2008.
31. Lattner, C.; Adve, V. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization, CGO 2004, San Jose, CA, USA, 20–24 March 2004; IEEE: New York, NY, USA, 2004; pp. 75–86.
32. Black, P.E. Juliet 1.3 Test Suite: Changes from 1.2; US Department of Commerce, National Institute of Standards and Technology: Gaithersburg, MD, USA, 2018.
33. Banescu, S.; Collberg, C.; Ganesh, V.; Newsham, Z.; Pretschner, A. Code obfuscation against symbolic execution attacks. In Proceedings of the 32nd Annual Conference on Computer Security Applications, Los Angeles, CA, USA, 5–8 December 2016; pp. 189–200.
34. Hachez, G. A Comparative Study of Software Protection Tools Suited for E-Commerce with Contributions to Software Watermarking and Smart Cards. Ph.D. Thesis, Universite Catholique de Louvain, Ottignies-Louvain-la-Neuve, Belgium, 2003.
35. Chan, J.T.; Yang, W. Advanced obfuscation techniques for Java bytecode. J. Syst. Softw. 2004, 71, 1–10.
36. Zhu, W.F. Concepts and Techniques in Software Watermarking and Obfuscation. Ph.D. Thesis, The Department of Computer Sciences The University of Auckland, Auckland, New Zealand, 2007.
37. Liu, B.; Feng, W.; Zheng, Q.; Li, J.; Xu, D. Software obfuscation with non-linear mixed boolean-arithmetic expressions. In Proceedings of the Information and Communications Security: 23rd International Conference, ICICS 2021, Chongqing, China, 19–21 November 2021; Proceedings, Part I 23. Springer: Berlin/Heidelberg, Germany, 2021; pp. 276–292.
38. Kang, S.; Lee, S.; Kim, Y.; Mok, S.K.; Cho, E.S. Obfus: An obfuscation tool for software copyright and vulnerability protection. In Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy, Virtual, 26–28 April 2021; pp. 309–311.
39. Ahire, P.; Abraham, J. Secure cloud model for intellectual privacy protection of arithmetic expressions in source codes using data obfuscation techniques. Theor. Comput. Sci. 2022, 922, 131–149.
40. Schloegel, M.; Blazytko, T.; Contag, M.; Aschermann, C.; Basler, J.; Holz, T.; Abbasi, A. Loki: Hardening code obfuscation against automated attacks. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 3055–3073.
41. Rajba, P.; Mazurczyk, W. Data hiding using code obfuscation. In Proceedings of the 16th International Conference on Availability, Reliability and Security, Vienna, Austria, 17–20 August 2021; pp. 1–10.
42. Xu, D.; Ming, J.; Wu, D. Generalized dynamic opaque predicates: A new control flow obfuscation method. In Proceedings of the Information Security: 19th International Conference, ISC 2016, Honolulu, HI, USA, 3–6 September 2016; Proceedings 19. Springer: Berlin/Heidelberg, Germany, 2016; pp. 323–342.
43. Ge, J.; Chaudhuri, S.; Tyagi, A. Control flow based obfuscation. In Proceedings of the 5th ACM Workshop on Digital Rights Management, Alexandria, VA, USA, 7 November 2005; pp. 83–92.
44. Balachandran, V.; Keong, N.W.; Emmanuel, S. Function level control flow obfuscation for software security. In Proceedings of the 2014 Eighth International Conference on Complex, Intelligent and Software Intensive Systems, Birmingham, UK, 2–4 July 2014; IEEE: New York, NY, USA, 2014; pp. 133–140.
45. Tang, Z.; Chen, X.; Fang, D.; Chen, F. Research on java software protection with the obfuscation in identifier renaming. In Proceedings of the 2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC), Kaohsiung, Taiwan, 7–9 December 2009; IEEE: New York, NY, USA, 2009; pp. 1067–1071.
46. Balachandran, V.; Emmanuel, S. Software code obfuscation by hiding control flow information in stack. In Proceedings of the 2011 IEEE International Workshop on Information Forensics and Security, Iguacu Falls, Brazil, 29 November–2 December 2011; IEEE: New York, NY, USA, 2011; pp. 1–6.
47. Ertaul, L.; Venkatesh, S. Novel obfuscation algorithms for software security. In Proceedings of the 2005 International Conference on Software Engineering Research and Practice, SERP, Citeseer, Las Vegas, NV, USA, 27–29 June 2005; Volume 5.
48. Fukushima, K.; Kiyomoto, S.; Tanaka, T.; Sakurai, K. Analysis of program obfuscation schemes with variable encoding technique. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2008, 91, 316–329.
49. Kovacheva, A. Efficient code obfuscation for Android. In Proceedings of the Advances in Information Technology: 6th International Conference, IAIT 2013, Bangkok, Thailand, 12–13 December 2013; Proceedings 6. Springer: Berlin/Heidelberg, Germany, 2013; pp. 104–119.
50. Hessler, A.; Kakumaru, T.; Perrey, H.; Westhoff, D. Data obfuscation with network coding. Comput. Commun. 2012, 35, 48–61.
51. LeDoux, C.; Sharkey, M.; Primeaux, B.; Miles, C. Instruction embedding for improved obfuscation. In Proceedings of the 50th Annual Southeast Regional Conference, Tuscaloosa, AL, USA, 29–31 March 2012; pp. 130–135.
52. Darwish, S.M.; Guirguis, S.K.; Zalat, M.S. Stealthy code obfuscation technique for software security. In Proceedings of the 2010 International Conference on Computer Engineering & Systems, Cairo, Egypt, 30 November–2 December 2010; IEEE: New York, NY, USA, 2010; pp. 93–99.
53. Eyrolles, N. Obfuscation with Mixed Boolean-Arithmetic Expressions: Reconstruction, Analysis and Simplification Tools. Ph.D. Thesis, Université Paris Saclay (COmUE), Paris, France, 2017.
54. Zhou, Y.; Main, A.; Gu, Y.X.; Johnson, H. Information hiding in software with mixed boolean-arithmetic transforms. In Proceedings of the International Workshop on Information Security Applications, Jeju Island, Republic of Korea, 27–29 August 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 61–75.
55. László, T.; Kiss, Á. Obfuscating C++ programs via control flow flattening. Ann. Univ. Sci. Budapestinensis De Rolando Eötvös Nomin. Sect. Comput. 2009, 30, 3–19.
56. Schloegel, M.; Blazytko, T.; Contag, M.; Aschermann, C.; Basler, J.; Holz, T.; Abbasi, A. Technical Report: Hardening Code Obfuscation Against Automated Attacks. arXiv 2021, arXiv:2106.08913.
57. Tatzer, C. Opcode Coverage-Guided Virtualization Deobfuscation Based on Symbolic Execution. Ph.D. Thesis, Technische Universität Wien, Vienna, Austria, 2020.
58. Watson, A.H.; Wallace, D.R.; McCabe, T.J. Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric; US Department of Commerce, Technology Administration, The National Institute of Standards and Technology (NIST): Gaithersburg, MD, USA, 1996; Volume 500.
59. Ceccato, M.; Di Penta, M.; Nagra, J.; Falcarin, P.; Ricca, F.; Torchiano, M.; Tonella, P. Towards experimental evaluation of code obfuscation techniques. In Proceedings of the 4th ACM Workshop on Quality of Protection, Alexandria, VA, USA, 27 October 2008; pp. 39–46.
60. Viticchié, A.; Regano, L.; Torchiano, M.; Basile, C.; Ceccato, M.; Tonella, P.; Tiella, R. Assessment of source code obfuscation techniques. In Proceedings of the 2016 IEEE 16th international working conference on source code analysis and manipulation (SCAM), Raleigh, NC, USA, 2–3 October 2016; IEEE: New York, NY, USA, 2016; pp. 11–20.
61. Kumar, K.; Kehar, V.; Kaur, P. A comparative analysis of static java bytecode software watermarking algorithms. Afr. J. Comput. ICT 2015, 8, 201–208.
62. Yadegari, B.; Debray, S. Symbolic execution of obfuscated code. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, 12–16 October 2015; pp. 732–744.

Figure 1. Example 2 of a control flow graph.

Figure 2. Overview of a framework for quantifying obfuscation quality.

Figure 3. Obfuscation quality quantification experiment for commercial and open-source obfuscation tools.

Figure 4. Potency measurement results.

Figure 5. Resilience measurement results.

Figure 6. Cost measurement results.

Table 1. Description of source code obfuscation techniques by category.

Categories	Obfuscation Techniques	Description
Layout Obfuscation	Scramble Identifiers	Mangles symbols such as function names and variable names
	Change Formatting	Changes the format of source code by deleting or adding white space, newline characters, etc.
	Remove Comments	Deletes comments written by programmers
Data Obfuscation	Data Encoding	Transforms strings, values, and similar elements so that they are hard to recognize
	Instruction Substitution	Complicates the structure of arithmetic instructions such as add and sub
	Mixed Boolean Arithmetic	Uses formulas that combine Boolean algebra and arithmetic operations
Control Obfuscation	Bogus Control Flows	Complicates the code by inserting dummy code, thus affecting the control flow graph
	Opaque Predicates	Creates a conditional statement that always takes the same branch and inserts garbage code in the branch that never executes
	Control Flow Flattening	Places all control flow, such as loops and conditional branches, into one large switch statement so that every block is reached from a single dispatching block
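
Three of the techniques in Table 1 can be illustrated with a minimal sketch. The functions below are hypothetical examples written for this illustration; they are not output from any of the obfuscators evaluated here. The tools in this study operate on C, whereas this sketch uses Python and stands in an `if`/`elif` dispatcher for the switch statement.

```python
def add_plain(x, y):
    return x + y


def add_mba(x, y):
    # Mixed Boolean Arithmetic: x + y is rewritten with the identity
    # x + y == (x ^ y) + 2 * (x & y), which mixes bitwise and arithmetic ops.
    return (x ^ y) + 2 * (x & y)


def abs_opaque(x):
    # Opaque predicate: x * (x + 1) is a product of consecutive integers and
    # is therefore always even, so the condition is always true.
    if (x * (x + 1)) % 2 == 0:
        return x if x >= 0 else -x
    else:
        return x * 31 + 7  # garbage code that never executes


def sum_to_plain(n):
    total = 0
    i = 1
    while i <= n:
        total += i
        i += 1
    return total


def sum_to_flat(n):
    # Control flow flattening: every basic block of sum_to_plain becomes a
    # case selected by a state variable inside one dispatcher loop.
    state = 0
    total = 0
    i = 0
    while True:
        if state == 0:      # initialization block
            total, i, state = 0, 1, 1
        elif state == 1:    # loop condition
            state = 2 if i <= n else 3
        elif state == 2:    # loop body
            total += i
            i += 1
            state = 1
        elif state == 3:    # exit block
            return total


print(add_mba(20, 22), abs_opaque(-5), sum_to_flat(10))
```

Each transformed function returns the same result as its plain counterpart, but its control flow graph or expression structure differs, which is what the potency indicators in this framework are meant to capture.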

Table 2. Obfuscation tools used in the experiments, with each tool’s options and obfuscation techniques.

Obfuscators	Options	Obfuscation Techniques	Obfuscator Type
Stunnix C/C++ Obfuscator	protect everything but leave symbol names as is	Change Formatting	Commercial
		Remove Comments
		Instruction Substitution
Semantic Designs C-GCC4 Obfuscator	+PrintAsis +Obfuscate +ObfuscateLiterals	Scramble Identifiers	Commercial
		Change Formatting
		Remove Comments
		Data Encoding
Tigress Obfuscator	Level 1: Enc.Arithmetic	Mixed Boolean Arithmetic	Open-source
	Level 2: Enc.Arithmetic and Add Opaque	Mixed Boolean Arithmetic
		Opaque Predicates
	Level 3: Enc.Arithmetic and Add Opaque and Flatten	Mixed Boolean Arithmetic
		Opaque Predicates
		Control Flow Flattening
Obfuscator LLVM	Control Flow Flattening and Instruction Substitution and Bogus Control Flows	Instruction Substitution	Open-source
		Bogus Control Flows
		Control Flow Flattening

Table 3. Description of measurement indicators for quantifying obfuscation quality.

Categories	Measurement Indicators	Description
Potency	McCabe Cyclomatic Complexity	{Number of Edges} − {Number of Nodes} + 2
	Control Flow Graph Size	Number of Nodes
	Control Flow Depth	The maximum number of Edges it takes to get from one Node to another Node
	Program Length	Lines of Code (LoC) in source code
	Instruction Count	Number of Instructions
Resilience	Symbolic Execution Time	Time taken by the Symbolic Execution tool to complete the analysis
	Code Coverage	Percentage of instructions for which the Symbolic Execution tool performed analysis among all instructions
	Static Analysis Time	Analysis time of Static Analysis tools
	Code Optimization	Percentage of instructions optimized by Code Optimization tools
Cost	Time Overhead	Run time of compiled binary
	Space Overhead	Process memory usage, including .data sections, .text sections, etc.
	File Size	Size of binary file
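
The three graph-based potency indicators in Table 3 can be sketched as follows. This is a minimal illustration, not the framework's implementation: the edge-list representation and the node names are hypothetical, and Control Flow Depth is interpreted here as the longest shortest path between any two nodes, which is one possible reading of the description in the table.

```python
from collections import deque


def cfg_metrics(edges):
    """Return (cyclomatic complexity, CFG size, control flow depth).

    `edges` is a list of (source, destination) pairs describing a CFG.
    """
    nodes = {n for edge in edges for n in edge}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)

    def farthest(start):
        # Breadth-first search: largest shortest-path distance (in edges)
        # from `start` to any node reachable from it.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in successors[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        return max(dist.values())

    complexity = len(edges) - len(nodes) + 2   # McCabe: E - N + 2
    size = len(nodes)                          # number of nodes
    depth = max(farthest(n) for n in nodes)    # assumed definition of depth
    return complexity, size, depth


# Hypothetical CFG: an if/else followed by a loop.
edges = [
    ("entry", "then"), ("entry", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"),
    ("loop", "exit"),
]
print(cfg_metrics(edges))  # (3, 6, 3)
```

With 7 edges and 6 nodes, the cyclomatic complexity is 7 − 6 + 2 = 3, matching the two decision points (the branch and the loop condition) plus one.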

Table 4. Results of measurement for quantifying obfuscation quality—Potency.

Tools		Potency (Num)
Tools		McCabe	CFG Size	CF Depth	Program Length	Instruction Count
Baseline	Average	6.60	14.00	5.20	94.08	79.17
Stunnix C/C++ Obfuscator	Average	6.60	14.00	5.20	33.32	79.17
	Growth Rate	0%	0%	0%	−64.54%	0%
Semantic Designs C Obfuscator	Average	6.60	14.00	5.20	116.89	79.17
	Growth Rate	0%	0%	0%	24.77%	0%
Obfuscator LLVM	Average	43.71	111.86	5.78	723.14	1311.50
	Growth Rate	530.50%	735.44%	78.25%	674.41%	1529.57%
Tigress C Obfuscator (Level 1)	Average	7.28	13.95	4.37	119.48	126.70
	Growth Rate	12.77%	7.51%	−6.26%	27.55%	59.02%
Tigress C Obfuscator (Level 2)	Average	12.87	24.78	5.61	343.41	363.35
	Growth Rate	103.42%	116.41%	44.82%	264.23%	359.91%
Tigress C Obfuscator (Level 3)	Average	49.59	69.50	5.00	654.50	502.63
	Growth Rate	679.42%	533.65%	57.86%	595.34%	536.32%
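
As a reading aid for the Growth Rate rows, the sketch below assumes the usual definition of relative change against the baseline. The function name is hypothetical, and the table itself does not state how the rates were aggregated, so rates recomputed from the rounded averages are not guaranteed to match the tabulated values.

```python
def growth_rate(baseline, obfuscated):
    """Percentage change of a metric relative to the unobfuscated baseline."""
    return (obfuscated - baseline) / baseline * 100.0


# Program Length averages taken from Table 4.
baseline_loc = 94.08
stunnix_loc = 33.32
print(f"{growth_rate(baseline_loc, stunnix_loc):.2f}%")  # -64.58%
```

For Stunnix's program length this gives about −64.58%, close to the −64.54% reported in the table. Other cells differ more from what the rounded averages imply.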

Table 5. Results of measurement for quantifying obfuscation quality—Resilience.

Tools		Resilience (ms, %)
Tools		Symbolic Execution Time (ms)	Code Coverage (%)	Static Analysis Time (ms)	Code Optimization (%)
Baseline	Average	55.81	73.17	104.95	55.02
Stunnix C/C++ Obfuscator	Average	38.66	73.13	109.04	55.02
	Growth Rate	−25.77%	0.05%	7.21%	0%
Semantic Designs C Obfuscator	Average	54.92	73.05	108.69	55.02
	Growth Rate	6.01%	0.15%	8.29%	0%
Obfuscator LLVM	Average	84.38	64.80	278.08	15.63
	Growth Rate	74.89%	−10.58%	402.55%	−72.55%
Tigress C Obfuscator (Level 1)	Average	47.24	70.41	109.26	52.34
	Growth Rate	−13.96%	4.08%	12.33%	−4.03%
Tigress C Obfuscator (Level 2)	Average	481.89	80.90	118.47	58.01
	Growth Rate	486.02%	−11.30%	29.83%	8.95%
Tigress C Obfuscator (Level 3)	Average	666.15	79.27	118.71	60.68
	Growth Rate	693.74%	−9.44%	37.45%	15.31%

Table 6. Results of measurement for quantifying obfuscation quality—Cost.

Tools		Cost (ms, KB)
Tools		Time Overhead (ms)	Space Overhead (KB)	File Size (KB)
Baseline	Average	0.768	2.29	15.78
Stunnix C/C++ Obfuscator	Average	0.770	2.29	15.78
	Growth Rate	0.30%	0%	0%
Semantic Designs C Obfuscator	Average	0.778	2.29	15.77
	Growth Rate	1.23%	0%	−0.07%
Obfuscator LLVM	Average	0.767	10.73	23.30
	Growth Rate	−0.13%	376.01%	47.94%
Tigress C Obfuscator (Level 1)	Average	0.782	2.50	15.88
	Growth Rate	1.83%	9.31%	0.69%
Tigress C Obfuscator (Level 2)	Average	0.761	3.88	16.99
	Growth Rate	−0.92%	69.91%	7.69%
Tigress C Obfuscator (Level 3)	Average	0.776	5.33	18.09
	Growth Rate	1.07%	133.43%	14.63%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

