1 Introduction
In today’s highly digitized world, an increasing number of cyber attacks on the network, software, or hardware layers pose a threat to individuals, governments, industry, and societies as a whole [
39,
53].
Integrated Circuits (
ICs) in the form of microchips perform various security-critical functions in a wide range of applications, ranging from smartphones and other consumer electronic devices to industrial electronics and network routers that form the internet backbone, making them a valuable target for sophisticated cyber attacks.
Hardware Reverse Engineering (
HRE) is a common method for retrieving information about the inner structure and functionality of microchips and is therefore often a starting point for cyber attacks on hardware [
34,
47]. Attacks based on HRE are typically difficult to detect and to defend against and can include
Intellectual Property (
IP) theft as well as the implementation of malicious backdoors or hardware Trojans, which can undermine integrity, availability, and confidentiality of the target system [
7,
38]. Hardware Trojans also underlie the current political discussion about foreign-built computer and communication equipment [
15]. Consequences of undetected malicious actions can range from monetary losses and reputational damage (especially for hardware manufacturers) to susceptibility to extortion, threats to critical infrastructure, and life-threatening conditions for individuals such as politically oppressed persons (e.g., through compromise of their communication equipment for the purpose of causing them harm) [
38,
44].
As HRE tools that automate the entire analysis do not exist [
60], the success of HRE strongly depends on the analysts’ cognitive processes such as the application of domain-specific knowledge or problem-solving strategies. As such, hardware reverse engineers must usually resort to conducting a variety of problem-specific and at times fully-customized analysis steps [
34]. Thus, it is crucial to obtain a thorough understanding of the cognitive factors and processes that influence analyst performance. This understanding has, however, thus far not gained much research attention in the field [
6,
60].
In our work, we adapt principles of cognitive psychology and
Human Computer Interaction (
HCI) research to understand the behavior of hardware reverse engineers conducting HRE. Over the past few decades, the importance of cybersecurity topics in HCI research has continued to grow. Recently, Stephanidis et al. [
53] identified research on security as one of seven grand challenges facing the HCI community. Security researchers (e.g., Sasse et al., 2001 [
50]) have long considered the human factor, both when designing secure and usable systems and when seeking to improve the interaction between various user groups (e.g., end users and developers) and security-related systems. In the article at hand, however, our aim is not to improve the interaction between users (reverse engineers) and the system (e.g., targeted microchip or tools), but rather to make it more difficult by developing novel countermeasures that impede HRE by introducing cognitively challenging tasks, a technique we term
cognitive obfuscation. This “twist” on the typical HCI framework may constitute a novel direction for HCI research.
In this article, we strive to obtain an initial understanding of the hitherto poorly understood human factors that determine the success of HRE efforts. The motivation behind this exploratory work is not to generalize or derive definitive answers about human factors and problem-solving processes in HRE, but to provide a foundation for future research and discussion in this area. Thus, we aim to shed light on the cognitive processes involved in HRE and to provide detailed insights into the problem-solving strategies employed to solve a realistic HRE task. From this exploration, we derive initial ideas for the design of cognitively challenging tasks that could serve as the basis for future HRE countermeasures that make the cost of HRE unattractively high.
However, researchers who aim to study hardware reverse engineers face the methodological problem that HRE experts are rarely available for research. To address this challenge and to explore human factors in HRE, we suggest a methodology to systematically investigate the problem-solving behavior of human analysts. To this end, we conduct an exploratory study in which we collect detailed behavioral log files from nine reverse engineers with different levels of expertise (eight intermediates and one expert) whom we instructed to solve a realistic HRE problem-solving task over the course of two weeks. This small but thorough study allows us to explore and observe the entire HRE process, from receiving an unknown netlist to identifying the crucial netlist components in an ecologically valid scenario. In our exploratory study, we apply an iterative open coding scheme based on the well-established
Grounded Theory (
GT) approach [
55]. We focus on (i) observed problem-solving processes, (ii) differences between applied problem-solving processes of intermediates and the expert, and (iii) differentiating between more and less efficient problem-solving strategies. We discuss our results in the context of the psychological literature on problem solving and suggest several novel research directions for the development of cognitive obfuscation techniques.
In summary, we make the following contributions:
(1)
We suggest a methodology to explore human factors in HRE where HRE experts are usually unavailable for research. On this basis, we conduct an exploratory study in which we collect and systematically analyze 2,445 behavioral log entries of nine participants with different levels of HRE expertise.
(2)
Based on the log file data, we develop a detailed hierarchical HRE problem-solving model, which includes and classifies 103 discrete actions that reverse engineers took while completing a realistic task.
(3)
Through the application of an iterative open coding approach based on GT, we obtain insights into the problem-solving strategies that reverse engineers apply and their efficiency and explore the differences between the study participants.
(4)
We qualitatively classify the specific characteristics of problem solving in HRE, discuss the methodological limitations of our work, and lay the groundwork for future psychological studies.
(5)
We outline ideas for cognitively demanding challenges for future technical studies on novel countermeasures—i.e., cognitive obfuscation—against HRE.
3 Methods
During the summer term of 2019, we studied the reverse engineering, i.e., problem-solving, behavior of students, who received a 14-week training to acquire an intermediate level of HRE expertise, as well as one expert. We asked the participants to solve a realistic HRE problem-solving task and collected detailed behavioral log files comprised of 1,141 executed scripts, 68 console inputs, and 1,217 manual activities. More specifically, we applied a well-established iterative open coding methodology—based on the GT approach by Strauss and Corbin [
55]—to systematically examine applied problem-solving strategies.
3.1 Participants
In the present study, we investigated how eight students on intermediate levels of HRE expertise and one expert solved an HRE problem-solving task. In Section
3.1.1, we outline how we trained the student sample to acquire an intermediate level of HRE expertise. We elaborate on the sample of intermediates in Section
3.1.2 and provide information on the expert participant in Section
3.1.3.
All participants voluntarily participated in our study and provided written informed consent. We emphasized that the participants retained the right to withdraw and/or request the deletion of all study-related data at any time and without reason. All student participants received monetary compensation for the completion of study-related materials. We protected participants’ privacy by using randomly-assigned pseudonyms instead of participant names on all study-related materials and in all study-related activities. During data analysis, these pseudonyms were replaced by randomly-assigned numbers. The privacy-protection procedures of this study were reviewed and approved by the data protection officer of the university at which it was conducted.
3.1.1 HRE Skill-training for Students.
As achieving our research objectives was contingent upon the availability of an appropriately experienced study population, all participating students were required to acquire the prerequisite HRE-specific knowledge and skills. To facilitate this process, we embedded a 14-week HRE training course, which conveyed both declarative knowledge (e.g., of digital hardware design) and skills (e.g., the development of customized scripts in HAL), into our study (see Section
2.3). The transformation of declarative HRE knowledge into procedural knowledge (i.e., skills) was promoted by the specific structure of the HRE training [
61] that was based on the psychological theory of the
Adaptive Control of Thought-Rational (
ACT-R) [
2]. The contents of the HRE training were presented to HRE experts and revised according to their feedback and suggestions.
The content of the four training tasks was as follows:
—
The first task introduces the HAL framework and its basic features to the students, who have to analyze the data path of a small unknown substitution-permutation-network called ToyCipher consisting of 131 gates. Analyzing and comprehending the data path of an unknown (sub)circuit is critical for reverse engineers seeking to understand a netlist.
—
Task 2 covers the foundations of Finite State Machine (FSM) reverse engineering based on a modified version of the ToyCipher with 138 gates. FSMs are of particular interest to reverse engineers because they control almost every hardware circuit. Finding and understanding these control elements is therefore a fundamental part of learning HRE.
—
FSM obfuscation is a potential starting point for HRE countermeasures as it aims at hampering the analyst’s understanding of the circuit control logic. In the third task, students have to extract the gates that implement the obfuscated control logic of another version of the ToyCipher (128 gates) and analyze the deployed FSM obfuscation method. In the second step, they disable the obfuscation through initial state patching and verify their manipulation.
—
In the last training task, students must manipulate a netlist implementing an Advanced Encryption Standard (AES)-128 encryption and key schedule so that they can extract the hard-coded encryption key. The underlying netlist with 2,176 gates is by far the most complex encountered during the training tasks. This complexity is comparable to that of the problem-solving task under study.
While the HRE training course presented several realistic tasks which aid in the building of HRE skills, it intentionally did not present any best practices or solution strategies specific to the problem-solving task central to the study (see Section
3.2.1) to avoid biasing or artificially enhancing the performance of the participants.
3.1.2 Intermediates.
The intermediate population consisted of 22 experienced students who were enrolled in either their last year of a three-year Bachelor’s cybersecurity program or in a Master’s cybersecurity program and had successfully completed the HRE training (Section
3.1.1) to achieve an intermediate level of expertise. Of the original 22 participants, eight withdrew from the study or had to be excluded due to incomplete datasets.
As we aimed at developing an initial and transferable model of HRE problem-solving processes that would describe the behaviors of reverse engineers as they exist in the real world, we selected only the eight top-performing students (mean age
\(M = 24\) years;
\(SD = 4\) years) from the HRE training for further study, as they were the most suitable proxy for practicing hardware reverse engineers. In quantitative terms, this group achieved a mean solution probability of
\(97.5\)% with a
\(7.1\)% standard deviation over all four HRE training tasks and a mean solution probability of
\(98.5\)% with a
\(1.9\)% standard deviation in the HRE problem-solving task. By selecting these top-performing students, we sought to avoid biasing our HRE problem-solving model through the capture of unrelated issues, such as insufficient programming skills or difficulty using the HRE tool HAL (see Section
3.2.2). Three teaching assistants collaboratively graded the participants’ solutions. Using a detailed gradebook with sample solutions, they assigned a solution probability on a scale of 0% to 100%. To achieve a solution probability of 100%, participants needed to turn in a functioning script which completely removed the implemented watermark from the underlying netlist (see Section
3.2.1).
After successful completion of the four training tasks during our 14-week HRE training, the eight top-performing students acquired an intermediate level of expertise: We observed that they were able to achieve a high solution probability by applying rules effectively and trying to avoid making mistakes, although their decisions were not always error-free. Those characteristics are commonly assigned to intermediates [
23]. A novice would have completed the HRE tasks very slowly, laboriously, and in an “every step by rule” manner, as their knowledge would have been scarce [
23]. In contrast to intermediates, experts usually solve problems by unconsciously applying strategies that they have developed based on their extensive domain-specific prior experience [
23].
3.1.3 Expert.
Despite the previously discussed difficulty of recruiting HRE experts for research purposes (see Section
2.3), we were able to recruit one expert by leveraging the professional network of one of the authors. This expert, a researcher in the field of hardware security, had five years of experience in HRE and typically spent between 20 and 30 hours per week on HRE and an additional 20 to 30 hours per week on software and hardware programming activities relevant to the completion of HRE tasks. Additionally, the expert had considerable experience in reverse engineering–related topics such as high-level and low-level programming languages, as well as significant knowledge of HRE-specific topics such as chip architectures (e.g.,
Field Programmable Gate Arrays (
FPGAs)) and crucial netlist components (such as FSMs).
The HRE expert was substantially involved in the development of the HRE framework HAL and had profound prior experience in using HAL in daily working practice to analyze unknown netlists. Already possessing the required knowledge and skills, the HRE expert did not participate in the training.
3.2 Materials
3.2.1 HRE Problem-solving Task.
A representative task sampled directly from a real-world situation is essential to capturing realistic problem-solving processes [
13]. In light of this, we developed a realistic HRE problem-solving task for the purposes of the study in which participants, i.e., the intermediates and the expert, were directed to identify and remove a watermark from a given netlist—a challenge similar to those reverse engineers must face in the real world (see Section
2). The removal of IP protection mechanisms such as watermarks from an IC is a realistic scenario in HRE, since reverse engineers may try to identify the watermark in the netlist and remove it in order to produce an illegal copy of the chip.
The netlist on which the task is based implements a round-based AES-128 encryption and is synthesized for the
Xilinx Unisim gate library [
42]. The synthesized flat netlist consists of 2,294 gates—
Look-up tables (
LUTs) with two to six inputs,
Flip Flops (
FFs), multiplexers, and buffers—and 2,435 wires connecting the gates. After synthesis, we removed variable names from all elements and embedded
watermark signatures, i.e., bit strings, into 50 randomly selected LUTs according to the watermarking scheme proposed by Schmid et al. [
51]: (i) we added an input to the selected LUT and (ii) connected this input to
GND or
VCC, i.e., to a constant “0” or “1”. (iii) Then we inserted the randomly generated watermark signatures into the non-addressable area of the LUT. In this way, according to Schmid et al., an unauthorized cloned circuit could later be uniquely identified.
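Steps (i) to (iii) can be illustrated with a minimal Python sketch. It assumes that a LUT with k inputs is described by a truth table of 2^k bits (called `init` here) and that the added input becomes the most significant address bit. The function names are hypothetical, and the sketch is not taken from the implementation of Schmid et al. or from HAL.

```python
def lut_output(init: int, address: int) -> int:
    """Read one truth-table bit: the LUT output for the given input address."""
    return (init >> address) & 1


def embed_watermark(init: int, k: int, signature: int, const: int) -> int:
    """Return the truth table of a (k+1)-input LUT that carries `signature`.

    Assumption: the added input is the most significant address bit and is
    tied to `const` (0 = GND, 1 = VCC). The half of the table selected by
    `const` keeps the original function; the other half can never be
    addressed and therefore stores the signature.
    """
    size = 1 << k  # number of truth-table bits of the original LUT
    if not (0 <= init < (1 << size) and 0 <= signature < (1 << size)):
        raise ValueError("init and signature must fit into 2**k bits")
    if const == 0:
        return (signature << size) | init
    return (init << size) | signature


# Example: a 2-input XOR (truth table 0b0110) with the signature 0b1011.
xor_init, k, signature = 0b0110, 2, 0b1011
for const in (0, 1):
    marked = embed_watermark(xor_init, k, signature, const)
    # With the added input held constant, the LUT behaves exactly as before.
    for address in range(1 << k):
        assert lut_output(marked, (const << k) | address) == lut_output(xor_init, address)
```

Because the added input never changes, the circuit's behavior is unaffected, which is why the signature remains invisible during normal operation.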
In the problem-solving task, we asked the participants to remove the watermark from the circuit, that is, to clone the circuit without copying the watermark such that the IP owner would be unable to prove that the circuit had been counterfeited. Since a completely manual manipulation of all individual bits of the watermark signatures seemed unrealistic, the participants had to develop a script that removes the signatures. We did not specify how this was to be done; however, the participants were made aware that the netlist contained a watermark and that it followed the scheme proposed by Schmid et al. These are insights that a reverse engineer would first have to obtain in the real world, for example by researching patents. Thus, in the context of our user study, we emulated a situation in which a reverse engineer has already researched all the necessary background information to begin the actual HRE process.
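One conceivable removal strategy is to drop every LUT input that is tied to a constant and to keep only the addressable part of the truth table. The following minimal Python sketch assumes that a LUT is described by an integer truth table with one bit per input combination. The function name and the data model are hypothetical; the sketch reproduces neither a participant's script nor HAL's Python commands.

```python
def strip_constant_input(init: int, n_inputs: int, pos: int, const: int) -> int:
    """Remove input `pos` (tied to `const`) from an n-input LUT.

    Returns the truth table of the resulting (n-1)-input LUT. Only entries
    that are reachable with the constant input are copied, so any bits
    stored in the non-addressable area are discarded.
    """
    if not 0 <= pos < n_inputs or const not in (0, 1):
        raise ValueError("invalid input position or constant")
    new_init = 0
    for address in range(1 << (n_inputs - 1)):
        low = address & ((1 << pos) - 1)   # address bits below the removed input
        high = address >> pos              # address bits above the removed input
        full = (high << (pos + 1)) | (const << pos) | low
        new_init |= ((init >> full) & 1) << address
    return new_init


# Example: a 3-input LUT whose most significant input is tied to GND.
# Lower half 0b0110 is the real function (XOR), upper half 0b1011 is a signature.
marked = 0b1011_0110
clean = strip_constant_input(marked, n_inputs=3, pos=2, const=0)
assert clean == 0b0110
```

Overwriting the non-addressable bits with a fixed value while keeping the extra input would be an alternative that leaves the netlist structure unchanged.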
After participants submitted their solution scripts, the teaching assistants ran them on the netlist and checked the LUTs that (formerly) contained the watermark signatures in the resulting manipulated netlist. The solution probability was subsequently determined based on the successful removal of the watermark.
3.2.2 The HAL HRE Tool.
The HAL HRE tool, developed by Fyrbiak et al. [
36], served as the environment in which we conducted the study, providing the participants with a uniform platform for performing HRE attacks and researchers with a mechanism for creating detailed behavioral log files of the same. HAL is a state-of-the-art HRE framework used by researchers and experts in the domain of netlist analysis and is available on GitHub under an open-source MIT license [
11,
59]. While HAL does not provide any (semi-)automated analysis methods, it allows analysts to focus on the development of HRE attacks with no further need for tool development. A screenshot of the
Graphical User Interface (
GUI) of HAL is depicted in Figure
1.
HAL’s GUI allows for both manual and script-based analysis of netlists. Increasing netlist complexity has necessitated a departure from the predominantly manual reverse engineering practices of the past through the use of script-based partial automation. Users can navigate through a textual or graphical representation of the netlist to manually analyze components such as gates or their interconnections. HAL’s integrated Python shell allows analysts to interactively script and test their analysis methods, which is necessary when solving complex HRE tasks, such as the one in this study. HAL also offers multiple reverse engineering-specific Python commands that enable interaction with the netlist under attack.
Participants familiarized themselves with HAL by completing four tasks during the HRE skill training. To create a realistic study environment comparable to that in which HRE experts work, we provided the participants with a manual for HAL containing operating instructions, a detailed description of its reverse-engineering specific Python commands, as well as several code snippets demonstrating HAL’s capabilities.
3.3 Collected Data
Demographic Questions. Using two short questionnaires, we asked participants to provide information about their socio-demographic backgrounds. Of these, one was designed for the student participants and the other for the HRE expert. Students were asked to answer questions about their age, major, and target degree. To corroborate the reported expertise level of the expert, the expert was asked questions including age, the highest level of education, current job position, expertise level, years of relevant experience in HRE, and hours spent performing HRE per week (according to Votipka and colleagues [
58]).
Behavioral Log Files. A behavioral log file was automatically generated by HAL for each participant solving the HRE task. Every log file contained between 148 and 467 entries consisting of a timestamp and one of the following events: (i) a script-based analysis step containing the executed Python script evaluated for syntactical correctness and the corresponding console output; (ii) a short Python console input evaluated for syntactical correctness and the corresponding output; (iii) a manual analysis step of a netlist component (e.g., selection of a gate or net), together with a unique identifier for the component; or (iv) an indicator for (in)activity phases and their duration. In sum, the log files of the nine participants contained 2,445 different events (1,141 executed scripts, 68 console inputs, 1,217 manual netlist interactions, and 19 idle events). Each participant’s log file was pre-processed in order to display the events together with their associated information in tabular form. An excerpt of such a pre-processed log file is shown in Table
1.
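The structure of such a log file can be sketched as a small Python record type. The field names, event labels, and sample entries below are hypothetical and do not reflect HAL's actual log format; only the four event kinds and the reported totals are taken from the description above.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass
class LogEvent:
    timestamp: datetime
    kind: str    # "script", "console", "manual", or "idle" (hypothetical labels)
    detail: str  # script text, console input, component identifier, or idle duration


def count_by_kind(events: Iterable[LogEvent]) -> Counter:
    """Tally how many events of each kind a log file contains."""
    return Counter(event.kind for event in events)


# Hypothetical excerpt of one participant's log.
sample = [
    LogEvent(datetime(2019, 7, 1, 10, 0, 0), "manual", "gate 42 selected"),
    LogEvent(datetime(2019, 7, 1, 10, 0, 30), "script", "for g in gates: ..."),
    LogEvent(datetime(2019, 7, 1, 10, 2, 0), "manual", "net 7 selected"),
]
print(count_by_kind(sample))

# The reported per-kind totals add up to the reported overall number of events.
reported = {"script": 1141, "console": 68, "manual": 1217, "idle": 19}
assert sum(reported.values()) == 2445
```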
3.4 Data Analysis Methods
Iterative Open Coding based upon GT. We qualitatively analyzed each event recorded for each participant by applying an iterative open coding scheme based upon GT methodology [
55]. Grounded Theory (GT) is a well-established and standard research method from the social sciences and is often applied in research areas in which no previous theories or models exist [
12]. As prior research on cognitive processes in HRE is lacking, we decided to apply GT as it provides explicit guidelines which facilitate the analysis of problem-solving processes.
A detailed description of the iterative coding procedure we applied is as follows. First, we grouped one or several consecutive events into \(1{,}232\) segments so that related events could be annotated together. These groups mostly consisted of manual netlist interactions and only rarely consecutive console inputs or executed scripts. To create a basis for the assignment of open codes describing HRE problem-solving processes and strategies, we annotated each segment with the following:
(1)
A description of the observed problem-solving step (e.g., “The executed script iterates over all netlist components and checks if their name contains the string LUT.”, or “Manual selection of a netlist component implementing a watermark.”)
(2)
The duration of the problem-solving step in seconds.
(3)
A description of any changes compared to the previous step (e.g., “The participant added three lines of code containing one print statement, one if-clause, and the reversing-specific Python function get_data_by_key().”)
(4)
An explanation of the observed behavior (e.g., “The watermarking is implemented by LUTs. Therefore, the participant filters the netlist components for LUTs.”)
We subsequently assigned one or several open codes (e.g., script-based inspection of watermark candidates, or successful correction of syntactical errors) capturing the most relevant annotations to each segment. Two of the authors began the open coding process by collaboratively annotating and encoding the log files of four participants to create an initial code book, making updates and re-coding previous segments as necessary. Using this initial code book, a third author annotated and encoded the log files of the remaining five participants independently. After completing this process of annotating and encoding each participant’s log file, we discussed the results, incorporated any new codes into the code book, and retroactively applied any such codes where applicable. The final code book contains 103 different unique codes, which are a condensed version of the segment annotations and reflect the observed HRE processes of our participants. These 103 unique open codes were assigned 1,887 times across the 1,232 annotated segments. After completing the iterative open coding for all participants, an external researcher independently coded 71 randomly selected segments from three participants on the basis of the existing code book with an acceptable inter-coder reliability of \(84.5\)%. Two of the coders had relevant backgrounds in HRE and a significant level of domain-specific prior knowledge in, for example, Boolean Algebra, chip architecture, hardware watermarks, and the handling of HAL, as well as scripting in Python. The third coder had relevant knowledge of cognitive psychology and qualitative data analysis, such as applying an iterative open coding procedure to analyze qualitative data and to derive a theoretical model from the data.
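The text does not state how inter-coder reliability was computed. Assuming simple percent agreement with one decision per segment, 84.5% of 71 segments would correspond to 60 matching segments. The following minimal Python sketch illustrates this assumed calculation; the code labels are placeholders.

```python
from typing import Sequence


def percent_agreement(codes_a: Sequence[str], codes_b: Sequence[str]) -> float:
    """Share of segments (in percent) to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must rate the same non-empty set of segments")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Placeholder data: 71 segments, of which 60 received identical codes.
coder_a = ["code_x"] * 71
coder_b = ["code_x"] * 60 + ["code_y"] * 11
print(round(percent_agreement(coder_a, coder_b), 1))  # 84.5
```

Percent agreement does not correct for chance agreement; a chance-corrected coefficient would require the full coding data.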
Taxonomy of Open Codes. Using the final code book as a starting point, we developed a taxonomy of observed HRE problem-solving processes during several rounds of discussions organized for that purpose. To do this, we first grouped related open codes into problem-solving clusters based on similarities. These clusters were then subordinated into nine principal actions representing the problem-solving processes exhibited by participants during the study (RQ1), thereby establishing a foundation for the later analysis of problem-solving strategies (RQ2). Eight of the nine principal actions could again be organized into two main categories, allowing HRE problem-solving processes to be subdivided into either programming-related or reversing-specific actions.
Once completed, this taxonomy enabled us to describe observed problem-solving processes and to highlight differences between participants with different levels of expertise (see Section
4).
Total Solution Time. We computed the total solution time, which is an important metric for performance analyses in psychological studies. In the particular setting of our study, participants were asked to solve a single HRE problem-solving task. While industry experts are likely to develop strategies that transfer to other HRE tasks, our participants were not specifically requested to develop generalizable solutions. We will return to this aspect in Section
5.4. As such, time is a central metric for assessing the efficiency of HRE and can provide important insights into the efficiency of participants’ solutions when combined with our in-depth qualitative analysis of applied problem-solving strategies.
Solution times were calculated per participant using the time stamps in the automatically generated log files. Idle events, which could be clearly identified as breaks unrelated to the task based on their annotations and wall-clock time (i.e., the time of day at which they occurred), were excluded from the total solution time. As the participants were already preselected based on their fully correct solutions, we analyzed total solution time to determine differences in the efficiency of observed problem-solving strategies (RQ2).
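A minimal Python sketch of this calculation follows. It assumes that the total solution time is the span between the first and last logged event minus the duration of all idle events classified as breaks; the function name and the sample values are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Sequence


def total_solution_time(timestamps: Sequence[datetime],
                        breaks: Sequence[timedelta]) -> timedelta:
    """Elapsed time between first and last event, minus excluded breaks."""
    if not timestamps:
        return timedelta(0)
    elapsed = max(timestamps) - min(timestamps)
    return elapsed - sum(breaks, timedelta(0))


# Hypothetical session: work from 10:00 to 12:30 with one 45-minute break.
stamps = [datetime(2019, 7, 1, 10, 0), datetime(2019, 7, 1, 11, 0),
          datetime(2019, 7, 1, 12, 30)]
pauses = [timedelta(minutes=45)]
print(total_solution_time(stamps, pauses))  # 1:45:00
```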
Incorporating Open Code Duration. Since the mere number of open codes assigned does not alone provide a granular metric with which to describe the problem-solving strategies of a reverse engineer (RQ2), we supplemented it through the incorporation of participant solution time. To do so, we first split the total solution time of participants into 50 time windows of 2% each.
We then identified which open code(s) were assigned in each time window, thereby creating a fine-grained measure of the relative amount of time each participant spent on single actions (as described by the respective open codes). This metric has the added benefit of providing information about which open codes were assigned towards the beginning, middle, or end of a participant’s time-on-task.
The incorporation of participants’ time spent on assigned open codes, in summary, enables the analysis of their applied problem-solving strategies to answer RQ2.
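The windowing step can be sketched in Python as follows. The sketch assumes that each coded segment has a start and end time in seconds of active solution time and that a code counts for every window the segment overlaps; these details, the function name, and the sample data are assumptions rather than the exact procedure used in the study.

```python
import math
from typing import List, Sequence, Set, Tuple

Segment = Tuple[float, float, Set[str]]  # (start, end, assigned open codes)


def codes_per_window(segments: Sequence[Segment], total_time: float,
                     n_windows: int = 50) -> List[Set[str]]:
    """Collect, for each equally sized time window, the open codes active in it."""
    width = total_time / n_windows  # 2% of the total solution time by default
    windows: List[Set[str]] = [set() for _ in range(n_windows)]
    for start, end, codes in segments:
        first = min(int(start // width), n_windows - 1)
        # Treat segments as half-open intervals so that a segment ending
        # exactly on a window boundary does not spill into the next window.
        last = min(max(first, math.ceil(end / width) - 1), n_windows - 1)
        for index in range(first, last + 1):
            windows[index] |= codes
    return windows


# Hypothetical participant with a total solution time of 1,000 seconds.
segments = [(0, 30, {"A"}), (30, 40, {"B"}), (990, 1000, {"C"})]
windows = codes_per_window(segments, total_time=1000)
windows_with_a = sum("A" in w for w in windows)
print(windows_with_a, "of", len(windows), "windows contain code A")
```

Counting the windows in which a code appears yields its relative share of the time-on-task, and the window index indicates whether it occurred early or late.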
3.5 Summary of Study Procedures
At the beginning of the 14-week HRE training, participants signed the informed consent document and were assigned a randomly chosen pseudonym. Subsequently, all participants were asked to answer the questionnaires on socio-demographics via an online survey provider. After finishing the HRE training, participants received the HRE problem-solving task materials consisting of a short task description, the watermarked netlist, and a copy of the paper in which the implemented watermark was first proposed.
Participants were provided a two-week window in which to complete the task and were free to choose their time and place of work as well as whether they wanted to work through the problem in one sitting or in smaller steps with breaks in between. Participants were required to work on their submissions individually, which was clearly communicated to them at the beginning of both the HRE training and the main study. Using the submissions and the log files, we checked for plagiarism and copied solutions and did not detect any signs of either. Once finished, participants uploaded their log files to a server located at the university and received the stipulated monetary compensation for their participation in the study.
4 Results
In the following sections, we present the results of our analysis as they relate to the three research questions introduced in Section
2.4. In Section
4.1, we use the detailed behavioral log files of the nine participants who completed the HRE problem-solving task central to the study to create a model of HRE problem-solving processes. In Section
4.3, we build upon this foundation to examine differences between the one expert and eight intermediate problem-solving processes. Finally, we describe differences between the problem-solving strategies applied by the participants in regard to time efficiency in Section
4.5. Each of these main sections (Sections
4.1,
4.3, and
4.5) is followed by a brief discussion of the research questions to which they correspond (Sections
4.2,
4.4, and
4.6).
4.1 RQ1a: Which Problem-solving Processes Can be Observed while Participants are Solving a Realistic HRE Task?
Following the methodology described in Section
3.4, we systematically analyzed and categorized the log files submitted by eight intermediates and one expert after completing the realistic HRE task presented to them in the study. We grouped the 103 unique codes into nine principal actions, enabling us to describe HRE problem-solving processes in detail and to create a model of HRE problem solving, which is presented in Figure
2 below. According to our analysis, eight of the nine principal actions could be assigned to one of the two main HRE problem-solving categories:
Reversing Actions, and
Code Development. These two main categories clearly differentiate actions which directly influence the success of problem-solving processes from actions pertaining to the development of programming code, which indirectly influences problem-solving processes. The ninth and final principal action,
External Influences, encompasses codes that describe exogenous influences upon HRE problem-solving processes such as
external interruptions of the analyst.
Codes from all nine principal actions were assigned to the events contained in the log files of eight participants; we did not observe any External Influences in the logs of the ninth participant. Table 2 provides a per-participant overview of assigned open codes aggregated at the principal-actions level. A detailed view of all open codes and the frequency with which they were assigned to each participant can be obtained from Appendix C. The HRE problem-solving model we developed is hierarchical and consists of four levels, progressing from general to specific: categories are represented at the highest level, principal actions at the second level, followed by clusters and finally by open codes at levels three and four, respectively. The number of unique open codes specifies how many different open codes a principal action or cluster contains, whereas the number of assigned open codes represents how often unique codes were assigned to participants’ actions.
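The distinction between unique and assigned open codes can be illustrated with a small sketch. It is not the tooling used in the study: the `OpenCode` class, the `count_codes` helper, the participants "A" and "B", and the annotated events are hypothetical, and the cluster memberships shown for "dead end" and "general debugging" are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenCode:
    """One open code with its position in the four-level hierarchy."""
    category: str          # level 1
    principal_action: str  # level 2
    cluster: str           # level 3 (partly assumed, see lead-in)
    name: str              # level 4


LOST_TRACK = OpenCode("Reversing Actions", "Reversing Problems",
                      "Confusion", "lost track of the reversing approach")
DEAD_END = OpenCode("Reversing Actions", "Reversing Problems",
                    "Failed Attempts", "dead end")
DEBUGGING = OpenCode("Code Development", "Troubleshooting",
                     "Error Search and Correction Attempts", "general debugging")

# Hypothetical annotated log events: (participant, open code).
events = [
    ("A", LOST_TRACK), ("A", LOST_TRACK), ("A", DEAD_END),
    ("B", LOST_TRACK), ("B", DEBUGGING), ("A", DEBUGGING),
]


def count_codes(events, level):
    """Return {group: (unique open codes, assigned open codes)} at one level."""
    unique = defaultdict(set)
    assigned = Counter()
    for _participant, code in events:
        group = getattr(code, level)
        unique[group].add(code.name)   # each distinct code counted once
        assigned[group] += 1           # every assignment counted
    return {group: (len(names), assigned[group])
            for group, names in unique.items()}


by_action = count_codes(events, "principal_action")
by_cluster = count_codes(events, "cluster")
print(by_action)
print(by_cluster)
```

In this toy data, Reversing Problems contains two unique open codes that were assigned four times in total, which mirrors how the counts in parentheses throughout this section should be read.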
4.1.1 Main Category: Reversing Actions.
The main category Reversing Actions consists of 63 unique codes which were annotated and assigned 543 times to actions undertaken by the study participants. At the next level of our hierarchical model, we identified these codes as belonging to one of four principal actions: Inspection and Information Gathering, Reversing Strategy Decisions, Reversing Milestones and Sub-Goals, and Reversing Problems (see Table 2). Where helpful, we added further structure below the principal-actions level.
The first principal action, Inspection and Information Gathering, encompasses all of the steps that participants took to retrieve general information about the netlist and its components (e.g., Exploration and Identification) as well as detailed information about (crucial) netlist components (e.g., Inspection). Codes pertaining to Inspection were more frequent than codes pertaining to Exploration or Identification. Inspection and Information Gathering steps were conducted predominantly manually (15 unique codes, assigned 135 times) rather than through scripts (15 unique codes, assigned 25 times). The most prevalent open codes in this principal action included in-depth manual inspection of watermark candidates (assigned 39 times), manual selection of irrelevant gates (assigned 14 times), and manual netlist exploration (assigned 13 times).
The second principal action, Reversing Strategy Decisions, contains all actions taken or decisions made by the participants with respect to the (sub)problems of the task. We further organized the open codes in this principal action into specific Strategies and Approaches (14 unique codes, assigned 135 times), observed Change of Strategy (3 unique codes, assigned 14 times), and detailed Sub-Step Preparation (2 unique codes, assigned 59 times). The two codes in the Sub-Step Preparation grouping were assigned with equal frequency and were the most commonly assigned codes in the principal action Reversing Strategy Decisions. The next most frequently assigned codes were duplication of partial solutions for watermark candidates (17 times) and reversion to a proven approach (10 times).
The third principal action, Reversing Milestones and Sub-Goals, includes codes describing actions through which progress in solving the HRE task was achieved (e.g., identification of watermark candidates or removal of the watermark). We structured this principal action hierarchically in descending order of significance, with Achieving Milestones (4 unique codes, assigned 54 times) above Achieving Sub-Goals (3 unique codes, assigned 21 times), followed by Systematic Approach to Considering Milestones (3 unique codes, assigned 43 times).
The fourth and final principal action, Reversing Problems, represents quite the opposite of the previous principal action. The codes in this principal action describe actions which were error-prone and may be signs of the reversing process becoming bogged down. We further subdivided this principal action into clusters including open codes related to Confusion (3 unique codes, assigned 16 times), Failed Attempts (2 unique codes, assigned 12 times), and Lack of Understanding (3 unique codes, assigned 22 times). The most common codes in Reversing Problems include reversing-specific lack of understanding (assigned 11 times), lost track of the reversing approach (assigned 9 times), or dead end (assigned 8 times).
4.1.2 Main Category: Code Development.
The second main category, Code Development, consists of four principal actions: Error Introduction, Troubleshooting, Test and Validation, and Code Adjustments, all of which describe programming-related actions. Every HRE problem requires the development of programming code or customized scripts, and Code Development should therefore be considered in the analysis of HRE problem-solving processes. However, many of the open codes to which this category is home describe problems that can occur in contexts other than HRE.
The first principal action, Error Introduction, captures the Semantic (2 unique codes, assigned 98 times) and Syntactical errors (2 unique codes, assigned 115 times) that can be introduced into the Code Development process. Open codes of the second principal action, Troubleshooting, describe Error Search and Correction Attempts (5 unique codes, assigned 136 times) and Error Correction (5 unique codes, assigned 269 times). Examples of such open codes include successful correction of syntactical errors (assigned 160 times), and general debugging (assigned 57 times).
The third principal action, Test and Validation, represents the Focused (5 unique codes, assigned 181 times) and General (2 unique codes, assigned 51 times) program code testing and validation methods observed in the study and includes open codes such as targeted verification (58 times) or manual netlist inspection for script validation (38 times).
Code Adjustments, the final principal action, contains open codes related to participants’ efforts to restructure or simplify scripts or to document solutions. We consolidated and segmented such open codes into the clusters Creating Clarity (6 unique codes, assigned 268 times), Cut, Copy, and Paste (5 unique codes, assigned 67 times), and Documentation (3 unique codes, assigned 76 times). Common examples of open codes in this principal action include improve clarity of console output (109 times), reversion to previous code components (38 times), and explanatory documentation (31 times).
4.1.3 Principal Action: External Influences.
In addition to the principal actions grouped into the main categories Reversing Actions and Code Development, we also observed several External Influences (4 unique codes, assigned 74 times) in the participants’ logs. Those include open codes such as external interruption (assigned 35 times) or unintentional manual selection (assigned 16 times). Although such External Influences are not unique or exclusive to HRE, they undoubtedly occur in real-world settings similar to that presented in the study, and as such, can materially impact the efficiency and effectiveness of the problem-solving strategies that are the focus of our research.
4.2 Discussion of RQ1a
As introduced in Section 3.4 and expanded upon in the preceding section, we applied an iterative open coding approach based upon GT [55] and systematically gathered and analyzed behavioral log files from nine participants tasked with solving a realistic HRE problem in order to develop a detailed model of HRE problem-solving processes. The model consists of two main categories and nine principal actions which were developed based on 103 unique open codes. While we do not claim the model to be exhaustive (new open codes may emerge through the analysis of different tasks), we believe that it covers all of the essential actions hardware reverse engineers conduct in order to solve HRE tasks.
The problem-solving model has a hierarchical structure, ranging from more task-specific layers (open codes) to more general layers (clusters and principal actions). This structure was developed by the three coders in several rounds of discussion based on the final open codes after the coding process was completed. Accordingly, we have moved from data to the HRE problem-solving model, or in other words, from the specific to the general. Therefore, data at the open-code layer is sometimes task-specific, i.e., specific to removing a watermark from a Field Programmable Gate Array (FPGA) netlist (especially in the two principal actions Inspection and Information Gathering and Reversing Milestones and Sub-Goals). However, by applying an inductive analysis, we transferred the specific context of the open codes through the clusters to more general principal actions. Due to the exploratory nature of this work, we cannot draw definitive conclusions about the generalizability of our HRE problem-solving model. We consider our developed model as the first important step in describing problem-solving processes in HRE. However, this model still needs to be validated and possibly extended by future research with different realistic HRE tasks.
The HRE framework HAL is used by a growing number of researchers and professionals and provides a simple integrated development environment for script-based netlist analysis in Python, a path also taken by well-known tools from the field of SRE (e.g., IDA Pro). Code development skills, such as programming skills in Python, are general abilities that are applied in a variety of contexts. In the case of HRE, however, they are needed to analyze an unknown netlist: without programming skills, HRE would be very cumbersome or even impossible. Thus, the actions assigned to Code Development are also likely to be relevant in the context of HRE tasks.
The resulting HRE model allows us to conceptualize the observed HRE processes. This conceptualization in turn provides us with a basis, a language so to speak, with which we can describe the problem-solving strategies of the HRE process. The above-presented model, which divides reversing-specific HRE problem-solving processes into four principal actions with additional granularity at the cluster level, provides an in-depth perspective as we explore fundamental expertise-related differences in participant problem-solving processes in our discussion and analysis in RQ1b. Two of these principal actions, Reversing Problems and Reversing Strategy Decisions, are central to the HRE problem-solving processes of our participants: the former indicates obstacles encountered while solving the task, and the latter represents the variety of approaches applied to solve it. To identify and explore differences in the time efficiency of participants’ problem-solving strategies, these principal actions must be viewed in context with time-on-task as well as other actions which may have led to certain Reversing Problems or Reversing Strategy Decisions. We delve deeper into these principal actions and present this discussion of time efficiency in RQ2.
4.3 RQ1b: Are there Differences in the HRE Problem-solving Process between Participants with Different Levels of Expertise?
In order to answer RQ1b, we searched for qualitative differences on an open-code level between the problem-solving processes of the eight intermediates and the one expert. An excerpt of a visualization of this comparison is shown in Figure 3. The left-hand side depicts three open codes which were only observed in the expert’s process, all stemming from the main category Reversing Actions. The right-hand side similarly shows the five most common codes assigned only to actions undertaken by the intermediates. Of these, three again stem from the Reversing Actions category, with the other two being related to External Influences. The middle of Figure 3 shows the five most common codes, all related to Code Development, assigned to both the expert and intermediates. In total, there are 52 unique codes which were only assigned to intermediates’ problem-solving processes. Of these 52 codes, only 10 codes fall into the Code Development category, whereas 39 are related to the intermediates’ Reversing Actions.
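The three-way split described above (expert only, shared, intermediates only) amounts to a set comparison over the open codes assigned to each participant. The following is a minimal sketch under that assumption; the helper `compare_code_sets`, the participant keys, and the toy assignments are hypothetical and do not reproduce the study data.

```python
def compare_code_sets(codes_by_participant, expert_id):
    """Split open codes into expert-only, shared, and intermediates-only."""
    expert = set(codes_by_participant[expert_id])
    intermediates = set()
    for participant, codes in codes_by_participant.items():
        if participant != expert_id:
            intermediates |= set(codes)  # union over all intermediates
    return {
        "expert_only": expert - intermediates,
        "shared": expert & intermediates,
        "intermediates_only": intermediates - expert,
    }


# Hypothetical assignments; the code names are borrowed from this section,
# the expert-only entry is a placeholder.
toy = {
    "expert": {"general debugging", "expert-only code (placeholder)"},
    "P_a": {"general debugging", "introduction of redundant code"},
    "P_b": {"anticipatory documentation", "introduction of redundant code"},
}

split = compare_code_sets(toy, "expert")
for group, codes in split.items():
    print(group, sorted(codes))
```

A code counts as "intermediates only" as soon as at least one intermediate, and not the expert, was assigned it; it need not occur in every intermediate's log.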
Open codes indicating a lack of experience such as introduction of repeated semantic errors (assigned 11 times) and introduction of redundant code (assigned 6 times) reflect major differences in intermediates’ Code Development. In this context, intermediates also inserted anticipatory documentation (assigned 8 times) to help keep better track of their work-in-progress. The expert and the intermediates otherwise had the majority of unique open codes related to Code Development in common.
Differences with regard to Reversing Actions were more marked, with the majority of unique open codes being assigned only to the problem-solving processes of the intermediates. Intermediates showed unique processes in the principal action Inspection and Information Gathering as reflected by several open codes related to manual information gathering, whereas two of the three open codes exclusively assigned to the expert are related to script-based information gathering. In the principal action Reversing Strategy Decisions, intermediates showed several diverse and unique processes as indicated by 13 exclusively assigned codes. Notably, seven out of eight unique codes from the Reversing Problems principal action were assigned only to intermediates. Seven of the eleven open codes from the principal action Reversing Milestones and Sub-Goals were assigned to both the expert and the intermediates, whereas four codes were assigned to intermediates exclusively.
4.4 Discussion of RQ1b
In the preceding section, we compared the HRE problem-solving process of one expert and eight intermediates at the open-code level.
Our analysis shows that major differences in the problem-solving processes of the two groups exist within the Reversing Actions category, primarily concentrated in the principal actions Inspection and Information Gathering, Reversing Strategy Decisions, and Reversing Problems. While many of the unique open codes from those principal actions were only assigned to intermediates, it is plausible that each analyst only executed a subset of a broader universe of reversing-related actions available to them. The actions enumerated in the three aforementioned principal actions will allow us to develop a precise picture of strategies employed during the observed attack, and will therefore remain in focus as we analyze the efficiency of HRE problem-solving strategies in RQ2. As we tackle the topic of efficiency, we will expand our analysis beyond the number of assigned codes to include a fine-grained metric based on time (see Section 3.4).
Intermediates tended to be interrupted by External Influences as evidenced by the frequency with which they had to re-enter and re-orient themselves after experiencing external interruption. While we cannot draw a generally applicable conclusion from this observation, it is conceivable that the expert was better able to focus on the task than the intermediates.
The problem-solving processes of the expert and the intermediates in regard to Code Development differed only on the margins, with the expert exhibiting a higher level of programming experience—the many open codes assigned to both groups otherwise indicate a general similarity of process.
The question at the heart of the following analysis is twofold: Was the expert able to solve the HRE problem-solving task more quickly and in a more efficient way than the intermediates and are there differences in the applied problem-solving strategies?
4.5 RQ2: How do the Participants’ Strategies Compare to Each Other?
Because a hardware reverse engineer will ultimately always succeed if given enough resources, efficiency in this field is defined as a function of time. This point was also borne out in the results of our study: all of the participants correctly completed the HRE task (as indicated by their high solution probabilities), but with solution times ranging from 163 to 528 minutes (see Table 3).
In Section 4.3, we established that the differences between the expert and the intermediates are most concentrated in the principal actions Inspection and Information Gathering, Reversing Strategy Decisions, and Reversing Problems. Thus, we used the open codes assigned to these principal actions to form the basis of our evaluation of the efficiency of the participants’ problem-solving strategies.
The absolute number of open codes assigned is broadly descriptive of the steps of which problem-solving processes consist but is an otherwise coarse metric. Folding in the actual time participants spent on the assigned open codes (see Section 3.4) added granularity to our analysis through the consideration of the relative time each participant spent on actions related to open codes within the principal actions Inspection and Information Gathering, Reversing Strategy Decisions, and Reversing Problems in Table 3. This table provides an overview of the most prevalent actions taken in each of the participants’ problem-solving strategies. It also shows the time-intensity of those actions as well as the point in the timeline of the attack at which they were taken. It therefore serves as a reference which complements the case-by-case descriptions of problem-solving strategies that follow. We proceed according to total solution time in ascending order, starting with the fastest participant.
Expert: Sub-steps and Test Cases. The expert achieved a solution in 163 minutes, the fastest time of all participants. The expert’s Reversing Strategy Decisions revolved around the (small-step) preparation of reversing sub-steps, indicating an ability to sub-divide the task into several isolated sub-problems which could then be overcome individually. The continuous development of test cases that helped the expert to test solutions to sub-problems in a practical manner further supports this conclusion. From the outset, the expert applied both manual and script-based Inspection and Information Gathering actions, demonstrating the ability to quickly translate manual methods into scripts, thereby automating information gathering from the beginning. Of note is that, after encountering a problem (lost track of the reversing approach) in the middle of the reversing attack, the expert reverted to manual selections. These manual selections seemed to be an important anchor which helped to overcome the reversing problem in just seven minutes.
P8: External Resources, Reversion, and Duplication. Participant 8 (P8) required 176 minutes to solve the HRE task, which is very close to the expert’s solution time. P8’s problem-solving strategy was characterized by three overarching Reversing Strategy Decisions: using external resources, reversion to a proven approach, and the duplication of partial solutions. Especially at the beginning and in the middle of the attack, P8 applied knowledge and skills acquired from previous HRE training tasks (proven approaches; e.g., reversing methods to identify components of interest). Throughout the attack, P8 continuously referred to external resources such as the provided coding guide, pen-and-paper analyses, or online resources. In the further course of the attack, P8 divided the HRE task into several similar sub-tasks. P8 then solved one of those sub-tasks and adapted the solution to the other open sub-tasks (as reflected by the open code duplication of partial solutions). With respect to Inspection and Information Gathering, P8 performed only manual actions. While most of those actions took place during the beginning of the attack, in-depth manual inspection of watermark candidates was observed throughout the task. Moreover, P8 only showed very short-lasting Confusion (lost track of the reversing approach; lasting for 7 minutes), which was mainly caused by several simultaneous Code Development actions.
P3: External Resources and Sub-steps. At 187 minutes, Participant 3 (P3) was also very close to the expert’s solution time. In terms of Reversing Strategy Decisions, P3 often engaged in (small-step) preparation of reversing sub-steps during the first half of the attack; that is, P3 was able to divide the larger problem into smaller sub-problems. At the beginning of the task, P3 used external resources and applied script-based methods for Inspection and Information Gathering. This bore fruit as the attack progressed, as P3 conducted only a few manual information-gathering actions toward the middle of the task and did not need to gather any additional information in the second half, having already automated essential information gathering in the beginning. Despite the short overall solution time, we observed a very large number of Reversing Problems, which were present for 75 minutes in P3’s problem-solving process. These problems arose from a Lack of Understanding of two basic HRE concepts which had to be resolved before correctly completing the task.
P5: Sub-steps, Generic Approach, and Test Cases. Participant 5 (P5) required 221 minutes to solve the HRE task, about an hour longer than the expert. P5 consulted external resources and applied script-based Inspection and Information Gathering methods at the beginning of the attack. A dominant strategy that could be observed upon review of P5’s Reversing Strategy Decisions was the evolution of a generic approach stemming from the selection of test candidates and development of test cases. A generic approach is characterized by its re-usability and adaptability, e.g., for similar tasks on other netlists. The other overarching strategy observed in P5’s problem-solving process was the (small-step) preparation of reversing sub-steps. Although similar development of test cases and sub-step preparation was observed in the problem-solving processes of other participants, a generic approach leading to re-usable solutions was unique to P5. Despite us identifying two dead ends as well as a few strategy changes for script-based analyses, P5 was able to work through the observed Lack of Understanding in a relatively short amount of time (12 minutes).
P2: Sub-steps and External Resources. During P2’s attack, which lasted 221 minutes, two overarching Reversing Strategy Decisions could be observed: small-step preparation of reversing sub-steps and using external resources. The several manual and script-based Inspection and Information Gathering actions which P2 took appeared only at the beginning of the attack. In the middle of the attack, P2 showed and resolved a Lack of Understanding, which accounted for roughly 35 minutes of the total solution time.
P4: External Resources. The attack of Participant 4 (P4) lasted 233 minutes. While using external resources could be identified as the overarching Reversing Strategy Decision especially in the middle and end of P4’s attack, other strategies, including the development of test cases, reversion to a proven approach, and preparation of a reversing sub-step were also observed. P4 applied several manual Inspection and Information Gathering actions toward the end of the attack, whereas no script-based information-gathering methods could be observed. In P4’s problem-solving process, several Reversing Problems totaling 65 minutes were annotated. These problems occurred mainly at the beginning and during the middle of the attack and encompassed Failed Attempts such as the unsuccessful transfer of an already known approach to a current problem and Confusion leading to an instance of lost track of the reversing approach.
P6: External Resources and Sub-steps. Participant 6 (P6) required a total of 261 minutes to reach a solution. The overarching Reversing Strategy Decisions of P6 were using external resources and small-step preparation of a reversing sub-step. In addition to these strategies, P6 also reverted to a proven approach and applied a generic approach towards the end of the attack. With respect to Inspection and Information Gathering, P6 applied several manual and script-based techniques at the beginning and during the middle of the attack. P6 spent a total of 78 minutes on Reversing Problems, including several attempts to resolve observed Lack of Understanding.
P7: Fully Manual and Hardcoding Approach. Participant 7 (P7) was one of the slowest reverse engineers (351 minutes). P7’s strategy at the beginning of the attack was based on using external resources. As time progressed, P7 made several unique Reversing Strategy Decisions, first through a strategy change from script-based to manual analysis and then later to a fully manual, hardcoding approach during the middle of the attack, which resulted in duplication of partial solutions. The fully manual and hardcoding approach consisted of encoding fixed values into the solution, which had been read out through manual analyses. This approach was also comparable to the participant’s Inspection and Information Gathering processes, which consisted of in-depth manual inspection of watermark candidates. During the attack, P7 encountered a relatively small number of Reversing Problems (Lack of Understanding and Failed Attempts), and these were resolved comparatively quickly (35 minutes).
P1: External Resources and Reversing Problem Shooting. Participant 1 (P1) was the slowest overall participant with a total solution time of 528 minutes. The dominant Reversing Strategy Decisions of P1 consist of using external resources and reversion to a proven approach. Notably, during the second half of the attack, P1 failed to recognize that a correct solution of a sub-problem had been reached, and inappropriately changed reversing strategies. No subsequent Reversing Strategy Decisions were observed. Reversing Problems cost P1 a total of 167 minutes, the highest overall time of all participants. P1’s initial Inspection and Information Gathering approach was based on manual actions and only very few further manual information gathering actions could be observed during the subsequent course of the attack.
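The two time-based quantities used in the case descriptions above, the relative time spent on a principal action and the point in the attack at which an action occurred, can be sketched as follows. This is an illustration only: the event tuples are hypothetical, and splitting the attack into equal thirds for "beginning", "middle", and "end" is an assumption rather than the procedure defined in Section 3.4.

```python
def relative_time(events, total_minutes):
    """Share of total solution time per principal action.

    events: iterable of (start_min, end_min, principal_action).
    """
    minutes = {}
    for start, end, action in events:
        minutes[action] = minutes.get(action, 0) + (end - start)
    return {action: spent / total_minutes for action, spent in minutes.items()}


def phase(minute, total_minutes):
    """Map a point in time to a phase (equal thirds, an assumption)."""
    third = total_minutes / 3
    if minute < third:
        return "beginning"
    if minute < 2 * third:
        return "middle"
    return "end"


# Hypothetical log for a 163-minute attack with one 7-minute reversing problem.
TOTAL = 163
log = [
    (5, 20, "Inspection and Information Gathering"),
    (20, 30, "Reversing Strategy Decisions"),
    (80, 87, "Reversing Problems"),
]

shares = relative_time(log, TOTAL)
for start, _end, action in log:
    print(f"{action}: {shares[action]:.1%} of total time, "
          f"observed in the {phase(start, TOTAL)}")
```

Normalizing by each participant's own solution time keeps the shares comparable across participants whose attacks differed in length by a factor of more than three.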
In summary, our results in RQ2 revealed differences in the problem solver’s time efficiency and applied problem-solving strategies. These differences extended beyond total solution time to encompass the set of strategies applied as well as the (in)ability to solve Reversing Problems quickly. We discuss these findings in the following.
4.6 Discussion of RQ2
The case-by-case analysis of RQ2 revealed similarities and differences between our participants, which are discussed in greater detail below. The open code (small-step) preparation of reversing sub-steps was predominant throughout the expert’s entire problem-solving process. Besides the expert, P5 and P2 also strongly relied on small-step preparation during their respective HRE problem-solving processes. We also identified that P3, P4, and P6 included small-step preparation in their approaches. Thus, we assume that most of the participants divided the main HRE task into several smaller sub-tasks which they solved in a step-by-step manner.
Furthermore, our data suggest that the open code development of test cases was also predominant in the problem solving of the HRE expert. The development of the solution based on test cases could only be identified in two other participants’ strategies: P5 and P4 (both at the beginning and in the middle of the HRE process). We assume that development with test cases is a more specific open code that only appears in the problem solving of reverse engineers who may have a strong background in software development, as test case generation is common practice there.
Our results showed that all intermediates except P5 used external resources. Relying on external resources (e.g., provided coding guide, pen-and-paper analyses, or online resources) to develop reversing strategies may compensate for a lack of skills and knowledge, and obtaining reassurance seemed to be an efficient approach for most intermediates. We also observed that most intermediates chose proven approaches (i.e., solutions from previous working steps) in order to save time or to reduce the effort of developing new solution strategies.
Our data suggest individual solution approaches. Intermediates P5 and P7 settled upon unique approaches, the former leading to a more and the latter to a less efficient solution. P5’s generic approach led to a solution that would have been useful for completing potential future reversing tasks but was a relatively inefficient way to complete this single and specific problem-solving task. Although P7’s fully manual approach was beset by a relatively small number of Reversing Problems, this came at the expense of a long completion time. In other words, step-by-step manual analysis and subsequent processing of manually identified components in the script was mostly accurate, but also very time-consuming.
Overall, we found that the open code strategy change was not prevalent in participants’ problem solving. Only P5 and P7 had to change their strategy during the reversing process. Hence, we assume that most of the participants had a clear plan on how they wanted to solve the HRE task and therefore were not forced to change their strategy.
In terms of Inspection and Information Gathering, we found that manual netlist interactions (e.g., manual selection of netlist components) occurred in each participant’s problem solving—but to varying degrees. Whereas P5, P7, and P8 strongly focused on manual approaches for netlist inspection and information gathering, the expert, P1, P2, P3, P4, and P6 applied manual approaches only occasionally. Furthermore, we found that only four participants (expert, P2, P3, and P6) applied script-based inspection and information gathering. We assume that analysts preferred manual netlist exploration via the graphical representation of the netlist in HAL over script-based information gathering.
In general, our results showed that more Reversing Problems led to higher total solution times. The fastest two participants (the expert and P8) each needed only seven minutes to resolve their single reversing problem. P1, in contrast, experienced a large number of Reversing Problems, became confused, and stayed off track for a while, resulting in a long total solution time. We did, however, identify one exception in the Reversing Problems dataset: P3 was one of the fastest participants, but also demonstrated frequent episodes of reversing-related Lack of Understanding. Despite a high frequency of mistakes, P3 was able to achieve important Reversing Milestones, resolve Lack of Understanding, and abandon dead-end actions in the second half of the attack.
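The relationship between Reversing Problems and total solution time can be made concrete with the figures quoted in the case descriptions of Section 4.5. The sketch below is a rough illustration, not part of the study's analysis: it uses time spent on Reversing Problems rather than their number, takes the approximate values for P2 and P7 at face value, and uses for P5 the 12 minutes reported for Lack of Understanding only. With nine participants and one very slow outlier (P1), any correlation coefficient should be read with caution.

```python
from math import sqrt

# Total solution time in minutes, as reported in Section 4.5.
solution_minutes = {"Expert": 163, "P8": 176, "P3": 187, "P5": 221, "P2": 221,
                    "P4": 233, "P6": 261, "P7": 351, "P1": 528}
# Minutes attributed to Reversing Problems in the case descriptions
# (P5: Lack of Understanding only; P2 and P7: approximate values).
problem_minutes = {"Expert": 7, "P8": 7, "P3": 75, "P5": 12, "P2": 35,
                   "P4": 65, "P6": 78, "P7": 35, "P1": 167}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


participants = list(solution_minutes)
r = pearson([solution_minutes[p] for p in participants],
            [problem_minutes[p] for p in participants])

# Share of each participant's solution time spent on Reversing Problems.
share = {p: problem_minutes[p] / solution_minutes[p] for p in participants}

print(f"Pearson r = {r:.2f}")
for p in sorted(share, key=share.get, reverse=True):
    print(f"{p}: {share[p]:.0%}")
```

Sorting by share rather than by absolute minutes highlights the exception discussed above: P3 spent the largest fraction of the attack on Reversing Problems and was nevertheless among the fastest participants.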
Based on our previous assumptions and the results of RQ1, it would have been conceivable that the HRE expert was the only participant who could solve the task efficiently. Contrary to these expectations, the in-depth analysis we conducted pursuant to RQ2 revealed that two intermediates were able to achieve very efficient solutions by applying different problem-solving strategies than the expert. We discuss this finding in the next section.
In summary, the case-by-case comparison to answer RQ2 reveals that the participants had several similarities in their problem-solving strategies (e.g., manual Inspection and Information Gathering; small-step preparation of reversing sub-steps; using external resources). However, our analysis also shows that despite these observed overlaps, none of the nine solutions can be described as congruent with any other solution because participants chose different foci that resulted in different compositions of assigned open codes. In other words, participants’ sub-processes are comparable, but they occurred at different times of the reversing process (beginning, middle, end) and to different extents (less dominant vs. predominant), so that the overall strategies differed. Thus, our results suggest that there is no single optimal (“one and only” or “end all, be all”) problem-solving strategy for the most efficient solution to the HRE task at hand. Rather, our analysis showed that several participants achieved efficient solutions through the application of individually composed strategies.
5 General Discussion
5.1 HRE—A Type of Problem Solving that Builds on Expertise and Cognitive Abilities
Based on our findings, we make two main contributions. First, our results contribute to the ongoing debate in psychology on the role of cognitive abilities and prior domain-specific knowledge in problem-solving performance. Our findings suggest that both expertise and cognitive abilities may play a role in HRE, and that HRE tasks may involve aspects of both simple and complex problems. Second, our work contributes to the theoretical considerations of Lee and Johnson–Laird (2013) [43] and extends the understanding of problem solving in HRE.
Our study revealed that the HRE expert achieved the most time-efficient solution and applied a unique set of problem-solving strategies that could not be found in the problem solving of the intermediates. Through years of deliberate practice, experts build up a wealth of domain-specific knowledge and prior experience in solving domain-specific problems [
27]. Experts’ efficient problem solving is based on strong domain-specific knowledge that significantly influences the perception and categorization of problems [
16,
45]. This problem categorization and representation then lead to the selection of suitable problem-solving strategies enabling time-efficient solutions [
16,
19,
46]. Against this background, it is reasonable to assume that in accordance with the research referenced above, the HRE expert selected strategies based on strong domain-specific knowledge. This profound level of well-structured domain-specific knowledge may have supported the expert in categorizing the problem efficiently and in selecting suitable strategies to solve the tasks efficiently. Ericsson and Kintsch (1995) suggested that experts’ efficient problem solving was based on effective long-term working memory that enabled experts to efficiently activate prior domain-specific knowledge (e.g., stored procedures or chunks) [
26]. Thus, we assume that the time-efficient problem solving by the HRE expert may have been based on an efficient activation and retrieval of prior knowledge from long-term working memory.
Furthermore, our results showed that the two intermediates also achieved time-efficient solutions, although they had less domain-specific knowledge and less problem-solving experience than the HRE expert. We presume that these intermediates’ high performance may have been influenced by cognitive abilities enabling them to compensate for their lack of domain-specific knowledge and experience. In a prior work [
6], we showed that higher scores in the IQ sub-factor working memory may have helped reverse engineers solve the HRE task more quickly than analysts with lower working memory scores. We hypothesize that cognitive abilities such as working memory may explain how intermediates coped with gaps in HRE-specific knowledge and problem-solving experience. Our assumption aligns with the expert-performance approach that explains how novices and intermediates deal with gaps in domain-specific knowledge and problem-solving experience [
27]. The expert-performance approach postulates that domain-specific performance is correlated with general cognitive abilities for novices and intermediates, but not for individuals with higher levels of expertise and specific cognitive structures [
25]. Prior research supported the expert-performance approach by showing that superior performance of non-experts in chess was correlated with above-average IQs [
8,
30]. We hypothesize that cognitive information processing sub-systems such as the working memory [
4] may have helped the two high-performing intermediates in the current study to temporarily keep information in mind and to work on it, for example, by retaining and applying information gathered by inspection of netlist components. In particular, it is conceivable that the intermediates’ general speed of information processing, a central basis of human intelligence [
20], may have influenced their performance in the HRE task, allowing them to compensate for a lack of prior domain-specific knowledge and problem-solving experience. During an HRE attack, reverse engineers try to achieve a desired goal state by making sense of thousands to millions of single netlist components and their interconnections. From this mass of information, they formulate sub-goals and use intermediate results to plan subsequent steps in the attack without losing sight of the desired goal state. Furthermore, attackers have to ignore irrelevant components and direct their attention to relevant components to avoid potential missteps. Based on these results (superior performance by the expert and two intermediates), we suggest that superior performance in HRE may be a function of both expertise and cognitive abilities.
The role of prior knowledge and cognitive abilities is also at the heart of taxonomies that define human problem solving. Prior psychological research has established a broad range of taxonomies to define problems. A widely accepted and applied taxonomy distinguishes between simple and complex problems [
22]. Prior research postulates that solving simple problems and solving complex problems are based on different cognitive processes [
22,
52]. Complex problems are usually described as knowledge-rich systems that activate large semantic networks of prior knowledge and potentially successful problem-solving strategies [
22]. Simple problems are typically described as knowledge-free systems as they require less domain-specific knowledge [
17] and more cognitive abilities (e.g., [
54]) than complex problems. As previously outlined (see Section
2.2), it is an open question to which problem type HRE belongs.
Based on our findings, HRE seems to combine aspects of both simple and complex problems. HRE attacks begin with the reverse engineer obtaining a netlist, which can be defined as the initial state of the problem. By selecting suitable operators (e.g., information gathering based on manual netlist analysis), the attacker attempts to achieve a desired goal state (e.g., removal of a watermark from a protected circuit). A clearly defined initial state, clearly defined means, and a clearly defined goal state are characteristics that are commonly assigned to simple problems [
22]. A further characteristic that HRE and simple problems have in common is time-stability—they change only as a result of inputs from the problem solver [
31]. In contrast to simple problems, however, a realistic HRE problem may be impossible to solve for someone with insufficient domain-specific knowledge and skills (on hardware circuits, chip design, etc.). Another hallmark of semantically rich HRE problems is the huge amount of information that must be processed. The attacker analyzes netlists ranging from thousands to millions of components and their interconnections. The complexity of an HRE problem increases with netlist size (i.e., the number of components), requiring reduction and abstraction on the part of the reverse engineer to remain manageable. These characteristics of complexity and connectivity of HRE problems are typically assigned to complex problems (e.g., [
33]). Furthermore, a hardware reverse engineer applies a series of operations to retrieve needed information that is not directly apparent at the beginning of an HRE problem. Thus, an attacker has to handle a lack of transparency (at the beginning of the attack) that is comparable to that found in complex problem-solving settings (e.g., [
32]). While these three aspects of complex problems are present in HRE problem solving, two others—dynamics and polytelic situations—are not. As described above and in contrast to complex problems which usually change dynamically, HRE problems are stable. Moreover, a reverse engineer analyzes the netlist components to achieve (sub-)goals that are clearly defined, non-competitive and not mutually exclusive. Our embedding of HRE in existing problem-solving taxonomies leads us to assume that HRE may combine aspects of both simple and complex problems.
In the following, we draw our first conclusion from the previous discussion points. Our findings seem to be in line with prior psychological research with experts who achieved superior performance based on their efficient problem categorization and representation that influences the selection of suitable operators. Moreover, our findings may support the expert-performance approach [
27], which explains how non-experts deal with the lack of domain-specific knowledge and experience. Against this background, we assume that the two high-performing HRE intermediates achieved time-efficient solutions based on their highly efficient information processing systems. Our attempt to embed HRE in the context of taxonomies of human problem solving resulted in our proposal that HRE seems to combine aspects of both simple and complex problems. Although HRE and simple problems seem to have several core aspects in common, the huge amount of information pertaining to non-transparent, highly interconnected components that must be processed and analyzed creates a challenge that places HRE squarely outside the realm of simple problems as typically defined and constructed. In summary, our exploratory findings lead to the hypothesis that HRE may be a type of problem solving that requires both domain-specific expertise (more relevant for solving complex problems) and cognitive abilities (more relevant for solving simple problems).
This hypothesis relates to the theoretical considerations by Lee and Johnson–Laird (2013) [
43], who suggest that reverse engineering of Boolean systems may be a specific type of human problem solving. As outlined in Section
2.2, it was unclear to what extent the results of Lee and Johnson–Laird (2013) [
43] concerning applied problem-solving strategies and difficulties in reverse engineering are applicable to the domain of HRE problem solving. We hypothesize that the results by Lee and Johnson–Laird are limited in their applicability to the HRE domain. We found differences in applied problem-solving strategies in HRE, which may be due to the nature of the reversing tasks used in the studies. Lee and Johnson–Laird (2013) [
43] reported that participants applied one of two main problem-solving strategies: focusing on a single output at a time, or focusing on a single component at a time. In contrast, our results revealed that none of the applied problem-solving strategies could be described as the single “best” or as the main strategy for solving the realistic HRE task. Instead, our results show that our participants applied individual strategies consisting of various combinations of problem-solving steps. Moreover, we also found unique problem-solving strategies (expert, P5, P7). Furthermore, Lee and Johnson–Laird (2013) [
43] found that difficulties in reverse engineering depend on three factors: (a) number of components, (b) number of components influencing an output, and (c) dependencies of components influencing an output. In contrast to these more general assumptions, our results revealed specific HRE problems (e.g., confusion based on track losses in reversing approach; dead-ends based on misleading strategies), which in some cases led to difficulties and longer solution times (e.g., P1; P6).
We draw two main conclusions. The first is based on one of our main findings, namely that superior performance in HRE was achieved by the HRE expert and by two intermediates with above-average scores in the IQ sub-factor working memory [
6]. We hypothesize that superior HRE performance may be based on both domain-specific expertise and cognitive abilities. This aligns with our theoretical contribution that HRE tasks may involve aspects of both simple and complex problems. In general, solving simple problems relies more on cognitive abilities [
54], whereas solving complex problems relies more on domain-specific knowledge [
22].
Second, we contribute to the debate initiated by Lee and Johnson–Laird (2013) [
43] by extending the understanding of human problem solving in HRE. In this context, we conducted an exploration of HRE-specific problem-solving strategies, which suggests that a single best strategy may not exist and that reverse engineers instead rely on individual problem-solving procedures. Furthermore, we extended the understanding of when hardware reverse engineers struggle by describing HRE-specific problems (e.g., misleading strategies and dead-ends) that occurred during problem solving.
5.2 Future Work
In light of these points, we consider HRE an interesting area for future research on human problem solving. It would be valuable to conduct controlled experiments in order to analyze the influences and interactions of expertise and cognitive abilities in HRE performance.
Additionally, a promising investigation would be to analyze difficulties in the HRE problem-solving processes, and whether these could be avoided over time through the acquisition of skills and domain-specific knowledge. It would also be interesting to analyze how reverse engineers solve other HRE tasks, such as reversing an (obfuscated) FSM or extracting a cryptographic key from an unknown netlist. With a view to developing countermeasures that impede HRE, future studies should focus on how reverse engineers analyze protected (obfuscated) netlists and which problem-solving strategies they apply in order to break the obfuscation.
In order to analyze HRE as a specific type of human problem solving, we suggest including time pressure in future studies to investigate whether higher levels of induced stress (based on time pressure) influence performance in HRE tasks. We further suggest that future studies conduct interviews with other HRE professionals or more experienced engineers to obtain feedback on the HRE problem-solving model and to probe their mental models of HRE in general. Furthermore, future studies may also investigate whether the model can be confirmed and extended to problem solving in other HRE tasks.
5.3 Hypotheses and Recommendations for Cognitive Obfuscation
Here we discuss how the understanding of reverse engineers’ problem-solving processes and strategies may contribute to the field of cybersecurity by supporting the achievement of our overarching goal—the development of novel forms of countermeasures (i.e., cognitive obfuscation) that impede HRE attacks. In general, countermeasures against reverse engineering can never provide absolute protection [
5]. Instead, effective countermeasures raise the cost of an attack (i.e., time-on-task) to a prohibitively high level. However, previously proposed countermeasures were based solely upon technical aspects and no holistic measure for comparing the cost of HRE-based attacks yet exists [
60].
Therefore, we propose several recommendations in pursuit of cognitive obfuscation. First, the HRE problem-solving model we have developed can serve as a framework with which to examine existing hardware obfuscation techniques across the following relevant dimensions:
(1) To what extent does the obfuscation technique increase time-on-task compared to an attack on the same but unprotected circuit?
(2) Which Reversing Problem(s) does the applied obfuscation technique cause?
(3) Can we detect recurring problem-solving strategies for the obfuscation technique under study?
Answering these questions would provide a granular view of the cognitive processes involved in defeating existing hardware obfuscation techniques. The most promising obfuscation techniques can then be used as a toolkit for building cognitive obfuscation. Since future work on cognitive obfuscation might focus on inducing specific Reversing Problems in order to confuse the attacker and force them into time-consuming dead-ends, these studies should specifically focus on the different types of Reversing Problems and how they occur. Those studies may also reveal which Reversing Problems are more easily avoided by engineers who have completed a period of skill acquisition, and which remain difficult to overcome even after additional domain-specific knowledge and skills have been acquired. Second, novel countermeasures should impede efficient and recurring problem-solving strategies, if observed. This means that cognitively challenging tasks should lead reverse engineers down the wrong track, for example by combining physical and logic-level obfuscation techniques that disguise themselves as another known technique. In the event that no effective problem-solving strategies are observed, a combination of obfuscation techniques from the aforementioned toolkit may lead to countermeasures that effectively increase the cost of mounting an HRE attack. Lastly, we hypothesized that expertise alone might not be the only relevant factor in efficiently solving HRE problems. It is conceivable that the success and efficiency of an HRE attack also depend on the engineer’s general cognitive abilities (e.g., on information processing capabilities). Thus, cognitive obfuscation challenges should try to overwhelm the attacker’s information processing capabilities, for example by designing highly complex challenges with a huge amount of non-transparent and strongly interconnected information that has to be processed in several parallel tasks.
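As an illustration of dimension (1), the relative increase in time-on-task could be expressed as a simple ratio of solution times. The following is only a minimal sketch under our own assumptions: the function name, the use of the median as summary statistic, and the example values are hypothetical and are not taken from the study.

```python
from statistics import median


def time_on_task_overhead(unprotected_minutes, protected_minutes):
    """Return the ratio of median solution times (protected / unprotected).

    A value of 2.0 would mean that attacking the protected circuit took
    twice as long, in the median, as attacking the unprotected one.
    """
    if not unprotected_minutes or not protected_minutes:
        raise ValueError("both samples must contain at least one solution time")
    baseline = median(unprotected_minutes)
    if baseline <= 0:
        raise ValueError("baseline median must be positive")
    return median(protected_minutes) / baseline


# Hypothetical, made-up solution times in minutes (NOT data from this study).
unprotected = [40, 55, 60]
protected = [90, 110, 150]
overhead = time_on_task_overhead(unprotected, protected)
```

The median is chosen here because single participants who end up in long dead-ends would otherwise dominate a mean; other summary statistics are equally conceivable.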
Our exploratory results on human problem solving in HRE are a first step toward understanding the underlying cognitive processes of reverse engineers. Nevertheless, future studies need to quantify whether cognitive abilities such as working memory or the level of expertise significantly influence problem solving in HRE tasks. Therefore, deriving concrete ideas for cognitive obfuscation at this stage would be premature and would exceed the exploratory nature of this article.
5.4 Limitations
While our work provides a first exploration of HRE problem-solving processes and strategies and highlights several theoretical and technical implications of the same, it is not without limitations.
First, the present study was limited by the methodological challenge that, at this point, HRE experts are generally unavailable for research. To address this challenge, we developed and applied a comprehensive 14-week training, which enabled students to acquire intermediate levels of HRE expertise. We generated an appropriate sample for our study by selecting the top-performing students from the training. One HRE expert further complemented our sample. Comparing the sample of intermediates to the one expert provides some assurance for our findings, but it is not entirely certain to what extent they can be generalized to other HRE experts. This is a problem inherent in this type of research.
Second, we developed our problem-solving model based on a single realistic HRE problem-solving task. Although we expect that our model captures the most significant processes of HRE and might be applicable to other HRE tasks, it is conceivable that further studies examining different problem-solving tasks (e.g., finding a register that stores a cryptographic key in a netlist) would result in its extension or revision.
Third, the measurement of solution time in combination with qualitative analysis of problem-solving steps was a relevant metric for the analysis of efficiency in this study. However, the calculation of time in minutes might be limited in its transferability to industry, where HRE tasks may incur significant overhead due to the need for the engineering of specialized tools and the development of generic, reusable techniques. Here, time would need to be considered in days or weeks rather than hours or seconds.
Fourth, the open code “using external resources” includes several different actions such as working with pen and paper or consulting online resources. Due to our study design and with the privacy demands of our participants in mind, we collected data only in HAL. Thus, while we could determine that participants consulted external resources that helped them move forward, we cannot say with certainty which resources they used. For future studies, HAL now includes a feature that allows us to specifically query the resources used in order to analyze how participants employ these resources to solve a task.
Despite the above limitations, our exploratory results and discussion can be considered an important first step towards a better understanding of the unknown human factors in HRE.