Binary Code Vulnerability Detection Based On Multi-Level Feature Fusion
ABSTRACT The existence of software vulnerabilities can cause serious network attacks and information
leakage. Timely and accurate detection of vulnerabilities in software has therefore become a research
focus in the security field. Most existing work considers only instruction-level features, which to some
extent overlooks syntax and semantic information in the assembly code segments and limits the accuracy
of the detection model. In this paper, we propose a binary code vulnerability detection model based on
multi-level feature fusion. The model considers both word-level features and instruction-level features.
To address the inability of traditional text embedding methods to handle polysemy, this paper uses the
Embeddings from Language Models (ELMo) model to obtain dynamic word vectors that capture word
semantics and other information. To account for the grammatical structure of the assembly code segment,
the model represents the normalized assembly code segment with a random embedding. The model then
uses bidirectional Gated Recurrent Units (GRUs) to extract word-level and instruction-level sequence
features, respectively, and a weighted feature fusion method is used to study the impact of different
sequence features on model performance. During training, standard deviation regularization constrains
the model parameters to prevent overfitting. To evaluate our proposed method, we conduct experiments
on two datasets. Our method achieves an F1-score of 98.9 percent on the Juliet Test Suite dataset and an
F1-score of 87.7 percent on the NDSS18 (Whole) dataset. The experimental results show that the model
can improve the accuracy of binary code vulnerability detection.
INDEX TERMS Binary code vulnerability detection, embeddings from language models, feature fusion,
instruction level sequence features, word level sequence features.
binary code vulnerability detection [6] has crucial practical significance. At present, static analysis [7], [8], [9], dynamic analysis [10], [11], [12], and a combination of dynamic and static analysis can be used for binary code vulnerability detection.

The static analysis method detects whether there are vulnerabilities in a program by analyzing it without running it. This method often converts the program under test into an intermediate language, analyzes the typical features contained in the intermediate language, and then uses specific techniques to detect vulnerabilities.

The dynamic analysis method [13] runs the target program and observes and analyzes its execution status in order to detect vulnerabilities. For example, environment injection introduces error information into the execution environment without modifying the program under test and then observes the running status of the program. If the program's running status is abnormal, the artificially injected error has triggered a latent defect in the program, thereby detecting a vulnerability.

The combined dynamic and static analysis method [14] unites the accuracy of dynamic vulnerability analysis with the path completeness of static vulnerability analysis to detect vulnerabilities.

Since the static analysis method studies all execution paths of the target program, it has high path coverage. However, the dynamic analysis method and the combined dynamic-static methods still suffer from low path coverage. Therefore, research on static methods for binary code vulnerability detection remains extensive.

Static binary code vulnerability detection can be divided into traditional detection methods and detection methods based on deep learning. Traditional static methods usually convert binary code into an intermediate language and then apply static analysis to detect potential vulnerabilities. For example, pattern matching methods detect vulnerabilities according to vulnerability patterns defined in advance by human experts. Vulnerability patterns are obtained by analyzing a large amount of code and abstracting the typical features of specific vulnerability types, which may be regular expressions, string matches, code structures, and so on. By analyzing intermediate languages, these typical features can distinguish whether the code contains defects. However, pattern matching relies heavily on the manually abstracted features and can only detect binary code that exhibits those typical features. Traditional static binary vulnerability detection methods have the advantage of catching defects in the early stages of program development, reducing software development cost and time. However, they may produce false positives and false negatives, and they can consume considerable manpower and computing resources.

Due to its excellent ability to automatically extract features, deep learning has been validated as effective in code vulnerability detection [15]. Li et al. [16] were the first to apply deep learning to software vulnerability detection tasks. Their VulDeePecker (Vulnerability Deep Pecker) system provides a new research perspective for vulnerability detection, and their paper also contributes the first vulnerability dataset suitable for deep learning. Wartschinski et al. [17] proposed the Vulnerability Detection Using Natural Codebases (VUDENC) method. VUDENC utilizes Word2Vec to represent word vectors, and LSTM networks classify sequences of vulnerable code tokens. The works of [16] and [17] are both excellent approaches for applying deep learning to source code vulnerability detection.

Applying deep learning to binary code vulnerability detection can be divided into code scanning and similarity detection. Binary code similarity detection calculates the similarity between the code under test and code with identified vulnerabilities, and from this determines whether the code under test is vulnerable. Wang et al. [18] proposed the jump-aware Transformer for binary code similarity detection (jTrans), the first solution to embed control-flow information into a Transformer. jTrans combines a natural language processing (NLP) model that captures instruction semantics with the control flow graph (CFG) that captures control information to infer a similarity representation of binary code. Many scholars have applied binary code similarity methods to vulnerability detection. However, this approach has a clear disadvantage: it cannot detect unknown types of vulnerabilities, so it is important to detect binary code directly using code scanning methods.

Code scanning refers to traversing and slicing the assembly code segments obtained from binary code conversion to determine whether the slices contain vulnerabilities. Existing code scanning techniques often utilize methods from natural language processing. The specific steps include data preprocessing, code embedding, and feature extraction. The code embedding network aims to vectorize the assembly code segments, but transforming the code into vectors recognizable by neural networks while preserving as much syntax and semantic information as possible remains challenging. The Instruction2vec method proposed by Lee et al. [19] represents each word in the assembly code segments using Word2Vec word vectors. It takes into consideration the composition structure of instructions and represents each line of instructions using nine values: one for the opcode and four for each of the two operands. If there are fewer than two operands, padding fills the remaining values. Based on this, Yan et al. [20] proposed the Hierarchical Attention Network for Binary Code Vulnerability Detection (HAN-BSVD), which expands on the structural composition
For each word t_k, 2L+1 vectors are obtained through the L-layer bidirectional language model, including the forward and backward vectors of each layer and the original input vector. The final output vector is shown in (3).

R_k = {x_k^LM, →h_{k,j}^LM, ←h_{k,j}^LM | j = 1, ···, L} = {h_{k,j}^LM | j = 0, ···, L}   (3)

In (3), →h_{k,j}^LM and ←h_{k,j}^LM are the forward and backward language model outputs, respectively. LSTM networks are often used to process data with time-series characteristics. In the bidirectional language model, LSTM extracts the contextual semantic information of the current word according to the sequence. As shown in Fig.2, an LSTM network [26], [27] is composed of an input gate, a forget gate, and an output gate.

In Fig.2, x_t is the input at the current time t, h_t is the output at the current time, c_t is the cell state, c′_t is the new data at the current time, and σ is the activation function. f_t, i_t, and o_t are respectively the forget gate, input gate, and output gate at the current moment. W_f, W_i, and W_o are the weight matrices corresponding to the forget gate, input gate, and output gate.

B. GRU
GRU [28] is a variant of LSTM that can process sequence data and alleviate the long-term dependency problem of traditional recurrent neural networks. Compared with the three gate structures of the LSTM shown in Fig.2, the GRU structure is simpler, containing only an update gate and a reset gate.

As shown in Fig.3, the GRU network consists of an update gate z_t and a reset gate γ_t. The input of the GRU at time t is the hidden state h_{t−1} at time t−1 and the input x_t at the current time.

The update gate determines how much of the state at time t−1 is carried into the current state. The calculation equations are:

z_t = σ(W_z · [h_{t−1}, x_t])   (4)

h_t = (1 − z_t) ∗ h_{t−1} + z_t ∗ h̃_t   (5)

The reset gate determines how much of the state at time t−1 is written into the current candidate state h̃_t. The calculation equations are:

γ_t = σ(W_γ · [h_{t−1}, x_t])   (6)

h̃_t = tanh(W_h̃ · [γ_t ∗ h_{t−1}, x_t])   (7)

The output calculation equation of the GRU is:

y_t = σ(W_O · h_t)   (8)

where [·] represents the concatenation of two vectors, ∗ represents the element-wise product, and σ(·) represents the sigmoid activation function.

III. THE MODEL
For binary code vulnerability detection tasks, this paper proposes a vulnerability detection model based on multi-level feature fusion. The model mainly includes three parts: a feature extraction network, a feature fusion module, and a classifier. The feature extraction network extracts word-level and instruction-level sequence features, respectively. The model uses the ELMo pre-trained model to obtain dynamic vector representations of words, and a bidirectional GRU obtains the sequence characteristics between words. In the instruction-level feature extraction module, this paper first considers the syntax and semantic structure of the instruction, obtains the embedding matrix of the assembly code segment, uses a bidirectional GRU to extract the context information of the instructions, and obtains instruction-level sequence features. In the feature fusion module, the weighted feature fusion mechanism fuses the word-level and instruction-level features. Finally, the model uses a classifier to obtain the classification result, that is, to determine whether the assembly code segment contains a vulnerability. The overall framework of the model is shown in Fig.4.

A. WORD-LEVEL FEATURE EXTRACTION MODULE
In this section, the ELMo pre-trained language model is used to obtain the embedded representation of each word in the assembly code segment, and the bidirectional GRU obtains the long-term context dependencies of the words.
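Since bidirectional GRUs are used in both feature extraction modules, a minimal NumPy sketch of one GRU step following Eqs. (4)-(8) may help fix the notation. The weight shapes and function names here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x_t, Wz, Wr, Wh, Wo):
    """One GRU step: update gate (4), hidden state (5), reset gate (6),
    candidate state (7), and output (8)."""
    concat = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    z_t = sigmoid(Wz @ concat)                    # Eq. (4): update gate
    r_t = sigmoid(Wr @ concat)                    # Eq. (6): reset gate
    h_cand = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]))  # Eq. (7)
    h_t = (1.0 - z_t) * h_prev + z_t * h_cand     # Eq. (5): new hidden state
    y_t = sigmoid(Wo @ h_t)                       # Eq. (8): output
    return h_t, y_t
```

A bidirectional GRU simply runs one such recurrence forward over the sequence and a second one backward, then merges the two hidden states at each step.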
When using deep learning for vulnerability detection, identifying potentially vulnerable code requires learning the contextual semantic information in the code; expressing the code's meaning correctly and reasonably improves detection performance. Because Word2Vec [29] and other embedding methods derive word vectors from a large corpus and relatively long contexts, the vector obtained through Word2Vec for a given word is fixed, without considering the context and other factors of the word. Based on this, this paper uses ELMo to obtain dynamic word vectors. The specific steps are as follows: Firstly, a pre-trained language model is generated to learn the embedding of each word, so that when used, the word already carries specific context information. Secondly, the word vector is adjusted according to the context of the word in the current task, so that the same word in different positions has different vector representations.

For the word-level feature extraction module, the input of the model is T = (t_1, t_2, ···, t_m)^T, where t_i, i = 1, ···, m represents the words in the assembly code segment, and m represents the number of words in the input assembly code segment. Each word is represented by a vector through the input layer of the ELMo module. Here, the input layer of the ELMo model uses the random embedding representation method to obtain the embedding matrix E = (E_1, E_2, ···, E_m)^T. Then this paper uses a two-layer bidirectional language model to model the grammar, semantics, and other characteristics of words. Taking the vector E_i of a word in the embedding matrix E as the input, we get the hidden state →h_{i,1}^LM through the forward LSTM of the first layer, and get the hidden state →h_{i,2}^LM by inputting it into the forward LSTM of the second layer, so →h_{i,j}^LM, j = 1, 2 is the output of E_i through the forward language model. Similarly, ←h_{i,j}^LM, j = 1, 2 is the output of E_i through the backward language model. So the output of word E_i is R_i = {E_i, →h_{i,j}^LM, ←h_{i,j}^LM | j = 1, 2} = {h_{i,j}^LM | j = 0, 1, 2}. The output layer of the ELMo model takes into account the output of the last LSTM layer, the original input vector, and the intermediate word vector. The calculation equation is as follows:

ELMo_i = γ · (λ_0 · h_{i,0}^LM + λ_1 · h_{i,1}^LM + λ_2 · h_{i,2}^LM)   (9)

where γ represents the scaling coefficient applied to all word vectors obtained by ELMo when they are finally used; in this experiment, its initial value is 1. λ_0, λ_1, λ_2 are the weight coefficients after softmax processing, with initial value 0. They are learnable parameters, and their specific values are obtained by model training.

The input dimension in the word-level module is (Batch_size, m). The model input is embedded through the input layer of the ELMo module, and the dimension becomes (Batch_size, m, embed_dim). Here m represents the number of words in the assembly code segment, and embed_dim indicates the dimension of the word embedding, that is, the dimension of E_i. Then through the bidirectional language model and the output
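The layer combination in Eq. (9) can be sketched as follows. This is an illustrative NumPy version of the softmax-weighted mix; the function and parameter names are our own assumptions:

```python
import numpy as np

def elmo_mix(layer_outputs, s_logits, gamma=1.0):
    """Collapse the L+1 layer representations of one token into a single
    vector (Eq. (9)): softmax-normalize the learnable logits into weights
    lambda_j, take the weighted sum, and scale by gamma.
    layer_outputs: array of shape (L+1, embed_dim)."""
    e = np.exp(s_logits - np.max(s_logits))
    lam = e / e.sum()                            # lambda_0, ..., lambda_L
    return gamma * (lam[:, None] * layer_outputs).sum(axis=0)
```

With the logits initialized to 0, every layer starts with equal weight 1/(L+1), consistent with the paper's statement that the λ_j are learned during training.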
Then the two results are merged to get the output h_t = W_→h · →h_t + W_←h · ←h_t + b of the bidirectional GRU at the current time t. Max pooling is used in the final part of the word-level feature extraction module, which makes the model pay more attention to important features and thereby improves model performance.

B. INSTRUCTION-LEVEL FEATURE EXTRACTION MODULE
Considering that context instructions in assembly code are also related, this module uses a bidirectional GRU to extract the context information of the input vectors.

The input of the instruction-level feature extraction module is the assembly code segment, but each instruction line has a different length, so each line must be standardized. When standardizing instructions, first, considering the syntax structure of assembly instructions and referring to the instruction-processing method of Instruction2vec, each instruction is expressed in the form of one opcode and two operands, where four values represent each operand. Second, the type of each operand in the instruction is analyzed, and the operand is placed in a fixed position according to its type. If the operand's values do not fill the fixed length, the remainder is padded with the invalid operand 'PH'. The normalization result of one instruction line is shown in Fig.5.

The standardized result of the original assembly code segment is shown in Fig.6.

In this paper, the random embedding method is used to obtain the embedding matrix X = [x_1, x_2, ···, x_n]^T of the assembly code segment, with dimension (Batch_size, n, 117). x_i represents the i-th instruction in the assembly code segment, i = 1, ···, n. Then the model inputs each

where F_word represents the word-level sequence feature obtained in Section III-A, and F_ins represents the instruction-level sequence feature obtained in Section III-B. The value of the weighted feature fusion parameter α is given in Section IV.

C. ALGORITHM STEPS
The vulnerability detection model proposed in this paper begins by processing the input assembly code segments. It uses an embedding network to obtain a matrix representation of the input, and through a feature extraction network, it obtains word-level features and instruction-level features. Then, a feature fusion module is employed to obtain blended features. Finally, the classification results are obtained based on these features.

The specific algorithm steps are as follows:
Input: Raw input assembly code segments and labels D = {x^(n), y^(n)}, n = 1, ···, N
Step 1. Use the ELMo model to get the dynamic vector representation of words in different contexts;
Step 2. Obtain the word-level features through the bidirectional GRU network and max pooling, as shown in (13);
Step 3. Use random embedding to obtain the vector representation of the words in each instruction, and obtain the instruction-level features through the bidirectional GRU network and max pooling, as shown in (14);
Step 4. Fuse the word-level sequence features and instruction-level sequence features, as shown in (12);
Step 5. Input the final feature into the fully connected layer with Softmax, and update the weight parameters according to (17);
Output: The results of the vulnerability detection model in this paper.
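The instruction-normalization step described above (one opcode plus two operand slots of four values each, padded with the invalid operand 'PH') can be sketched roughly as follows. This is a deliberate simplification: the paper decomposes operands by type and places them at fixed positions, which is richer than this tokenizer:

```python
PH = "PH"     # invalid-operand placeholder
SLOT = 4      # values reserved per operand

def normalize_instruction(line):
    """Normalize one assembly line to 1 opcode + 2 * SLOT operand tokens."""
    opcode, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()][:2]
    tokens = [opcode]
    for i in range(2):                            # always exactly two slots
        parts = operands[i].split() if i < len(operands) else []
        tokens.extend((parts + [PH] * SLOT)[:SLOT])
    return tokens
```

Each line thus becomes nine tokens; embedding every token (e.g. 13 dimensions per token would give the 117-dimensional instruction vectors mentioned above, since 9 × 13 = 117) and stacking the lines yields the matrix X fed to the bidirectional GRU.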
TABLE 1. Number of NDSS18 datasets under different platforms.
TABLE 2. Experimental results on the Juliet Test Suite dataset (unit: %).
TABLE 3. Experimental results on the NDSS18 (Windows) dataset (unit: %).
TABLE 6. Results of ablation experiments for different modules on the Juliet Test Suite dataset (unit: %).
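The two quantities studied in the parameter-selection experiments, the fusion weight α and the standard deviation regularization coefficient λ, enter the model in a simple way. A minimal NumPy sketch follows; the complementary (1 − α) weighting and the summed-standard-deviation form are our assumptions consistent with the text, not the paper's exact Eqs. (12) and (17):

```python
import numpy as np

def fuse_features(f_word, f_ins, alpha=0.4):
    """Weighted feature fusion of word-level and instruction-level
    sequence features: F = alpha * F_word + (1 - alpha) * F_ins."""
    return alpha * f_word + (1.0 - alpha) * f_ins

def std_regularization(weight_matrices, lam=1.0):
    """Standard deviation regularization term added to the loss:
    lam times the summed standard deviations of the weight matrices."""
    return lam * sum(float(np.std(w)) for w in weight_matrices)
```

With α = 0.4, word-level features contribute 40 percent of the fused vector; with λ > 0, the regularizer penalizes spread within each weight matrix, which is the quantity the λ ablation varies.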
multi-level feature fusion vulnerability detection model proposed in this paper, it is indeed effective to consider both word-level sequence features and instruction-level sequence features. The features obtained through feature fusion can better express the syntax, semantics, and other information in the original assembly code segment, thereby improving the detection performance of the model.

b: DIFFERENT FEATURE FUSION METHODS
For the word-level and instruction-level sequence features, this paper compares feature concatenation, feature addition, and weighted feature fusion. The ablation results on the two datasets are shown in Table 10, Table 11, Table 12, and Table 13.

Different fusion methods have different characteristics and representation capabilities. By analyzing Table 10 through Table 13, it can be seen that the weighted feature fusion method performs best on each dataset. Compared with the other fusion methods, weighted feature fusion improves the most on the NDSS18 (Whole) dataset, with accuracy increased by 2.1 percent and F1-score increased by 1.8 percent. The analysis shows that feature concatenation and feature addition perform worse, while the weighted feature fusion method accounts for the contribution of different features to the detection task, and its experimental results are the best.

3) PARAMETER SELECTION
The selection of parameter values in a neural network has a great influence on the model. In this experiment, two parameters that are important to the model are selected for discussion, namely α in the weighted feature fusion method and the coefficient λ of the standard deviation regularization in the loss.

a: α IN WEIGHTED FEATURE FUSION
It can be seen from the preceding results that the weighted feature fusion method achieves good experimental results on both datasets. Different weight values indicate a different emphasis on word-level and instruction-level features. The selection of the weight value α is the focus of this section.

Fig.7 shows the impact of different weights on the model. As shown in Fig.7(a), when the values are 0.4 and 0.8, the F1-score is higher on the Juliet Test Suite dataset. However, comparison with Fig.7(b) shows that α = 0.4 yields a higher F1-score on the NDSS18 (Whole) dataset than α = 0.8. Therefore, this paper finally chooses α = 0.4 as the weight of the word-level vector features. Although the NDSS18 (Whole) dataset contains all the data of the Windows and Linux platforms and contains more types of vulnerabilities than the Juliet Test Suite dataset, its representation is more uniform, and word-level sequence features have a greater impact on it.

b: COEFFICIENT λ OF STANDARD DEVIATION REGULARIZATION IN LOSS
Standard deviation regularization constrains the model parameters: the regularization term is obtained by multiplying λ by the standard deviation of the weight matrices, so as to reduce the error and prevent over-fitting.

FIGURE 8. F1-score variation diagram using different weight values λ on different datasets.

Fig.8 shows the impact of different weight values λ on the model. By analyzing Fig.8, it can be found that when the λ value is 1, the results of the model on both datasets are optimal. A value of 0 means that the model does not use standard deviation regularization during training. When the value of λ is between 0 and 1, the F1-score keeps improving, which means that within a certain range the standard deviation regularization constrains the model parameters and can improve model performance. However, when the value of λ is greater than 1, the F1-score decreases, so the value of λ should be neither too small nor too large.

V. DISCUSSION
In summary, the model proposed in this paper can improve the detection performance to a certain extent in the binary code vulnerability detection task. However, this paper also has certain limitations. The main limitation is the scarcity of publicly available binary code datasets. The Juliet Test Suite dataset used in this paper is obtained by processing its source code dataset, involving the compilation of source code into binary code and the disassembly of binary code into assembly language. In this process, various issues such as choosing a compiler, different compilation platforms, and implementing a disassembler must be considered in actual situations. The data processing process is very complex. This article refers to the method described in the work of [20], which involves

VI. CONCLUSION
In this paper, we propose a binary code vulnerability detection model based on multi-level feature fusion. The model considers the word-level sequence features in the assembly code segment and learns the dynamic vector representation of the same word in different contexts through the ELMo model. Then, the word-level sequence features and instruction-level sequence features in the assembly code segments are fused. In the process of feature fusion, this paper uses feature concatenation, feature addition, and weighted feature fusion to discuss the influence of different features on the binary code vulnerability detection task. Considering the phenomenon of overfitting in model training, this paper uses standard deviation regularization to improve model performance. We conduct experimental evaluation and comparison on the Juliet Test Suite and NDSS18 datasets. On the Juliet Test Suite dataset, the F1-score reaches 98.9 percent, and on the NDSS18 (Whole) dataset, the F1-score reaches 87.7 percent. Compared to the baseline models, the model proposed in this paper exhibits higher accuracy in the task of binary code vulnerability detection.

REFERENCES
[1] H. Hanif, M. H. N. M. Nasir, M. F. Ab Razak, A. Firdaus, and N. B. Anuar, "The rise of software vulnerability: Taxonomy of software vulnerabilities detection and machine learning approaches," J. Netw. Comput. Appl., vol. 179, Apr. 2021, Art. no. 103009, doi: 10.1016/j.jnca.2021.103009.
[2] F. Lomio, E. Iannone, A. De Lucia, F. Palomba, and V. Lenarduzzi, "Just-in-time software vulnerability detection: Are we there yet?" J. Syst. Softw., vol. 188, Jun. 2022, Art. no. 111283, doi: 10.1016/j.jss.2022.111283.
[3] A. C. Eberendu, V. I. Udegbe, E. O. Ezennorom, A. C. Ibegbulam, and T. I. Chinebu, "A systematic literature review of software vulnerability detection," Eur. J. Comput. Sci. Inf. Technol., vol. 10, no. 1, pp. 23-37, Apr. 2022.
[4] G. Lin, S. Wen, Q. Han, J. Zhang, and Y. Xiang, "Software vulnerability detection using deep neural networks: A survey," Proc. IEEE, vol. 108, no. 10, pp. 1825-1848, Oct. 2020, doi: 10.1109/JPROC.2020.2993293.
[5] X. Yuan, G. Lin, Y. Tai, and J. Zhang, "Deep neural embedding for software vulnerability discovery: Comparison and optimization," Secur. Commun. Netw., vol. 2022, pp. 1-12, Jan. 2022, doi: 10.1155/2022/5203217.
[6] P. Xu, Z. Mai, Y. Lin, Z. Guo, and V. S. Sheng, "A survey on binary code vulnerability mining technology," J. Inf. Hiding Privacy Protection, vol. 3, no. 4, pp. 165-179, 2021, doi: 10.32604/jihpp.2021.027280.
[7] S. Alrabaee, M. Debbabi, and L. Wang, "A survey of binary code fingerprinting approaches: Taxonomy, methodologies, and features," ACM Comput. Surv., vol. 55, no. 1, pp. 1-41, Jan. 2022, doi: 10.1145/3486860.
[8] C. B. Sahin and L. Abualigah, "A novel deep learning-based feature selection model for improving the static analysis of vulnerability detection," Neural Comput. Appl., vol. 33, no. 20, pp. 14049-14067, Oct. 2021, doi: 10.1007/s00521-021-06047-x.
[9] R. Scandariato, J. Walden, and W. Joosen, "Static analysis versus penetration testing: A controlled experiment," in Proc. IEEE 24th Int. Symp. Softw. Rel. Eng. (ISSRE), Nov. 2013, pp. 451-460.
[10] S. Dinesh, N. Burow, D. Xu, and M. Payer, "RetroWrite: Statically instrumenting COTS binaries for fuzzing and sanitization," in Proc. IEEE Symp. Secur. Privacy (SP), May 2020, pp. 1497-1511, doi: 10.1109/SP40000.2020.00009.
[11] C. Beaman, M. Redbourne, J. D. Mummery, and S. Hakak, "Fuzzing vulnerability discovery techniques: Survey, challenges and future directions," Comput. Secur., vol. 120, Sep. 2022, Art. no. 102813, doi: 10.1016/j.cose.2022.102813.
[12] O. Zaazaa and H. El Bakkali, "Dynamic vulnerability detection approaches and tools: State of the art," in Proc. 4th Int. Conf. Intell. Comput. Data Sci. (ICDS), Oct. 2020, pp. 1-6, doi: 10.1109/ICDS50568.2020.9268686.
[13] J. Jurn, T. Kim, and H. Kim, "An automated vulnerability detection and remediation method for software security," Sustainability, vol. 10, no. 5, p. 1652, May 2018, doi: 10.3390/su10051652.
[14] R. Zhang, S. Huang, Z. Qi, and H. Guan, "Combining static and dynamic analysis to discover software vulnerabilities," in Proc. 5th Int. Conf. Innov. Mobile Internet Services Ubiquitous Comput., Jun. 2011, pp. 175-181, doi: 10.1109/IMIS.2011.59.
[15] Q. Wang, Y. Li, Y. Wang, and J. Ren, "An automatic algorithm for software vulnerability classification based on CNN and GRU," Multimedia Tools Appl., vol. 81, no. 5, pp. 7103-7124, Jan. 2022, doi: 10.1007/s11042-022-12049-1.
[16] Z. Li, D. Zou, S. Xu, X. Ou, H. Jin, S. Wang, Z. Deng, and Y. Zhong, "VulDeePecker: A deep learning-based system for vulnerability detection," 2018, arXiv:1801.01681.
[17] L. Wartschinski, Y. Noller, T. Vogel, T. Kehrer, and L. Grunske, "VUDENC: Vulnerability detection with deep learning on a natural codebase for Python," Inf. Softw. Technol., vol. 144, Apr. 2022, Art. no. 106809, doi: 10.1016/j.infsof.2021.106809.
[18] H. Wang, W. Qu, G. Katz, W. Zhu, Z. Gao, H. Qiu, J. Zhuge, and C. Zhang, "jTrans: Jump-aware transformer for binary code similarity detection," in Proc. 31st ACM SIGSOFT Int. Symp. Softw. Test. Anal., Jul. 2022, pp. 1-13.
[19] Y. Lee, H. Kwon, S.-H. Choi, S.-H. Lim, S. H. Baek, and K.-W. Park, "Instruction2vec: Efficient preprocessor of assembly code to detect software weakness with CNN," Appl. Sci., vol. 9, no. 19, p. 4086, Sep. 2019, doi: 10.3390/app9194086.
[20] H. Yan, S. Luo, L. Pan, and Y. Zhang, "HAN-BSVD: A hierarchical attention network for binary software vulnerability detection," Comput. Secur., vol. 108, Sep. 2021, Art. no. 102286, doi: 10.1016/j.cose.2021.102286.
[21] T. Le, T. Nguyen, T. Le, D. Phung, P. Montague, O. De Vel, and L. Qu, "Maximal divergence sequential autoencoder for binary software vulnerability detection," in Proc. Int. Conf. Learn. Represent., 2019, pp. 1-15. [Online]. Available: https://openreview.net/pdf?id=ByloIiCqYQ
[22] J. Tian, W. Xing, and Z. Li, "BVDetector: A program slice-based binary code vulnerability intelligent detection system," Inf. Softw. Technol., vol. 123, Jul. 2020, Art. no. 106289, doi: 10.1016/j.infsof.2020.106289.
[23] K. L. Narayana and K. Sathiyamurthy, "Automation and smart materials in detecting smart contracts vulnerabilities in blockchain using deep learning," Mater. Today, Proc., vol. 81, pp. 653-659, Jan. 2023, doi: 10.1016/j.matpr.2021.04.125.
[24] W. Ouyang, M. Li, Q. Liu, and J. Wang, "Binary vulnerability mining based on long short-term memory network," in Proc. World Autom. Congr. (WAC), Aug. 2021, pp. 71-76, doi: 10.23919/WAC50355.2021.9559467.
[25] M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, "Deep contextualized word representations," in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics, Human Lang. Technol., 2018, pp. 2227-2237. [Online]. Available: https://aclanthology.org/N18-1202.pdf
[26] M. Sundermeyer, R. Schluter, and H. Ney, "LSTM neural networks for language modeling," in Proc. Interspeech, Sep. 2012, pp. 1-4. [Online]. Available: https://www.isca-speech.org/archive_v0/archive_papers/interspeech_2012/i12_0194.pdf
[27] A. Salah, M. Bekhit, E. Eldesouky, A. Ali, and A. Fathalla, "Price prediction of seasonal items using time series analysis," Comput. Syst. Sci. Eng., vol. 46, no. 1, pp. 445-460, 2023.
[28] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling," 2014, arXiv:1412.3555.
[29] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," 2013, arXiv:1301.3781.
[30] H. Wei, G. Lin, L. Li, and H. Jia, "A context-aware neural embedding for function-level vulnerability detection," Algorithms, vol. 14, no. 11, p. 335, Nov. 2021, doi: 10.3390/a14110335.
[31] M. A. Albahar, "A modified maximal divergence sequential auto-encoder and time delay neural network models for vulnerable binary codes detection," IEEE Access, vol. 8, pp. 14999-15006, 2020, doi: 10.1109/ACCESS.2020.2965726.
[32] K. Filus, P. Boryszko, J. Domanska, M. Siavvas, and E. Gelenbe, "Efficient feature selection for static analysis vulnerability prediction," Sensors, vol. 21, no. 4, p. 1133, Feb. 2021, doi: 10.3390/s21041133.

GUANGLI WU (Member, IEEE) was born in Weifang, Shandong, China, in 1981. He received the Ph.D. degree. He is currently a professor. His research interests include network security and artificial intelligence.

HUILI TANG is currently pursuing the master's degree with the Gansu University of Political Science and Law. Her current research interests include artificial intelligence and vulnerability detection.