Learning Program Semantics for Vulnerability Detection via Vulnerability-Specific Inter-procedural Slicing

Authors:

Jun Sun, and

Shang-Wei Lin

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

November 2023

Pages 1371 - 1383

https://doi.org/10.1145/3611643.3616351

Published: 30 November 2023

Abstract

Learning-based approaches that learn code representations for software vulnerability detection have shown promising results. However, they still fail to capture complete and precise vulnerability semantics in those representations. To address these limitations, we propose a learning-based approach named SnapVuln, which first uses multiple vulnerability-specific inter-procedural slicing algorithms to capture the semantics of various vulnerability types and then employs a Gated Graph Neural Network (GGNN) with an attention mechanism to learn those semantics. We compare SnapVuln with state-of-the-art learning-based approaches on two public datasets and confirm that SnapVuln outperforms them. We further perform an ablation study and demonstrate that the completeness and precision of the vulnerability semantics captured by SnapVuln contribute to the performance improvement.

Supplementary Material

Video (fse23main-p1222-p-video.mp4)

"Recently, learning-based approaches that learn code representations for software vulnerability detection have shown promising results. However, they still suffer from some limitations. On one hand, some learning-based works learn a code representation of a single function for vulnerability detection, which ignores the fact that some vulnerabilities span multiple functions. On the other hand, other works use slicing techniques to extract the program semantics of vulnerable parts and generate code representations from them, but they fail to slice out precise vulnerable parts because the wide variety of vulnerabilities cannot be accurately captured by one general slicing algorithm. To address these limitations, in this paper we propose a learning-based approach named SnapVuln, which uses multiple type-specific inter-procedural slicing algorithms that operate on inter-procedural graphs to capture precise program semantics of various vulnerability types, and leverages a Gated Graph Neural Network (GGNN) with an attention mechanism to learn graph structure information and assign different weights to different program semantics when generating code representations. We conduct extensive experiments on two public datasets and compare SnapVuln with five state-of-the-art learning-based vulnerability detection approaches and two pre-trained approaches. Experimental results show that SnapVuln outperforms these baselines. We further perform an ablation study to demonstrate that the completeness and precision of the vulnerability semantics captured by SnapVuln contribute to the improvement in vulnerability detection."
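The idea of type-specific inter-procedural slicing described above can be illustrated with a minimal sketch. This is not SnapVuln's implementation: the toy dependence graph, the node naming scheme, the `CRITERIA` table, and the functions `slice_criteria` and `backward_slice` are all hypothetical and chosen only for illustration. The sketch picks slicing criteria according to a vulnerability type and then walks dependence edges backward, crossing a function boundary through a call edge.

```python
from collections import deque

# Hypothetical toy dependence graph: each node maps to the nodes it depends on
# (data, control, or call/parameter dependence). Nodes are named "function:statement".
DEPENDS_ON = {
    "main:n=read_len()": [],
    "main:buf=malloc(n)": ["main:n=read_len()"],
    "main:copy(buf,src,n)": ["main:buf=malloc(n)", "main:n=read_len()"],
    # Inter-procedural edge: the statement in copy() depends on the call in main().
    "copy:memcpy(dst,src,len)": ["main:copy(buf,src,n)"],
    "main:log('done')": [],
}

# Hypothetical per-type criteria: API calls treated as points of interest.
CRITERIA = {
    "buffer_overflow": ("memcpy", "strcpy"),
    "allocation": ("malloc",),
}


def slice_criteria(graph, vuln_type):
    """Return the nodes that call one of the APIs of interest for vuln_type."""
    apis = CRITERIA[vuln_type]
    return [node for node in graph if any(api + "(" in node for api in apis)]


def backward_slice(graph, criterion):
    """Collect every node the criterion transitively depends on (breadth-first)."""
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    for criterion in slice_criteria(DEPENDS_ON, "buffer_overflow"):
        print(sorted(backward_slice(DEPENDS_ON, criterion)))
```

In this toy graph the slice for the `memcpy` call reaches back into `main` through the call edge and leaves out the unrelated logging statement, which is the kind of cross-function context a single-function representation would miss.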


