An attention-based automatic vulnerability detection approach with GGNN

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Vulnerability detection has long been an important issue in software security. Existing methods mainly rely on experts to define the rules and features of vulnerabilities, which is time-consuming and laborious and usually yields poor accuracy. Automatic vulnerability detection methods based on code representation graphs and Graph Neural Networks (GNNs) have therefore been proposed; they effectively capture both the semantic and structural information of the source code and show better performance. However, these methods ignore redundant information in the graph and in the GNN model, so their performance remains unsatisfactory. To alleviate this problem, we propose an attention-based automatic vulnerability detection approach with a Gated Graph Sequence Neural Network (GGNN). First, we introduce two preprocessing methods, pruning and symbolization, to reduce the redundant information in the input code representation graph, and then feed the graph into the GGNN layer to update the node features. Next, key subgraph extraction and global feature aggregation are performed by attention-based pooling layers. Finally, the classification result is obtained through a linear classifier. The experimental results show the effectiveness of the proposed preprocessing methods and attention-based pooling layers, in particular higher accuracy and F1-score than state-of-the-art automatic vulnerability detection approaches.
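
The pipeline described in the abstract (GGNN layer, attention-based pooling, linear classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TinyGGNNClassifier`, the parameters `steps` and `keep_ratio`, the linear node scorer, and the randomly initialized (untrained) weights are all assumptions made for illustration, and the pruning and symbolization preprocessing steps are omitted. It uses only the Python standard library.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyGGNNClassifier:
    """Hypothetical sketch: gated propagation, top-k attention pooling,
    attention readout, linear classifier. Weights are random, not trained."""

    def __init__(self, dim, steps=2, keep_ratio=0.5, seed=0):
        rng = random.Random(seed)
        def vec():
            return [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        def mat():
            return [vec() for _ in range(dim)]
        self.dim, self.steps, self.keep_ratio = dim, steps, keep_ratio
        names = ("msg", "z_m", "z_h", "r_m", "r_h", "c_m", "c_h")
        self.W = {name: mat() for name in names}
        self.score_w, self.read_w, self.out_w = vec(), vec(), vec()
        self.out_b = 0.0

    def propagate(self, h, edges):
        # GGNN-style update: sum messages over incoming edges, then apply a GRU cell.
        for _ in range(self.steps):
            m = [[0.0] * self.dim for _ in h]
            for src, dst in edges:
                m[dst] = add(m[dst], matvec(self.W["msg"], h[src]))
            new_h = []
            for hv, mv in zip(h, m):
                z = [sigmoid(a) for a in add(matvec(self.W["z_m"], mv), matvec(self.W["z_h"], hv))]
                r = [sigmoid(a) for a in add(matvec(self.W["r_m"], mv), matvec(self.W["r_h"], hv))]
                rh = [ri * hi for ri, hi in zip(r, hv)]
                c = [math.tanh(a) for a in add(matvec(self.W["c_m"], mv), matvec(self.W["c_h"], rh))]
                new_h.append([(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, hv, c)])
            h = new_h
        return h

    def pool(self, h):
        # Key-subgraph extraction (simplified): score nodes, keep the top fraction,
        # and gate the kept node features by their scores.
        scores = [math.tanh(dot(self.score_w, hv)) for hv in h]
        k = max(1, math.ceil(self.keep_ratio * len(h)))
        kept = sorted(range(len(h)), key=lambda i: scores[i], reverse=True)[:k]
        return kept, [[scores[i] * x for x in h[i]] for i in kept]

    def readout(self, h):
        # Global feature aggregation: softmax-weighted sum of node features.
        logits = [dot(self.read_w, hv) for hv in h]
        top = max(logits)
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        return [sum(e / total * hv[d] for e, hv in zip(exps, h)) for d in range(self.dim)]

    def predict_proba(self, features, edges):
        h = self.propagate(features, edges)
        kept, h = self.pool(h)
        graph_vec = self.readout(h)
        return sigmoid(dot(self.out_w, graph_vec) + self.out_b), kept

# Toy graph: 4 nodes with 3-dimensional features and directed edges (src, dst).
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
model = TinyGGNNClassifier(dim=3)
prob, kept = model.predict_proba(features, edges)
print(f"kept nodes: {kept}, P(vulnerable) = {prob:.3f}")
```

With `keep_ratio=0.5`, half of the nodes survive pooling, which mirrors the idea of discarding redundant nodes before classification. In a real system the node scorer would typically be a learned graph layer and the weights would be trained on labeled code graphs.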


Acknowledgements

This work was supported by Nature Science Foundation of China (Grant No. 61872104), Fundamental Research Funds for the Central Universities in China (Grant No. 3072020CF0603), National Natural Science Foundation of China (Grant No. 62106150), CAAC Key Laboratory of Civil Aviation Wide Surveillance and Safety Operation Management and Control Technology (Grant No. 202102), and CCF-NSFOCUS (Grant No. 2021001).

Author information

Corresponding author

Correspondence to Huiqiang Wang.

About this article

Cite this article

Tang, G., Yang, L., Zhang, L. et al. An attention-based automatic vulnerability detection approach with GGNN. Int. J. Mach. Learn. & Cyber. 14, 3113–3127 (2023). https://doi.org/10.1007/s13042-023-01824-7

  • DOI: https://doi.org/10.1007/s13042-023-01824-7
