DOI: 10.1145/3533767.3534371
research-article

Path-sensitive code embedding via contrastive learning for software vulnerability detection

Published: 18 July 2022

Abstract

Machine learning, and deep learning in particular, has shown success in a wide range of application domains. Recently, much effort has gone into applying deep learning techniques (e.g., graph neural networks) to static vulnerability detection as an alternative to conventional bug detection methods. To capture the structural information of code, current learning approaches typically abstract a program as graphs (e.g., data-flow graphs, abstract syntax trees) and then train a classification model on the (sub)graphs of safe and vulnerable code fragments for vulnerability prediction. However, these models remain insufficient for precise bug detection, because their objective is to produce classification results rather than to comprehend the semantics of vulnerabilities, e.g., to pinpoint bug-triggering paths, which is essential for static bug detection.
This paper presents ContraFlow, a selective yet precise contrastive value-flow embedding approach to statically detect software vulnerabilities. The novelty of ContraFlow lies in selecting and preserving feasible value-flow (aka program dependence) paths through a pretrained path embedding model using self-supervised contrastive learning, thus significantly reducing the amount of labeled data required for training expensive downstream models for path-based vulnerability detection. We evaluated ContraFlow on 288 real-world projects, comparing it against eight recent learning-based approaches. ContraFlow outperforms these eight baselines by up to 334.1%, 317.9%, and 58.3% for informedness, markedness, and F1 score, respectively, and achieves up to 450.0%, 192.3%, and 450.0% improvement for mean statement recall, mean statement precision, and mean IoU, respectively, in locating buggy statements.
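
The abstract does not define the metrics it reports. The sketch below uses their standard textbook definitions and may differ in detail from the paper's evaluation scripts. The names `Confusion`, `informedness`, `markedness`, `f1_score`, and `statement_scores` are hypothetical, introduced only for illustration, and the numbers in the usage lines are made up rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from a binary (vulnerable vs. safe) classification run."""
    tp: int  # vulnerable samples flagged as vulnerable
    fp: int  # safe samples flagged as vulnerable
    tn: int  # safe samples flagged as safe
    fn: int  # vulnerable samples flagged as safe


def _ratio(num: float, den: float) -> float:
    # Assumption: an undefined ratio (empty denominator) is reported as 0.0.
    return num / den if den else 0.0


def informedness(c: Confusion) -> float:
    # True positive rate + true negative rate - 1.
    return _ratio(c.tp, c.tp + c.fn) + _ratio(c.tn, c.tn + c.fp) - 1.0


def markedness(c: Confusion) -> float:
    # Positive predictive value + negative predictive value - 1.
    return _ratio(c.tp, c.tp + c.fp) + _ratio(c.tn, c.tn + c.fn) - 1.0


def f1_score(c: Confusion) -> float:
    # Harmonic mean of precision and recall, written in count form.
    return _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn)


def statement_scores(predicted: set, actual: set) -> dict:
    """Statement-level localization scores for one sample.

    `predicted` holds the line numbers a detector reports as buggy and
    `actual` holds the ground-truth buggy line numbers.
    """
    overlap = len(predicted & actual)
    return {
        "recall": _ratio(overlap, len(actual)),
        "precision": _ratio(overlap, len(predicted)),
        "iou": _ratio(overlap, len(predicted | actual)),  # Jaccard index
    }


# Illustrative inputs only.
counts = Confusion(tp=40, fp=10, tn=90, fn=10)
scores = statement_scores({3, 4, 5}, {4, 5, 6, 7})
```

The "mean" variants reported in the abstract would be these per-sample statement scores averaged over all evaluated samples, under the same assumptions.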

Published In

ISSTA 2022: Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2022
808 pages
ISBN: 9781450393799
DOI: 10.1145/3533767
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Path sensitive
  2. code embedding
  3. contrastive learning
  4. vulnerabilities

Qualifiers

  • Research-article

Conference

ISSTA '22

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%
