
Specializing Neural Networks for Cryptographic Code Completion Applications

Published: 01 June 2023

Abstract

Similarities between natural languages and programming languages have prompted researchers to apply neural network models to software problems, such as code generation and repair. However, program-specific characteristics pose unique prediction challenges that require the design of new and specialized neural network solutions. In this work, we identify new prediction challenges in application programming interface (API) completion tasks and find that existing solutions are unable to capture complex dependencies in program semantics and structures. We design a new neural network model, Multi-HyLSTM, to overcome the newly identified challenges and comprehend complex dependencies between API calls. Our neural network is empowered with a specialized dataflow analysis that extracts multiple global API dependence paths for neural network prediction. We evaluate Multi-HyLSTM on 64,478 Android apps, predicting 774,460 Java cryptographic API calls that are usually challenging for developers to use correctly. Multi-HyLSTM achieves a top-1 API completion accuracy of 98.99%. Moreover, we show the effectiveness of our design choices through an ablation study and have released our dataset.
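To make the completion task concrete, below is a minimal illustrative Java snippet (our example, not code from the paper) showing the kind of cryptographic API dependence path that a model like Multi-HyLSTM must follow: each call consumes objects produced by earlier calls, so the correct next API often depends on context several statements away rather than on the immediately preceding tokens.

    import java.security.SecureRandom;
    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;

    public class AesEncryptExample {
        static byte[] encrypt(byte[] plaintext) throws Exception {
            // Dataflow forms a dependence path: algorithm name -> key -> cipher -> doFinal.
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(256, new SecureRandom());
            SecretKey key = keyGen.generateKey();

            // The transformation string must match the key's algorithm; predicting
            // it correctly requires tracking a dependence that is not local.
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key);

            // A completion model's target here: doFinal is valid only after init.
            return cipher.doFinal(plaintext);
        }
    }

For GCM decryption, the randomly generated IV from cipher.getIV() would also need to be retained, a long-range dependence of the sort that motivates extracting multiple global API dependence paths.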


Cited By

  • (2024) Measurement of Embedding Choices on Cryptographic API Completion Tasks. ACM Transactions on Software Engineering and Methodology, vol. 33, no. 3, pp. 1–30, Mar. 2024. https://doi.org/10.1145/3625291


Published In

IEEE Transactions on Software Engineering, Volume 49, Issue 6, June 2023 (316 pages)

Publisher

IEEE Press


