Example-based vulnerability detection and repair in Java code

Authors: Md Mahir Asef Kabir, Danfeng (Daphne) Yao, and Na Meng

ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension

May 2022

Pages 190 - 201

https://doi.org/10.1145/3524610.3527895

Published: 20 October 2022

Abstract

The Java libraries JCA and JSSE offer cryptographic APIs to facilitate secure coding. When developers misuse these APIs, their code becomes vulnerable to cyber-attacks. To eliminate such vulnerabilities, researchers have built tools that detect security-API misuses via pattern matching. However, most tools neither (1) fix the misuses they find nor (2) allow users to extend the tools' pattern sets. To overcome both limitations, we created Seader, an example-based approach to detect and repair security-API misuses. Given an exemplar (insecure, secure) code pair, Seader compares the two snippets to infer an API-misuse template and the corresponding fixing edit. Given a program, Seader then uses the inferred template and edit to perform inter-procedural static analysis that searches for security-API misuses and proposes customized fixes.
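
To make the idea of an exemplar (insecure, secure) pair concrete, the sketch below shows what one such pair might look like for the JCA class `MessageDigest`. It is an illustrative assumption, not one of the pairs used in the paper: the class name `DigestExample` and both method names are hypothetical, and the choice of MD5 versus SHA-256 is only one plausible example of a misuse and its fix.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical exemplar pair; all names here are illustrative assumptions.
final class DigestExample {

    // Insecure snippet: MD5 is collision-prone and unsuitable for security use.
    static byte[] insecureHash(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Secure snippet: identical except for the algorithm-name argument.
    static byte[] secureHash(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The two digests differ in length: 16 bytes for MD5, 32 for SHA-256.
        System.out.println(insecureHash("hello").length);
        System.out.println(secureHash("hello").length);
    }
}
```

The two methods differ only in the string passed to `MessageDigest.getInstance`. For a pair shaped like this, an example-based approach could plausibly infer a template along the lines of "`getInstance` called with the constant `"MD5"`" and a fix that replaces that constant; how Seader actually represents templates and edits is described in the paper, not here.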

For evaluation, we applied Seader to 28 (insecure, secure) code pairs; Seader successfully inferred 21 unique API-misuse templates and related fixes. With these (vulnerability, fix) patterns, we applied Seader to a program benchmark that has 86 known vulnerabilities. Seader detected vulnerabilities with 95% precision, 72% recall, and 82% F-score. We also applied Seader to 100 open-source projects and manually checked 77 suggested repairs; 76 of the repairs were correct. Seader can help developers correctly use security APIs.
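
As a quick consistency check on the reported detection numbers, the sketch below recomputes the F-score from the stated precision and recall. It assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the class name `FScoreCheck` is hypothetical.

```java
import java.util.Locale;

// Hypothetical helper; assumes "F-score" means the balanced F1 measure.
final class FScoreCheck {

    // Harmonic mean of precision and recall.
    static double fScore(double precision, double recall) {
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Precision 95% and recall 72%, as reported in the abstract.
        double f = fScore(0.95, 0.72);
        System.out.println(String.format(Locale.ROOT, "%.2f", f)); // prints 0.82
    }
}
```

Under that assumption, 2 × 0.95 × 0.72 / (0.95 + 0.72) ≈ 0.819, which rounds to the reported 82%.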

Index Terms

1. Security and privacy
  1. Software and application security
    1. Software security engineering
2. Software and its engineering
  1. Software notations and tools
    1. Software maintenance tools

Published In

ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension

May 2022

698 pages

ISBN:9781450392983

DOI:10.1145/3524610

Conference Chairs: Ayushi Rastogi (University of Groningen, The Netherlands); Rosalia Tufano (USI Università della Svizzera italiana, Switzerland)

General Chair: Gabriele Bavota (USI Università della Svizzera italiana, Switzerland)

Program Chairs: Venera Arnaoudova (Washington State University, United States of America); Sonia Haiduc (Florida State University, United States of America)

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

NSF (National Science Foundation)

Conference

ICPC '22

Sponsor:

SIGSOFT

ICPC '22: 30th International Conference on Program Comprehension

May 16 - 17, 2022

Virtual Event
