DOI: 10.1145/3196398.3196454
short-paper

VulinOSS: a dataset of security vulnerabilities in open-source systems

Published: 28 May 2018

Abstract

Examining the characteristics of open-source software in relation to security vulnerabilities can provide the research community with findings that lead to the development of more secure systems. We present a dataset in which the reported vulnerabilities of 8694 open-source project versions can be correlated with the corresponding source code and a number of software metrics. The metrics were obtained by analyzing each project's source code with well-established tools. Apart from commonly used metrics (e.g., lines of code), we also provide data related to modern development trends such as continuous integration and testing. We outline motivational examples based on the dataset.

Published In

MSR '18: Proceedings of the 15th International Conference on Mining Software Repositories
May 2018
627 pages
ISBN: 9781450357166
DOI: 10.1145/3196398

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. continuous integration
  2. open-source software
  3. security vulnerabilities
  4. testing

Qualifiers

  • Short-paper

Conference

ICSE '18


Cited By

  • (2024) Sharing Software-Evolution Datasets: Practices, Challenges, and Recommendations. Proceedings of the ACM on Software Engineering 1:FSE (2051-2074). DOI: 10.1145/3660798. Online publication date: 12-Jul-2024
  • (2024) JLeaks: A Featured Resource Leak Repository Collected From Hundreds of Open-Source Java Projects. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-13). DOI: 10.1145/3597503.3639162. Online publication date: 20-May-2024
  • (2024) The impact of hard and easy negative training data on vulnerability prediction performance. Journal of Systems and Software 211:C. DOI: 10.1016/j.jss.2024.112003. Online publication date: 2-Jul-2024
  • (2024) VALIDATE. Information and Software Technology 170:C. DOI: 10.1016/j.infsof.2024.107448. Online publication date: 9-Jul-2024
  • (2023) Understanding and Tackling Label Errors in Deep Learning-Based Vulnerability Detection (Experience Paper). Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (52-63). DOI: 10.1145/3597926.3598037. Online publication date: 12-Jul-2023
  • (2023) A Comprehensive Dataset Towards Hands-on Experience Enhancement in a Research-Involved Cybersecurity Program. Proceedings of the 24th Annual Conference on Information Technology Education (118-124). DOI: 10.1145/3585059.3611416. Online publication date: 11-Oct-2023
  • (2023) Empirical Validation of Automated Vulnerability Curation and Characterization. IEEE Transactions on Software Engineering 49:5 (3241-3260). DOI: 10.1109/TSE.2023.3250479. Online publication date: 1-May-2023
  • (2023) Neural Transfer Learning for Repairing Security Vulnerabilities in C Code. IEEE Transactions on Software Engineering 49:1 (147-165). DOI: 10.1109/TSE.2022.3147265. Online publication date: 1-Jan-2023
  • (2023) Vulnerability of Open-Source Face Recognition Systems to Blackbox Attacks: A Case Study with InsightFace. 2023 IEEE Symposium Series on Computational Intelligence (SSCI) (1164-1169). DOI: 10.1109/SSCI52147.2023.10371801. Online publication date: 5-Dec-2023
  • (2023) It's like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security. 2023 IEEE Symposium on Security and Privacy (SP) (1527-1544). DOI: 10.1109/SP46215.2023.10179320. Online publication date: May-2023
