DOI: 10.1145/3524842.3528486
short-paper

An alternative issue tracking dataset of public Jira repositories

Published: 17 October 2022

Abstract

Organisations use issue tracking systems (ITSs) to track and document their projects' work in units called issues. This style of documentation encourages evolutionary refinement, as each issue can be independently improved, commented on, linked to other issues, and progressed through the organisational workflow. Commonly studied ITSs so far include GitHub, GitLab, and Bugzilla, while Jira, one of the most popular ITSs in practice with a wealth of additional information, has yet to receive similar attention. Unfortunately, diverse public Jira datasets are rare, likely due to the difficulty of finding and accessing these repositories. With this paper, we release a dataset of 16 public Jira repositories with 1822 projects, spanning 2.7 million issues with a combined total of 32 million changes, 9 million comments, and 1 million issue links. We believe this Jira dataset will lead to many fruitful research projects investigating issue evolution, issue linking, cross-project analysis, and cross-tool analysis when combined with existing well-studied ITS datasets.

Information

Published In

MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories
May 2022
815 pages
ISBN: 9781450393034
DOI: 10.1145/3524842
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Short-paper

Funding Sources

  • European Union Horizon 2020 Research and Innovation programme
  • Natural Sciences and Engineering Research Council of Canada (NSERC)

Conference

MSR '22

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 119
  • Downloads (last 6 weeks): 11
Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) Impact of data quality for automatic issue classification using pre-trained language models. Journal of Systems and Software 210:C. DOI: 10.1016/j.jss.2023.111838. Online publication date: 25-Jun-2024
  • (2024) Issue Links Retrieval for New Issues in Issue Tracking Systems. Natural Language Processing and Information Systems (126-138). DOI: 10.1007/978-3-031-70242-6_13. Online publication date: 20-Sep-2024
  • (2023) Enhancing Software Project Monitoring with Multidimensional Data Repository Mining. Electronics 12:18 (3774). DOI: 10.3390/electronics12183774. Online publication date: 6-Sep-2023
  • (2023) Duplicate Bug Report Detection: How Far Are We? ACM Transactions on Software Engineering and Methodology 32:4 (1-32). DOI: 10.1145/3576042. Online publication date: 27-May-2023
  • (2023) Technical Debt Classification in Issue Trackers using Natural Language Processing based on Transformers. 2023 ACM/IEEE International Conference on Technical Debt (TechDebt) (92-101). DOI: 10.1109/TechDebt59074.2023.00017. Online publication date: May-2023
  • (2023) Process Mining from Jira Issues at a Large Company. 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME) (425-435). DOI: 10.1109/ICSME58846.2023.00055. Online publication date: 1-Oct-2023
  • (2023) TLDBERT: Leveraging Further Pre-Trained Model for Issue Typed Links Detection. 2023 30th Asia-Pacific Software Engineering Conference (APSEC) (594-598). DOI: 10.1109/APSEC60848.2023.00077. Online publication date: 4-Dec-2023
  • (2023) On understanding and predicting issue links. Requirements Engineering 28:4 (541-565). DOI: 10.1007/s00766-023-00406-x. Online publication date: 21-Sep-2023
  • (2023) Requirements quality research: a harmonized theory, evaluation, and roadmap. Requirements Engineering 28:4 (507-520). DOI: 10.1007/s00766-023-00405-y. Online publication date: 12-Aug-2023
  • (2023) Maestro: A Deep Learning Based Tool to Find and Explore Architectural Design Decisions in Issue Tracking Systems. Software Architecture. ECSA 2023 Tracks, Workshops, and Doctoral Symposium (390-405). DOI: 10.1007/978-3-031-66326-0_24. Online publication date: 18-Sep-2023
