DOI: 10.1145/3611643.3613887
research-article

Issue Report Validation in an Industrial Context

Published: 30 November 2023

Abstract

Effective issue triaging is crucial for software development teams to improve software quality and, in turn, customer satisfaction. Validating issue reports manually is time-consuming and slows down the triaging process as a whole. This paper presents an approach to automating the validation of issue reports in order to accelerate issue triaging in an industrial setting. We work on 1,200 randomly selected issue reports from the banking domain, written in Turkish. Turkish is an agglutinative language: new words are formed by linearly concatenating suffixes, so that a single word can express an entire sentence. We manually label these reports for validity and extract the patterns that indicate a report is invalid. Because the reports are written in an agglutinative language, we use morphological analysis to extract the features. Using the proposed feature extractors, we apply a machine-learning-based approach to predict the validity of issue reports, achieving an F1-score of 0.77.
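For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-score can be computed for a binary valid/invalid classifier. The labels, the choice of "invalid" as the positive class, and the function name `f1_score` are assumptions made for illustration. The abstract does not state how the 0.77 figure was averaged or which class was treated as positive, and the data below is not taken from the paper.

```python
def f1_score(y_true, y_pred, positive="invalid"):
    """Return the F1-score (harmonic mean of precision and recall) for one class.

    Assumption: `positive` names the class of interest; the paper's own
    evaluation protocol may differ.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: reports correctly predicted as the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: reports wrongly predicted as the positive class.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: positive-class reports the classifier missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five issue reports (illustrative only).
y_true = ["invalid", "invalid", "valid", "valid", "invalid"]
y_pred = ["invalid", "valid", "valid", "invalid", "invalid"]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this toy example, precision and recall are both 2/3, so the F1-score is also 2/3. A score of 0.77 therefore reflects a classifier whose precision and recall are, in harmonic-mean terms, somewhat better balanced or higher than in this illustration.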


Published In

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2023
2215 pages
ISBN:9798400703270
DOI:10.1145/3611643

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. automated issue classification
  2. issue report validation
  3. text analysis

Conference

ESEC/FSE '23
Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%
