DOI: 10.1145/3530019.3530021
research-article

Automatically Capturing Quality-Related Concerns in Bug Report Descriptions for Efficient Bug Triaging

Published: 13 June 2022

Abstract

In the early phases of a project, software architects and developers design solutions to satisfy quality concerns. However, as a byproduct of long-term maintenance effort, qualities tend to erode, causing quality-related bugs to surface across the codebase. Quality-related concerns can be expensive and difficult to detect, and they can also prevent the system from operating as intended. Moreover, they can directly affect users’ experiences at large. To address this problem, we build a quality-based bug classifier that combines several feature selection techniques (TF-IDF, Chi-square (χ²), Mutual Information, and Extra Randomized Trees) with various machine learning algorithms. Our results indicate that Random Forest with the (TF-IDF+χ²) configuration achieved the best results for detecting six quality-related types, with a precision of 76%, recall of 70%, and F1 of 70%. However, the same approach returned a low precision of 48%, recall of 15%, and F1 of 23% for detecting functional-related bugs. We argue that this low performance stems from overlapping content between functional and quality-related information, which opens another challenging topic that we aim to explore in future work.
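To make the χ² feature selection step named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes the standard 2x2 contingency formulation of χ² used in text categorization (term present/absent versus class/other), and the toy reports, the labels, and the function name `chi_square_scores` are all hypothetical. TF-IDF weighting and the Random Forest classifier are left out.

```python
from collections import Counter

# Hypothetical toy bug reports; not taken from the paper's dataset.
REPORTS = [
    ("page load is slow and memory usage grows", "performance"),
    ("slow query makes the dashboard freeze", "performance"),
    ("password stored as plain text inside the log", "security"),
    ("token leak exposes the password via the url", "security"),
]


def chi_square_scores(reports, target):
    """Score each term by the 2x2 chi-square statistic against `target`."""
    n = len(reports)
    n_target = sum(1 for _, label in reports if label == target)
    df_all = Counter()     # number of reports containing the term
    df_target = Counter()  # number of target-class reports containing it
    for text, label in reports:
        for term in set(text.split()):
            df_all[term] += 1
            if label == target:
                df_target[term] += 1

    scores = {}
    for term, with_term in df_all.items():
        a = df_target[term]     # target class, term present
        b = with_term - a       # other classes, term present
        c = n_target - a        # target class, term absent
        d = n - with_term - c   # other classes, term absent
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        # A term present in every report carries no signal; score it 0.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


scores = chi_square_scores(REPORTS, "performance")
# Rank by descending score, breaking ties alphabetically.
top_terms = sorted(scores, key=lambda t: (-scores[t], t))[:2]
```

On this toy data both "slow" and "password" receive the maximum score, because χ² measures dependence in either direction: a term that never occurs in the target class is as informative as one that always does. In a pipeline like the one the abstract describes, only the highest-scoring terms would be kept as input features for the classifier.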




Published In

ACM Other conferences
EASE '22: Proceedings of the 26th International Conference on Evaluation and Assessment in Software Engineering
June 2022
466 pages
ISBN:9781450396134
DOI:10.1145/3530019
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Bug Reports
  2. Classification
  3. Feature Selection
  4. Quality Concerns

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EASE 2022

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%

Article Metrics

  • Downloads (last 12 months): 29
  • Downloads (last 6 weeks): 4

Reflects downloads up to 09 Nov 2024.

Cited By

  • (2023) A Survey on Bug Deduplication and Triage Methods from Multiple Points of View. Applied Sciences 13:15 (8788). DOI: 10.3390/app13158788. Online publication date: 29-Jul-2023
  • (2023) Similar Bug Reports Recommendation System using BERT. Proceedings of the XXXVII Brazilian Symposium on Software Engineering (378-387). DOI: 10.1145/3613372.3613396. Online publication date: 25-Sep-2023
  • (2023) Capturing Contextual Relationships of Buggy Classes for Detecting Quality-Related Bugs. 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME) (375-379). DOI: 10.1109/ICSME58846.2023.00048. Online publication date: 1-Oct-2023
  • (2023) A multi-model framework for semantically enhancing detection of quality-related bug report descriptions. Empirical Software Engineering 28:2. DOI: 10.1007/s10664-022-10280-w. Online publication date: 11-Feb-2023
