Bug Severity Prediction Using a Hierarchical One-vs.-Remainder Approach

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2019)

Abstract

Assigning a severity level to reported bugs is a critical part of software maintenance, as it helps ensure an efficient resolution process. In many bug trackers, e.g. Bugzilla, this is time-consuming because bug reporters must manually assign one of seven severity levels to each bug. In addition, some bug types may be reported more often than others, leading to a disproportionate distribution of severity labels. Machine learning techniques can be used to predict the label of a newly reported bug automatically. However, learning from imbalanced data in a multi-class task remains one of the major difficulties for machine learning classifiers. In this paper, we propose a hierarchical classification approach that exploits the class imbalance in the training data to reduce classification bias. Specifically, we designed a classification tree that consists of multiple binary classifiers organised hierarchically, such that instances from the most dominant class are trained against the remaining classes but are not used for training the next level of the classification tree. We used the FastText classifier to compare the hierarchical and standard classification approaches. On 93,051 bug reports from 38 Eclipse open-source products, the hierarchical approach was shown to perform relatively well, with a Micro F-Score of 65% and a Macro F-Score of 45%.
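The hierarchical one-vs.-remainder scheme described in the abstract can be illustrated with a minimal Python sketch. This is one reading of the abstract, not the authors' implementation: the function names `train_hierarchy` and `predict_hierarchy` are hypothetical, the tree is assumed to be a simple chain with one binary classifier per level, and `TokenVoteClassifier` is a toy stand-in used only so that the example runs without FastText.

```python
from collections import Counter


class TokenVoteClassifier:
    """Toy binary classifier (hypothetical stand-in for FastText).

    Scores a text by how often its tokens occurred in positive versus
    negative training texts, normalised by the size of each side.
    """

    def fit(self, texts, is_positive):
        self.pos, self.neg = Counter(), Counter()
        for text, flag in zip(texts, is_positive):
            (self.pos if flag else self.neg).update(text.lower().split())
        return self

    def predict(self, text):
        pos_total = sum(self.pos.values()) or 1
        neg_total = sum(self.neg.values()) or 1
        score = sum(
            self.pos[tok] / pos_total - self.neg[tok] / neg_total
            for tok in text.lower().split()
        )
        return score > 0


def train_hierarchy(texts, labels, make_classifier=TokenVoteClassifier):
    """Train one binary classifier per level (assumes non-empty input).

    At each level the dominant (largest) class is trained against all
    remaining classes, then its instances are dropped before the next level.
    """
    remaining = list(zip(texts, labels))
    levels = []
    while len({label for _, label in remaining}) > 1:
        dominant = Counter(label for _, label in remaining).most_common(1)[0][0]
        clf = make_classifier().fit(
            [text for text, _ in remaining],
            [label == dominant for _, label in remaining],
        )
        levels.append((dominant, clf))
        remaining = [(t, lab) for t, lab in remaining if lab != dominant]
    fallback = remaining[0][1]  # the single class left at the bottom
    return levels, fallback


def predict_hierarchy(model, text):
    """Walk down the chain; the first level that accepts the text wins."""
    levels, fallback = model
    for label, clf in levels:
        if clf.predict(text):
            return label
    return fallback


# Made-up illustrative data, not taken from the Eclipse dataset.
texts = [
    "button colour wrong", "typo in label", "menu misaligned",
    "crash on startup", "crash when saving", "add dark theme",
]
labels = ["normal", "normal", "normal", "critical", "critical", "enhancement"]

model = train_hierarchy(texts, labels)
print([label for label, _ in model[0]], model[1])
print(predict_hierarchy(model, "crash on exit"))
```

Because each level only sees the instances that the levels above it did not claim, the classifier for a small class is never trained against the much larger dominant class. Any binary classifier exposing `fit` and `predict` in this shape could be passed as `make_classifier`.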

Notes

  1. www.bugzilla.org.

  2. promise.site.uottawa.ca/SERepository.

  3. Bug reports were downloaded from 38 Eclipse related products.

  4. Dominance refers to size. A dominant class contains more instances than another.

  5. emorynlp.github.io/nlp4j.

Acknowledgments

This research work is part of the CROSSMINER Project, which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 732223.

Author information

Corresponding author

Correspondence to Yannis Korkontzelos.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Nnamoko, N., Cabrera-Diego, L.A., Campbell, D., Korkontzelos, Y. (2019). Bug Severity Prediction Using a Hierarchical One-vs.-Remainder Approach. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science, vol 11608. Springer, Cham. https://doi.org/10.1007/978-3-030-23281-8_20

  • DOI: https://doi.org/10.1007/978-3-030-23281-8_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23280-1

  • Online ISBN: 978-3-030-23281-8

  • eBook Packages: Computer Science, Computer Science (R0)
