
Comparing and experimenting machine learning techniques for code smell detection

Published: 01 June 2016

Abstract

Several code smell detection tools have been developed, but they provide different results, because smells can be subjectively interpreted, and hence detected, in different ways. In this paper, we perform, to the best of our knowledge, the largest experiment to date applying machine learning algorithms to code smell detection. We experiment with 16 different machine-learning algorithms on four code smells (Data Class, Large Class, Feature Envy, Long Method) across 74 software systems, with 1986 manually validated code smell samples. We found that all algorithms achieved high performance on the cross-validation data set; the highest performance was obtained by J48 and Random Forest, while the worst was achieved by support vector machines. However, the lower prevalence of code smells, i.e., imbalanced data, in the entire data set caused varying performance that needs to be addressed in future studies. We conclude that applying machine learning to the detection of these code smells can provide high accuracy (>96 %), and that only a hundred training examples are needed to reach at least 95 % accuracy.
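The evaluation protocol the abstract describes — training a classifier on manually validated smell samples and estimating its accuracy by k-fold cross-validation — can be sketched as follows. This is a minimal illustration on synthetic data: the one-metric decision stump, the LOC feature, and the 500-line "Large Class" threshold are illustrative assumptions, not the paper's actual setup (the study trains 16 algorithms such as J48 and Random Forest on real metric vectors).

```python
import random

def k_fold_accuracy(samples, k=10, seed=0):
    """Estimate classifier accuracy via k-fold cross-validation.

    samples: list of (loc, is_large_class) pairs -- a single size
    metric plus a manually validated smell label (hypothetical data).
    """
    rng = random.Random(seed)
    data = samples[:]
    rng.shuffle(data)
    folds = [data[i::k] for i in range(k)]  # k roughly equal folds
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        # "Train" a one-metric decision stump: pick the LOC threshold
        # that best separates smelly from clean classes on the train split.
        best_t, best_acc = 0, 0.0
        for t in sorted({loc for loc, _ in train}):
            acc = sum((loc >= t) == label for loc, label in train) / len(train)
            if acc > best_acc:
                best_t, best_acc = t, acc
        # Evaluate the stump on the held-out fold.
        accuracies.append(
            sum((loc >= best_t) == label for loc, label in test) / len(test)
        )
    return sum(accuracies) / k

# Synthetic, separable toy data: "large" classes have >= 500 LOC.
locs = [50, 120, 200, 300, 450, 520, 600, 800, 950, 1100] * 10
data = [(loc, loc >= 500) for loc in locs]
print(round(k_fold_accuracy(data), 2))  # -> 1.0 on this separable toy data
```

Real smell data is far less clean than this toy set: as the abstract notes, the low prevalence of smells makes the full data set imbalanced, so plain accuracy over-rewards the majority ("clean") class and cross-validated scores can overstate real-world performance.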



    Published In

    Empirical Software Engineering  Volume 21, Issue 3
    June 2016
    688 pages

    Publisher

    Kluwer Academic Publishers

    United States

    Author Tags

    1. Benchmark for code smell detection
    2. Code smells detection
    3. Machine learning techniques

    Cited By

    • (2024) iSMELL: Assembling LLMs with Expert Toolsets for Code Smell Detection and Refactoring. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 1345-1357. doi:10.1145/3691620.3695508
    • (2024) A rule-based approach for the identification of quality improvement opportunities in GRL models. Software Quality Journal 32(3):1007-1037. doi:10.1007/s11219-024-09679-z
    • (2024) Automatic detection of code smells using metrics and CodeT5 embeddings: a case study in C#. Neural Computing and Applications 36(16):9203-9220. doi:10.1007/s00521-024-09551-y
    • (2023) Deep Learning Based Feature Envy Detection Boosted by Real-World Examples. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 908-920. doi:10.1145/3611643.3616353
    • (2023) Fusion of deep convolutional and LSTM recurrent neural networks for automated detection of code smells. Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering, 229-234. doi:10.1145/3593434.3593476
    • (2023) An Approach Based on Machine Learning for Predicting Software Design Problems. Proceedings of the XIX Brazilian Symposium on Information Systems, 53-60. doi:10.1145/3592813.3592888
    • (2022) Construction of Intelligent Evaluation Model of English Composition Based on Machine Learning. Mobile Information Systems 2022. doi:10.1155/2022/3499799
    • (2022) EL-CodeBert: Better Exploiting CodeBert to Support Source Code-Related Classification Tasks. Proceedings of the 13th Asia-Pacific Symposium on Internetware, 147-155. doi:10.1145/3545258.3545260
    • (2022) How to improve deep learning for software analytics. Proceedings of the 19th International Conference on Mining Software Repositories, 156-166. doi:10.1145/3524842.3528458
    • (2022) Learning to predict test effectiveness. International Journal of Intelligent Systems 37(8):4363-4392. doi:10.1002/int.22722