research-article

Actionable code smell identification with fusion learning of metrics and semantics

Authors:

Yihang Xu

Volume 236, Issue C

https://doi.org/10.1016/j.scico.2024.103110

Published: 18 July 2024

Abstract

Code smell detection is an essential task in software engineering. Whether a code snippet contains a code smell is a subjective judgment that varies by programming language, developer, and development method. Moreover, developers tend to focus on code smells that have a real impact on development and ignore insignificant ones. However, existing static code analysis tools and code smell detection approaches exhibit a high false positive rate, so insignificant smells drown out the smells that developers value. Accurately reporting the actionable code smells that developers are willing to spend effort refactoring can therefore keep developers from getting lost in a sea of smells and improve refactoring efficiency. In this paper, we aim to detect actionable code smells that developers tend to refactor. Specifically, we first collect actionable and non-actionable code smells from projects with numerous historical versions to construct our datasets. Then, we propose a dual-stream model for fusion learning of code metrics and code semantics to detect actionable code smells. On the one hand, code metrics quantify the code's structure and even some rules or patterns, providing fundamental information for detecting code smells. On the other hand, code semantics encompass information about developers' refactoring tendencies, which proves valuable in detecting actionable code smells. Extensive experiments show that our approach detects actionable code smells more accurately than existing approaches.
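The abstract does not describe the architecture beyond "dual-stream" and "fusion learning", so the following is only a minimal sketch of one common reading: each input (a vector of code metrics and a vector standing in for a learned code-semantics embedding) passes through its own branch, and the two branch outputs are concatenated before a shared classification head. The class name `DualStreamClassifier`, the layer sizes, the random untrained weights, and the choice of concatenation as the fusion step are all assumptions made for illustration, not the authors' model.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(n_in, n_out, rng):
    """Small random weights; a real model would learn these from data."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class DualStreamClassifier:
    """Hypothetical two-branch model: a metrics stream and a semantics stream,
    fused by concatenation (an assumed fusion strategy)."""

    def __init__(self, n_metrics, n_semantic, hidden=4, seed=0):
        rng = random.Random(seed)
        self.metric_layer = init_layer(n_metrics, hidden, rng)
        self.semantic_layer = init_layer(n_semantic, hidden, rng)
        self.head = init_layer(2 * hidden, 1, rng)

    def predict_proba(self, metrics, semantic_embedding):
        m = relu(linear(metrics, *self.metric_layer))              # metrics stream
        s = relu(linear(semantic_embedding, *self.semantic_layer)) # semantics stream
        fused = m + s                                              # list concatenation
        logit = linear(fused, *self.head)[0]
        return 1.0 / (1.0 + math.exp(-logit))                      # P(actionable)


# Illustrative inputs only: three normalized metrics and a 5-dim embedding.
model = DualStreamClassifier(n_metrics=3, n_semantic=5)
metrics = [0.8, 0.1, 0.4]
embedding = [0.2, -0.1, 0.7, 0.0, 0.3]
p = model.predict_proba(metrics, embedding)
print(f"untrained probability of 'actionable': {p:.3f}")
```

Because the weights are random and untrained, the printed probability carries no meaning; the sketch only shows how two separately encoded views of the same code fragment can feed a single decision.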

Highlights

• We provide a method to collect actionable code smells automatically.

• We propose a dual-stream model for detecting and identifying actionable code smells.

• We comprehensively evaluate our approach on the publicly available and collected datasets.

• We provide several valuable suggestions for practitioners and a benchmark for identifying actionable code smells.
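The page does not state the rule used to collect actionable smells automatically. One plausible reading of "projects with numerous historical versions", assumed here purely for illustration, is that a smell reported in an earlier version and absent from the latest version was acted on, while a smell that persists was not. The function `label_smells`, the smell identifiers, and the labeling rule itself are hypothetical.

```python
def label_smells(smells_by_version):
    """Label smells from a list of per-version smell sets (oldest first).

    Assumed rule (not taken from the paper): a smell seen in an earlier
    version is 'actionable' if it is gone in the latest version and
    'non-actionable' if it is still present. Smells that first appear in
    the latest version have no history yet and are left unlabeled.
    """
    latest = smells_by_version[-1]
    labels = {}
    for snapshot in smells_by_version[:-1]:
        for smell in snapshot:
            labels[smell] = "non-actionable" if smell in latest else "actionable"
    return labels


# Hypothetical detector output for three successive releases.
history = [
    {"A.foo:LongMethod", "B:GodClass"},
    {"B:GodClass", "C.bar:FeatureEnvy"},
    {"B:GodClass", "D:GodClass"},
]
labels = label_smells(history)
for smell, label in sorted(labels.items()):
    print(smell, "->", label)
```

A smell can also vanish because its code was deleted rather than refactored; this sketch does not distinguish the two cases, which a real collection pipeline would need to handle.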



Published In

Science of Computer Programming Volume 236, Issue C

Sep 2024

300 pages

Elsevier B.V.

Publisher

Elsevier North-Holland, Inc.

United States
