research-article

Actionable code smell identification with fusion learning of metrics and semantics

Published: 18 July 2024

Abstract

Code smell detection is an essential task in software engineering. Whether a code snippet contains a smell is a subjective judgment that varies by programming language, developer, and development method. Moreover, developers tend to focus on code smells that have a real impact on development and ignore insignificant ones. However, existing static code analysis tools and code smell detection approaches exhibit a high false positive rate, so insignificant smells drown out the ones that developers value. Accurately reporting the actionable code smells that developers are willing to spend effort refactoring can therefore keep developers from getting lost in a sea of smells and improve refactoring efficiency. In this paper, we aim to detect actionable code smells, that is, those that developers tend to refactor. Specifically, we first collect actionable and non-actionable code smells from projects with numerous historical versions to construct our datasets. Then, we propose a dual-stream model that fuses learning from code metrics and code semantics to detect actionable code smells. On the one hand, code metrics quantify the code's structure and even some rules or patterns, providing fundamental information for detecting code smells. On the other hand, code semantics carry information about developers' refactoring tendencies, which proves valuable in detecting actionable code smells. Extensive experiments show that our approach detects actionable code smells more accurately than existing approaches.
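
To make the dual-stream idea concrete, the following is a minimal sketch of what fusing a metrics stream with a semantics stream could look like. It is not the architecture from the paper, which this page does not describe in detail. The class name `DualStreamSketch`, the hashed bag-of-tokens encoder, the layer sizes, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random
import zlib


def random_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class DualStreamSketch:
    """Hypothetical two-stream classifier: metrics stream + semantics stream."""

    def __init__(self, n_metrics, vocab_size=64, hidden=8, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        self.vocab_size = vocab_size
        # Stream 1: projects the numeric code metrics into a hidden vector.
        self.w_metrics = random_matrix(hidden, n_metrics, rng)
        # Stream 2: one embedding row per hashed token bucket.
        self.embeddings = random_matrix(vocab_size, hidden, rng)
        # Fusion head: scores the concatenation of both streams.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)]

    def encode_metrics(self, metrics):
        # Assumes metrics are already normalized; tanh keeps outputs bounded.
        return [math.tanh(sum(w * x for w, x in zip(row, metrics)))
                for row in self.w_metrics]

    def encode_tokens(self, tokens):
        # Mean-pool the embeddings of the hashed tokens (zeros if empty).
        if not tokens:
            return [0.0] * self.hidden
        rows = [self.embeddings[zlib.crc32(t.encode("utf-8")) % self.vocab_size]
                for t in tokens]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def fuse(self, metrics, tokens):
        # Fusion by concatenation: metrics features followed by token features.
        return self.encode_metrics(metrics) + self.encode_tokens(tokens)

    def predict(self, metrics, tokens):
        # Sigmoid over a linear score gives an "actionable" probability.
        score = sum(w * x for w, x in zip(self.w_out, self.fuse(metrics, tokens)))
        return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    model = DualStreamSketch(n_metrics=3)
    # Hypothetical normalized metrics and the tokens of a method body.
    print(model.predict([0.8, 0.4, 0.1], ["public", "void", "run", "(", ")"]))
```

In this sketch each stream produces a fixed-size vector and the two are concatenated before classification, mirroring the abstract's description of metrics as structural information and semantics as a signal of refactoring tendencies. A real model would learn the weights from labeled actionable and non-actionable smells.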

Highlights

We provide a method to collect actionable code smells automatically.
We propose a dual-stream model for detecting and identifying actionable code smells.
We comprehensively evaluate our approach on both publicly available datasets and the datasets we collected.
We provide several valuable suggestions for practitioners and a benchmark for identifying actionable code smells.

Published In

Science of Computer Programming  Volume 236, Issue C
Sep 2024
300 pages

Publisher

Elsevier North-Holland, Inc.

United States

Author Tags

  1. Actionable code smell
  2. Design smell
  3. Implementation smell
  4. Code semantics
  5. Code metrics

Qualifiers

  • Research-article
