Applying Machine Learning to Customized Smell Detection: A Multi-Project Study

Authors: Daniel Oliveira, Wesley K. G. Assunção, Leonardo Souza, Willian Oizumi, Alessandro Garcia, Baldoino Fonseca

SBES '20: Proceedings of the XXXIV Brazilian Symposium on Software Engineering

Pages 233 - 242

https://doi.org/10.1145/3422392.3422427

Published: 21 December 2020

Abstract

Code smells are considered symptoms of poor implementation choices, which may hamper software maintainability. Hence, code smells should be detected as early as possible to avoid software quality degradation. Unfortunately, detecting code smells is not a trivial task. Some preliminary studies have concluded that machine learning (ML) techniques are a promising way to better support smell detection. However, these techniques are hard to customize to promote early and accurate detection of specific smell types. Moreover, ML techniques usually require numerous code examples (composing a relevant dataset) for training in order to achieve satisfactory accuracy. Such a dependency on a large validated dataset is impractical and leads to late detection of code smells. Thus, a prevailing challenge is the early, customized detection of code smells given the typically limited training data. In this direction, this paper reports a study in which we collected, from ten active projects, code smells that were actually refactored by developers, unlike studies that rely on code smells inferred by researchers. These smells were used to evaluate the accuracy of seven ML techniques in the early detection of code smells. Because we take into account smells that developers considered important, the ML techniques are able to customize the detection to focus on smells observed as relevant in the investigated systems. The results showed that all the analyzed techniques are sensitive to the type of smell and that they, especially JRip and Random Forest, obtained good results for the majority of smell types. We also observed that the ML techniques did not need a high number of examples to reach their best accuracy results. This finding implies that ML techniques can be successfully used for early detection of smells without depending on the curation of a large dataset.

