Abstract
Software metrics are used to measure different attributes of software. To apply these metrics in practice, metric thresholds are needed. Many researchers have attempted to identify such thresholds based on personal experience. However, the resulting experience-based thresholds cannot be generalized because personal experience varies and opinions are subjective. This paper proposes an automated clustering framework based on the expectation maximization (EM) algorithm, in which clusters are generated using a simplified three-metric set (LOC, LCOM, and CBO). Given these clusters, different threshold levels for software metrics are systematically determined such that each threshold reflects a specific level of software quality. The proposed framework comprises two major steps: a clustering step, in which the historical software quality dataset is decomposed into a fixed set of clusters using the EM algorithm, and a threshold extraction step, in which thresholds specific to each software metric in the resulting clusters are estimated using statistical measures such as the mean (μ) and the standard deviation (σ) of each software metric in each cluster. The findings highlight the capability of EM-based clustering, using a minimal metric set, to group software quality datasets according to different quality levels.
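The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `em_cluster` and `extract_thresholds`, the diagonal-covariance Gaussian mixture, the deterministic initialisation, the toy metric values, and the rule "threshold = μ + σ" are all assumptions made for illustration, since the abstract only says that thresholds are estimated from the mean and standard deviation of each metric in each cluster.

```python
import math
from statistics import mean, pstdev


def log_gauss(x, mu, var):
    """Log-density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def em_cluster(rows, k, iters=50, min_var=1e-6):
    """Fit a diagonal-covariance Gaussian mixture with EM; return hard labels."""
    n, d = len(rows), len(rows[0])
    # Assumed deterministic start: pick k rows spread across the sorted data.
    ordered = sorted(rows)
    means = [list(ordered[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    global_var = [max(pstdev([r[i] for r in rows]) ** 2, min_var) for i in range(d)]
    variances = [list(global_var) for _ in range(k)]
    weights = [1.0 / k] * k

    def e_step():
        # Responsibility of each cluster for each row, computed in log space.
        resp = []
        for r in rows:
            logp = [
                math.log(max(weights[j], 1e-300))
                + sum(log_gauss(r[i], means[j][i], variances[j][i]) for i in range(d))
                for j in range(k)
            ]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        return resp

    for _ in range(iters):
        resp = e_step()
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = max(sum(row[j] for row in resp), 1e-12)
            weights[j] = nj / n
            for i in range(d):
                mu = sum(resp[t][j] * rows[t][i] for t in range(n)) / nj
                var = sum(resp[t][j] * (rows[t][i] - mu) ** 2 for t in range(n)) / nj
                means[j][i] = mu
                variances[j][i] = max(var, min_var)

    final = e_step()
    return [max(range(k), key=lambda j: row[j]) for row in final]


def extract_thresholds(rows, labels, names, n_sigma=1.0):
    """Per cluster and metric, return mu + n_sigma * sigma (an assumed rule)."""
    thresholds = {}
    for label in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == label]
        thresholds[label] = {
            name: mean(col) + n_sigma * pstdev(col)
            for name, col in zip(names, zip(*members))
        }
    return thresholds


# Synthetic (LOC, LCOM, CBO) values per class, made up purely for illustration.
rows = [
    (40, 3, 2), (55, 6, 3), (60, 4, 4), (48, 5, 2), (52, 7, 3),
    (780, 110, 18), (820, 130, 22), (900, 125, 25), (760, 115, 19), (850, 140, 21),
]
labels = em_cluster(rows, k=2)
for cluster, values in extract_thresholds(rows, labels, ("LOC", "LCOM", "CBO")).items():
    print(cluster, values)
```

On this toy data the two groups are well separated, so the clusters are easy to recover. Real metric data is typically heavy-tailed, and both the number of clusters and the way μ and σ are combined into threshold levels would have to follow the full paper rather than this sketch.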
Cite this article
Alqmase, M., Alshayeb, M. & Ghouti, L. Threshold Extraction Framework for Software Metrics. J. Comput. Sci. Technol. 34, 1063–1078 (2019). https://doi.org/10.1007/s11390-019-1960-6
DOI: https://doi.org/10.1007/s11390-019-1960-6