Software Defect Prediction Using an Intelligent Ensemble-Based Model
Corresponding authors: Muhammad Amir Khan (amirkhan@uitm.edu.my) and Tehseen Mazhar (tehseenmazhar719@gmail.com)
This work was supported by Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia, through the Princess Nourah bint
Abdulrahman University Researchers Supporting Project under Grant PNURSP2024R136.
ABSTRACT Software defect prediction plays a crucial role in enhancing software quality while achieving
cost savings in testing. Its primary objective is to identify and send only defective modules to the testing
stage. This research introduces an intelligent ensemble-based software defect prediction model that combines
diverse classifiers. The proposed model employs a two-stage prediction process to detect defective modules.
In the first stage, four supervised machine learning algorithms are employed: Random Forest, Support Vector
Machine, Naïve Bayes, and Artificial Neural Network. These algorithms are optimized through iterative
parameter optimization to achieve the highest accuracy possible. In the second stage, the predictive accuracy
of the individual classifiers is integrated into a voting ensemble to make the final predictions. This ensemble
approach further improves the accuracy and reliability of the defect predictions. Seven historical defect
datasets from the NASA MDP repository, namely CM1, JM1, MC2, MW1, PC1, PC3, and PC4, were
utilized to implement and evaluate the proposed defect prediction system. The results demonstrate that the
proposed intelligent system achieved remarkable accuracy on each dataset, outperforming twenty state-of-the-art
defect prediction techniques, including base classifiers and ensemble methods.
INDEX TERMS Machine learning, software defect prediction, ensemble classification, heterogeneous
classifiers, random forest, support vector machine, naïve Bayes.
product is achieved [6], [7]. The workflow from the development team to the QA team in an SDLC is shown in Figure 1. However, defect-free software is not without its challenges. Three critical factors that prominently influence software quality assurance are time, financial resources, and the availability of skilled manpower. The industry's ever-growing demand necessitates formulating effective testing strategies to optimize these valuable resources while maintaining the highest software quality standards [8].

This is where software defect prediction (SDP) steps into the spotlight. SDP is the art of leveraging historical data and machine learning (ML) techniques to forecast and identify potential defects in software systems before the testing phase [9]. It investigates a complex set of software metrics, such as code complexity, size, and historical defect data, to build models capable of gauging the likelihood of defects [10].

The integration of software defect prediction disrupts the traditional development-to-QA workflow. The feedback loop is altered by proactively identifying potential defects before the testing phase. The predictive insights allow developers to address and rectify potential issues before the software reaches the QA team. This streamlines the process and reduces the traditional back-and-forth between development and QA. This shift promotes a more efficient and cost-effective software development life cycle [11]. A visual representation highlighting the reduced feedback loop when the SDP model is in place is shown in Figure 2.

In the context of software defect prediction, classification techniques take centre stage. They involve categorising data into classes or labels, making them particularly relevant in identifying potential software defects. Various classification techniques include decision trees, logistic regression, support vector machines, etc. [12], [13]. These techniques aid in assessing and addressing software quality concerns proactively. Historically, several studies have utilized the power of classification techniques to enhance the accuracy of defect prediction models. However, prior work in this field has limitations, including challenges related to classification techniques, overfitting, and underfitting [14].

Parallel to classification, Ensemble Modeling (EM) has also emerged as a promising technique combining the predictions from multiple ML models to improve overall performance [15]. Ensemble methods, such as bagging, boosting, stacking, and random forest, contribute to the field by mitigating inherent challenges [15]. By aggregating the predictions of multiple base classifiers, ensemble techniques reduce the risks of overfitting, underfitting, and biases that can affect individual classifiers. They have proven to be valuable in enhancing the accuracy and robustness of defect prediction models by reducing the inherent biases of individual classifiers [16], [17]. Nevertheless, researchers have encountered a standard stumbling block: the inherent susceptibility of ensemble techniques to biases that can influence their efficacy [18], [19]. Figure 3 presents an overview of software defect prediction using ML techniques.

To meet the ever-pressing need for cost-effective, high-quality software, the modern software development paradigm must be equipped with an SDP system that is both effective and efficient. This research unveils an intelligent ensemble-based software defect prediction model that combines the strengths of diverse classifiers and ensemble techniques while optimizing resource utilization cost-effectively. The proposed model, known as the Voting Ensemble-Based Software Defect Prediction model (VESDP), integrates four heterogeneous supervised classifiers: Random Forest (RF), Support Vector Machine (SVM), Naïve Bayes (NB), and Artificial Neural Network (ANN). Through iterative tuning, each classifier is optimized for maximum accuracy, thereby elevating the model's predictive performance. The predictive accuracy of these diverse classifiers is further integrated into a voting ensemble, offering a remedy for the biases often encountered when individual classifiers are relied upon in isolation.

The VESDP model's performance is rigorously evaluated using seven datasets extracted from the NASA MDP repository. In this evaluation, a comprehensive suite of eight performance measures, including Predicted Positive Value (PPV), Predicted Negative Value (PNV), True Positive Rate (TPR), True Negative Rate (TNR), accuracy, Misclassification Rate (MR), False Positive Ratio (FPR), and False Negative Ratio (FNR), is employed to measure the model's efficacy and robustness. The results conclusively demonstrate the higher predictive power of the proposed VESDP model, achieving remarkable accuracy across all datasets and surpassing state-of-the-art techniques in the field.

A. OBJECTIVE OF THE STUDY
This study is aimed at achieving the following research objectives:
1. Create a software defect prediction model that combines diverse classifiers using a voting ensemble technique.
2. Assess the model's accuracy using historical defect datasets and eight performance measures.
3. Compare the model's performance with existing techniques to demonstrate its effectiveness.

B. MOTIVATION OF THE STUDY
The study explores ensemble models in SDP, aiming to enhance predictive performance, stability, and robustness [20]. It investigates the impact of heterogeneous supervised machine learning classifiers on software defect prediction models. Additionally, the research conducts a comparative analysis with twenty state-of-the-art techniques to establish the effectiveness of the proposed framework, validate its novelty, and provide valuable insights for the research community and industry. This approach contributes to progress in the field by serving as a benchmark for evaluating and fostering the development of advanced solutions [21], [22].

C. CONTRIBUTION OF THE STUDY
The contributions of this research can be summarized as follows:
• Novel Ensemble-Based Model: We introduce a pioneering ensemble-based software defect prediction model (VESDP) that proficiently combines classification and ensemble techniques, pushing the boundaries of traditional approaches.
• Thorough Evaluation with Diverse Datasets: We rigorously evaluate the VESDP model by subjecting it to testing with seven carefully selected datasets from the NASA MDP repository, offering a robust foundation for assessment.
• Remarkable Accuracy: Our VESDP model attains exceptional accuracy rates across all datasets, clearly surpassing established state-of-the-art techniques in the field, underscoring its groundbreaking potential.
• Quantified Impact of Ensemble Technique: We quantify the tangible impact of integrating predictive accuracy from diverse classifiers through the voting ensemble technique, revealing its effect through a comprehensive analysis of multiple performance measures.
• Benchmarking Against Modern Methods: We conduct an extensive comparative analysis, comparing the proposed VESDP model against twenty published techniques, highlighting its excellence in the realm of software defect prediction.

D. ORGANIZATION OF THE STUDY
The rest of the paper is organized as follows: Section II thoroughly reviews existing literature, while Section III outlines the proposed VESDP framework, detailing its phases
and activities. Section IV presents results and a comparative analysis of the VESDP framework against state-of-the-art techniques. Section V addresses potential research validity concerns, and Section VI delivers a summary, findings analysis, and suggestions for future research.

II. LITERATURE REVIEW
A. CLASSIFICATION
Authors in [3] developed an intelligent cloud-based SDP system incorporating data fusion and decision-level machine learning fusion techniques. The system integrated the predictive accuracy of three classifiers, namely naïve Bayes (NB), artificial neural network (ANN), and decision tree (DT), using a fuzzy logic-based fusion method. The proposed system, evaluated using NASA datasets, outperformed other techniques and aimed to achieve high-quality software with lower costs.

A thorough comparative analysis of several classifiers was conducted in the context of software defect prediction [23]; the authors analyzed ten machine learning algorithms, including Decision Tree, Naive Bayes, K-Nearest Neighbor, Support Vector Machine, Random Forest, Extra Trees, AdaBoost, Gradient Boosting, Bagging, and Multi-Layer Perceptron. The analysis was performed on benchmark NASA datasets from the PROMISE warehouse, specifically CM1, KC1, KC2, JM1, and PC1. The experimental results showed that the employed algorithms achieved higher average accuracy rates on the PC1 dataset. Among the classifiers, the Random Forest learning models with the PCA approach exhibited boosted average performance across the datasets. Figure 4 illustrates the classification process.

In [24], the authors addressed the challenge of managing a large volume of software defect reports in software
development and maintenance. They introduced a software defect prediction (SDP) model based on LASSO–SVM to improve prediction accuracy. The model combined feature selection using the minimum absolute value compression and selection method with the support vector machine algorithm. This approach significantly enhanced prediction accuracy, with simulation results showing accuracies of 93.25% and 66.67%, a recall rate of 78.04%, and an F-measure of 72.72%. The proposed model outperformed traditional methods in terms of both accuracy and speed.

In [25], researchers presented a cloud-based framework for real-time software defect prediction, comparing four back-propagation training algorithms. Bayesian regularization (BR) emerged as the most effective. The framework also included a fuzzy layer to select the best training function based on performance. Publicly available NASA datasets were used for evaluation, employing various measures. The results showed that BR outperformed other training algorithms and widely used machine-learning techniques.

The researchers in [26] applied machine learning techniques to analyze the performance of different numbers of trees in the RF algorithm in the context of software defect prediction using the RapidMiner machine learning tool. They compared the performance of different numbers of trees in the RF algorithm. The results indicate that increasing the number of trees slightly improves accuracy, with a maximum accuracy of 99.59% and a minimum accuracy of 85.96%. The research also highlighted the effectiveness of the RF algorithm in software defect prediction, particularly when using around a hundred trees.

The authors in [27] evaluated various semi-supervised learning (SSL) techniques, mainly the extended random forest (extRF) technique, for predicting defective modules. The extRF technique extends the random forest approach and employs a weighted mixture of random trees for final predictions. The study concludes that SSL techniques, including active learning, can achieve improved prediction performance compared to traditional machine learning approaches.

Researchers in [28] extensively studied the behaviour of various machine learning classification techniques for software defect prediction, including Naïve Bayes, Multi-Layer Perceptron, Radial Basis Function, Support Vector Machine, K-Nearest Neighbor, kStar, One Rule, PART, Decision Tree, and Random Forest, by employing NASA datasets. The performance of these techniques was evaluated using performance measures such as Precision, Recall, F-measure, accuracy, MCC, and ROC Area. The results indicated that accuracy and ROC may not be effective performance measures due to class imbalance issues, while precision, recall, F-measure, and MCC are more reliable.

The researchers in [29] introduced a novel approach that integrates genetic algorithms (GA) with support vector machine (SVM) classifiers and particle swarm algorithms for software fault prediction. The approach applies to large-scale (NASA MDP) and small-scale (Java open-source projects) datasets. The results demonstrate that integrating GA with SVM and particle swarm algorithms enhances the performance of software fault prediction, addressing limitations observed in previous studies.

Authors in [30] introduced an algorithm based on support vector machines (SVM) and extreme learning machines (ELM) for software reliability prediction. They investigated the factors influencing prediction accuracy, such as using previous failure data and the appropriate type of failure information. They proposed a model for feature selection using ELM and SVM by addressing dataset imbalance issues through the resampling method and applying it to NASA Metrics datasets. Experimental results showed that the ELM-based reliability prediction model achieves higher accuracy, specificity, recall, precision, and F1-measure than SVM.

The authors in [31] conducted research to highlight the use of ML methods, particularly SVM, for building software defect prediction models. Figure 5 illustrates the defect prediction process using different classifiers. They evaluated the performance of SVM with different kernel functions on various datasets collected from software repositories. The researchers conducted 1520 experiments on 38 datasets.
The performance of kernel functions was not significantly affected by the granularity of the data but did show differences in datasets with high imbalance ratios. RBF performed significantly better on highly imbalanced datasets, while other kernel functions, like polynomial and linear, performed well on moderately imbalanced datasets. The authors suggested using SVM with the RBF kernel for defect datasets due to its higher performance than other kernel functions.

To improve software efficiency and reliability, the authors in [32] proposed an efficient and reliable framework for software defect prediction based on naive Bayes and linear regression. The framework consists of three main steps: data preprocessing (noise removal and normalization), feature extraction using correlation-based analysis, and applying machine learning models such as naive Bayes and linear regression. Results show that the proposed framework can reach an accuracy of 98.7% using the naive Bayes algorithm, which significantly reduces maintenance costs, lowers code complexity, and improves software quality by predicting defects early in the development process.

The authors in [33] conducted a study to investigate the impact of automated parameter optimization on defect prediction models using 18 datasets. The authors analyzed the classifiers' efficiency and stability, parameter transferability, computational cost, and ranking of the different classifying methods. The results reflected that automated parameter optimization improved the performance of defect prediction models, increasing the AUC (area under the curve) performance by up to forty percent. It was also observed that optimized classifiers were as stable as those with default settings, except for random forest classifiers. Grid-search optimization had a low computational cost, adding less than 30 minutes of additional computation time. Furthermore, some rarely used techniques, like C5.0, outperform widely used techniques after optimization.
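The idea of automated parameter optimization can be made concrete with a small grid-search sketch. The snippet below is illustrative only: the synthetic data, the random-forest parameter grid, and the AUC scoring are assumptions for demonstration and do not reproduce the experimental setup evaluated in [33].

```python
# Illustrative grid-search parameter optimization (assumed grid and data,
# not the configuration studied in [33]).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for a defect dataset (imbalanced two-class problem).
X, y = make_classification(n_samples=600, n_features=20,
                           weights=[0.85, 0.15], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Candidate settings; GridSearchCV keeps the combination with the best
# cross-validated AUC.
param_grid = {"n_estimators": [50, 100, 200],
              "max_depth": [5, 10, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, scoring="roc_auc", cv=5)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Test AUC with tuned parameters:", round(search.score(X_test, y_test), 3))
```

Grid search simply refits the classifier for every parameter combination and retains the configuration with the best cross-validated score, which is the behaviour the study above quantifies at scale.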
Researchers in [34] addressed the challenges of data imbalance and high dimensionality in defect datasets. They employed several ML algorithms, such as Logistic Regression, Decision Trees, K-nearest neighbour, Support Vector Machines, and Ensemble Learning, along with feature extraction and selection techniques, for classifying software modules as defect-prone or non-defect-prone. They proposed a model that utilized partial least squares (PLS) regression and RFE for dimension reduction and the synthetic minority oversampling technique (SMOTE) to handle imbalanced datasets [35]. The results show that XGBoost and Stacking Ensemble techniques yield the best results, with a defect prediction accuracy above 0.9. Figure 4 represents defect prediction using different classifiers.
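Oversampling with SMOTE, as mentioned above for [35], can be sketched as follows. The use of the third-party imbalanced-learn package and the synthetic imbalanced data are assumptions for illustration rather than a reproduction of that study's pipeline.

```python
# Illustrative SMOTE oversampling of the minority (defective) class.
# Requires the third-party imbalanced-learn package; data are synthetic.
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=15,
                           weights=[0.9, 0.1], random_state=0)
print("Before resampling:", Counter(y))   # strongly imbalanced class counts

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("After resampling: ", Counter(y_res))  # classes balanced via synthetic samples
```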
B. ENSEMBLE LEARNING
The authors in [36] proposed an intelligent system based on feature selection and ensemble machine learning techniques to predict defective modules in the software. A novel metric selection technique was introduced to select the most relevant features, and a three-step nested approach was employed for accurate prediction. The first step involves using a decision tree, support vector machines, and naïve Bayes to detect faulty modules. In the second step, the predictive accuracy of these techniques is integrated using ensemble methods such as bagging, voting, and stacking. Finally, fuzzy logic was applied to further combine the predictive accuracy of the ensemble techniques. The experiments were conducted on a fused software defect dataset combining five NASA datasets, which demonstrated that the proposed system outperforms other advanced techniques, achieving an impressive accuracy rate of 92.08%.

Researchers in [37] developed a business failure prediction model for the US restaurant industry by using a majority voting ensemble method with a decision tree. They used experimental data from 1980 to 2017 and developed three models: an entire period model, an economic downturn model, and an economic expansion model. The models achieved prediction accuracies of 88.02%, 80.81%, and 87.02%, respectively.

Authors in [38] conducted thorough research to study the effectiveness of seven tree-based ensembles, including bagging ensembles like Random Forest and Extra Trees, as well as boosting ensembles like AdaBoost, Gradient Boosting, Hist Gradient Boosting, XGBoost, and CatBoost.
They employed 11 publicly available NASA software defect datasets. The empirical results showed that the tree-based bagging ensembles, particularly random forest and extra trees, outperform the tree-based boosting ensembles.

In [39], a software defect prediction system was introduced by researchers. This system utilized a nested ensemble learning (EL) approach, with a Voting classifier as the main classifier and three base ensemble classifiers: bagging, boosting, and stacking. The accuracy achieved by this framework on two distinct NASA datasets was 83.46% and 79.65%.

A software defect prediction model was proposed in a study conducted by researchers in [40]. This model employed multi-layer feed-forward neural networks in combination with stacking as an ensemble technique. Six different search methods were applied for feature selection to enhance the model's performance, with the multilayer perceptron serving as a subset evaluator. The achieved accuracies on NASA's datasets using the best-first search, greedy stepwise search, and GS methods were 80%, 75%, and 76%, respectively.

Existing studies in software defect prediction have commonly employed individual classifiers, which, despite their utility, may suffer from limitations such as overfitting, lack of robustness, and biases inherent to specific algorithms. Moreover, these standalone classifiers might not capture the diverse patterns present in complex software datasets, leading to suboptimal predictive performance [3], [28]. Although some studies have explored ensemble techniques, most have predominantly focused on homogeneous classifiers within their ensembles [23], [36]. In contrast, the proposed framework introduces a paradigm shift by integrating the predictive accuracy of heterogeneous classifiers (Random Forest (RF), Support Vector Machine (SVM), Naïve Bayes (NB), and Artificial Neural Network (ANN)) through a voting ensemble classification technique. This innovative combination addresses the shortcomings of both individual classifiers and conventional homogeneous ensemble approaches. By leveraging the strengths of diverse classifiers, the proposed model enhances interpretability, generalizability, and predictive accuracy, setting it apart as a more comprehensive and effective solution in software defect prediction. The summary of the literature review is shown in Table 2. It presents the ML techniques proposed for SDP, the source of the datasets employed for research, the specific datasets used, and the performance measures implemented to analyze the results.

III. MATERIALS AND METHODS
The proposed research introduces an intelligent ensemble-based software defect prediction model. This model utilizes heterogeneous supervised ML classifiers for enhanced accuracy. The innovative approach aims to address challenges in predicting software defects efficiently. Individual base classifier outputs are consolidated through a voting ensemble model, strategically leveraging the strengths embedded in the proposed VESDP model. This ensemble approach enhances the predictive power of the model by synthesizing diverse classifier outputs, contributing to a more robust and accurate software defect prediction. An overview of the proposed model is shown in Figure 5. It shows that the proposed VESDP model comprises two layers, i.e., training and testing. There are three stages in the training layer: 1) data preprocessing, 2) base classification, and 3) ensemble classification.

In the training layer, the dataset is preprocessed for classification. The predictive accuracy of the base classifiers is then aggregated in an ensemble classifier, contributing to the development of the proposed ensemble-based model. The testing layer comprises one stage only, namely prediction. This layer involves defect prediction for unseen modules based on the trained model. The experiments have been performed using Python, which streamlined our data analysis process, allowing for efficient preprocessing, advanced statistical analysis, and accurate ML modeling, and enabling us to extract valuable insights from the research data.

The following key steps were executed to identify defective modules in the software:
• In the first step, datasets comprising various software metrics were collected and reused [41].
• In the second step, preprocessing was performed on the datasets, which further included three sub-activities, namely dataset splitting, cleaning, and normalization [42], [43].
• In the third step, the VESDP model was trained based on the diverse combination of base classifiers.
• In the fourth step, the base classifiers were integrated into an ensemble learning technique that aggregated the accuracy of the base classifiers and produced unbiased and more accurate results.
• Finally, in the last step, the preprocessing technique was applied to the new modules, and the dataset was input to the trained model, which predicts the defective modules.
The primary objective of the proposed approach is to predict the defective modules in software. The proposed VESDP model can be expressed by the following mapping:

Y = f(X) + ϵ (1)

where Y represents the prediction made by the model, i.e., whether the module is defective or non-defective, X represents the module passed to the model for prediction purposes, and ϵ accounts for any deviation between the predicted output Y and the actual output. The key objective of this research is to minimize ϵ to make the defect prediction model more reliable and efficient. A software module X consists of multiple attributes and can be represented as

X = {x1, x2, x3, ..., xn} (2)

where x1, x2, x3, ..., xn denote the attributes associated with the software module. This research aims to find this mapping through ensemble-based machine-learning techniques. The graphical representation of the proposed model, with details of each step, is shown in Figure 6. Comprehensive details of each step involved in the training and testing layers are provided in the following sections.

A. DATASET COLLECTION
The first step in the training layer is the collection of historical software defect datasets. The selection of well-established benchmark datasets from NASA's MDP repository, including CM1, JM1, MC2, MW1, PC1, PC3, and PC4, was driven by a commitment to representativeness and comparability with existing research. Including these datasets aligns with industry standards, enhancing the generalizability of our proposed solution [41]. Each dataset corresponds to one software component, and each instance in the dataset represents one software module; a module, in turn, consists of multiple software quality attributes, including LOC_COMMENT, LOC_TOTAL, CALL_PAIRS, HALSTEAD_LENGTH, HALSTEAD_CONSTANT, etc., that have been recorded during the development phase of the SDLC. There are various independent attributes and one dependent attribute in each module. The independent attributes (x1, x2, x3, ..., xn) are used to make predictions, whereas the dependent attribute, also called the target variable, represents whether the module is defective (Y) or non-defective (N).
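As a minimal sketch of this structure, the snippet below loads one dataset and separates the metric attributes from the target variable. The file name CM1.csv and the Defective column name are assumptions about how the MDP data have been exported; actual exports may use different column names.

```python
# Minimal sketch of loading one NASA MDP dataset. The file name "CM1.csv"
# and the target column "Defective" are assumed, not a fixed format.
import pandas as pd

df = pd.read_csv("CM1.csv")

# Independent attributes x1..xn: the recorded software metrics.
X = df.drop(columns=["Defective"])
# Dependent attribute (target): Y for defective, N for non-defective.
y = df["Defective"].map({"Y": 1, "N": 0})

print(X.shape, "modules x metrics")
print(y.value_counts())
```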
B. PREPROCESSING
Pre-processing is the second step of the training layer. This step further involves three sub-activities: 1) dataset splitting, 2) cleaning, and 3) normalization, which enhance the effectiveness of the proposed model. Dataset splitting is the first sub-activity of the pre-processing step. In this step, the used datasets are divided into two groups, training and testing, with a ratio of 70:30, employing the class-based splitting rule [3]. The second sub-activity, cleaning, is pivotal for the model's robustness. Cleaning ensures the accuracy of predictions by removing inconsistent, inaccurate, or irrelevant data points. This step improves the quality and integrity of the data by reducing noise, handling missing values, ensuring consistency, and correcting errors within the dataset [44]. The cleaning activity is performed using the mean imputation method, which replaces the missing values in the dataset, leading to better predictions.

The third sub-activity in the preprocessing step is normalization. Normalization is a widely used technique that scales and standardizes the input attributes of the dataset by equalizing feature scales within a range of 0 to 1 [42]. Normalization not only facilitates convergence but also contributes to the stability and efficiency of the machine-learning model. The process of equalizing feature scales eliminates the dominance of attributes with larger scales, preventing them from disproportionately influencing the model [43]. This, in turn, aids in achieving a balanced and unbiased learning process. Furthermore, normalization is instrumental in handling variations in data distribution, ensuring that the model performs consistently across different datasets [45]. Its role extends beyond numerical stability, encompassing the robustness and generalization capabilities of the proposed VESDP model.
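A minimal version of these three sub-activities could look like the following sketch; the synthetic arrays stand in for a loaded NASA dataset, the stratified split approximates the class-based splitting rule, and the exact cleaning details of the original implementation may differ.

```python
# Sketch of the three preprocessing sub-activities: 70:30 class-based
# (stratified) splitting, cleaning via mean imputation, and 0-1 normalization.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=400, n_features=20, random_state=1)
X[::25, 3] = np.nan  # introduce a few missing values to clean

# 1) Splitting: 70:30 with class proportions preserved (stratified).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=1)

# 2) Cleaning: replace missing values with the training-set mean.
imputer = SimpleImputer(strategy="mean").fit(X_train)
X_train, X_test = imputer.transform(X_train), imputer.transform(X_test)

# 3) Normalization: scale every attribute into the range 0-1.
scaler = MinMaxScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
```

Fitting the imputer and scaler on the training split only, and then applying them to the test split, keeps the test data unseen during training, which matches the two-layer design described above.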
C. CLASSIFICATION
Classification is the third step in the training layer; it differentiates between defective and non-defective modules. It trains the models using labeled data and performs the identification of potentially faulty modules [46], [47]. In this research, four supervised machine learning classifiers of a heterogeneous nature have been implemented, namely Random Forest (RF), Support Vector Machine (SVM), Naïve Bayes (NB), and Artificial Neural Network (ANN).

The selection of RF, SVM, NB, and ANN as our base classifiers is rooted in their diverse and complementary strengths. RF excels in capturing complex relationships within data, which is particularly useful in software defect prediction scenarios [26]. SVM, with its ability to handle non-linear data through kernel functions, offers robust classification capabilities [29]. Naïve Bayes, relying on probabilistic principles, provides simplicity and efficiency in handling large datasets under conditional independence assumptions [48]. Lastly, inspired by neural networks, ANN demonstrates strong pattern recognition, which is essential for intricate software defect patterns [49]. The selection and optimization of these classifiers contribute to a well-rounded and resilient ensemble, enhancing the adaptability and accuracy of our proposed VESDP model. Initially, the model is developed using training data; based on the training results, the classifiers have been optimized iteratively to achieve the highest accuracy.

It has been observed that RF performs best when split quality is measured using the Gini criterion and the depth of the tree is restricted to 10. SVM shows maximum accuracy when the kernel is set to poly and the complexity factor is set to 2. ANN performs best when the hidden layers are set to 2, with 10 neurons in each layer. The rest of the parameters in RF, SVM, and ANN are used with default values. However, parameter tuning is not a primary concern in NB, as it relies on the assumption that features are conditionally independent given the class variable; thus, it has been implemented with default parameters. The classification step ends by producing the optimized versions of all base classifiers.
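These reported settings map onto scikit-learn estimators roughly as shown below. This is a sketch of one plausible reading: interpreting the SVM "complexity factor" as the C parameter and the ANN configuration as two hidden layers of ten neurons are assumptions, and the objects are not the authors' actual implementation [51].

```python
# Base classifiers with the reported settings. Mapping the paper's wording to
# scikit-learn parameters involves assumptions: "complexity factor" is read as
# the SVM C parameter; the ANN uses two hidden layers of 10 neurons each.
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rf  = RandomForestClassifier(criterion="gini", max_depth=10, random_state=0)
svm = SVC(kernel="poly", C=2, probability=True, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(10, 10), max_iter=1000, random_state=0)
nb  = GaussianNB()  # conditional-independence assumption; no tuning required

base_classifiers = {"RF": rf, "SVM": svm, "ANN": ann, "NB": nb}
```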
D. ENSEMBLE MODELING
Ensemble modeling is the fourth step in the training layer. It refers to combining multiple individual models to make more accurate predictions or classifications. It is based on the concept of the ''wisdom of the crowd,'' where combining the predictions of multiple models often leads to better overall performance than relying on a single model [20], [50]. This research employs a voting ensemble model that boosts the accuracy and reliability of predictions [37]. The voting ensemble leverages the unique strengths of each base model, promoting a more robust and reliable prediction system. The diversity inherent in using multiple models mitigates biases and errors that might be present in any single model, fostering a clear understanding of software quality attributes. Furthermore, the ensemble's resilience to outliers and noise in the data adds an extra layer of robustness, ensuring more consistent and accurate predictions. Overfitting, a common challenge in machine learning, is also addressed, as the voting ensemble method naturally reduces the risk of models fitting too closely to specific patterns in the training data.

The ensemble's ability to aggregate predictions leads to an overall improvement in accuracy, making it a valuable asset in the realm of software defect prediction, where precision and reliability are paramount for effective quality assurance in software development [20]. In the proposed model, the predictive accuracy of the four heterogeneous base classifiers, RF, SVM, NB, and ANN, is given as input to the voting ensemble model. The proposed VESDP model exhibits better performance with the voting ensemble as compared to the base classifiers for the datasets used. The complete source code files, along with the datasets employed in the framework, have been uploaded to a GitHub repository [51]. The pseudo-code of the main ensemble classifier is shown in Figure 7.
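A voting ensemble over the four optimized base classifiers can be sketched with scikit-learn's VotingClassifier, as below. The soft-voting choice and the synthetic training data are assumptions made for illustration; the published implementation [51] may combine the base predictions differently.

```python
# Sketch of the voting ensemble over RF, SVM, NB, and ANN. Soft voting
# (averaging predicted probabilities) is an assumption; hard majority voting
# would use voting="hard". Data are synthetic stand-ins for a NASA dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=2)

vesdp = VotingClassifier(
    estimators=[
        ("rf",  RandomForestClassifier(criterion="gini", max_depth=10, random_state=2)),
        ("svm", SVC(kernel="poly", C=2, probability=True, random_state=2)),
        ("nb",  GaussianNB()),
        ("ann", MLPClassifier(hidden_layer_sizes=(10, 10), max_iter=1000, random_state=2)),
    ],
    voting="soft",
)
vesdp.fit(X_train, y_train)
print("Ensemble accuracy:", round(vesdp.score(X_test, y_test), 3))
```

Soft voting averages the predicted class probabilities of the base models, which is why probability=True is enabled on the SVC; replacing voting="soft" with voting="hard" yields plain majority voting.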
E. DEFECT PREDICTIONS USING THE VESDP MODEL
In this research, a voting ensemble-based software defect prediction model is proposed. The proposed model is applied in the testing layer, which comprises only one step: real-time prediction for unseen software modules. In this layer, unlabeled data is passed as input to the function f(X), which attaches labels to the respective software modules. It is observed that the proposed model has a lower error rate ϵ compared to the modern techniques implemented for SDP. The output Y of the function f(X), containing the resultant predictions, is sent back to the development team of the SDLC, which can debug the defective modules before passing them to the testing team, thus saving time and effort for the quality assurance team and making the process more economical for the organization as well.
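In code, the testing layer reduces to one requirement: unseen modules must pass through the same fitted preprocessing objects before the trained ensemble labels them. The sketch below assumes the imputer, scaler, and ensemble objects produced in the earlier snippets; the function name and signature are placeholders rather than part of the published code [51].

```python
# Sketch of the testing layer: unseen modules are cleaned and normalized with
# the transformers fitted in the training layer, then labeled by the ensemble.
import pandas as pd

def predict_unseen_modules(new_modules: pd.DataFrame, imputer, scaler, ensemble):
    """Return 1 (defective) / 0 (non-defective) labels for unseen modules."""
    cleaned = imputer.transform(new_modules)   # same mean imputation as training
    normalized = scaler.transform(cleaned)     # same 0-1 scaling as training
    return ensemble.predict(normalized)        # Y = f(X): final defect labels
```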
F. PERFORMANCE EVALUATION
The following measures were used to evaluate the proposed model:

Predicted positive value (PPV) = αλ / (αλ + βλ) (3)
Predicted negative value (PNV) = αθ / (αθ + βθ) (4)
True positive rate (TPR) = αλ / (αλ + βθ) (5)
True negative rate (TNR) = αθ / (αθ + βλ) (6)
Accuracy = (αλ + αθ) / (αλ + αθ + βλ + βθ) (7)
Misclassification rate (MR) = 1 − Accuracy (8)
False positive rate (FPR) = 1 − TNR (9)
False negative rate (FNR) = 1 − TPR (10)

In the above-mentioned formulas, αλ reflects the defective modules in the software which were correctly predicted as defective by the model; similarly, αθ represents the non-defective modules in the software which were correctly predicted as non-defective by the model. The values βλ and βθ indicate a conflict between the actual and predicted values: βλ shows that the module was non-defective but was predicted as defective; on the other hand, βθ indicates that the module was defective but was predicted as non-defective by the model.
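Expressed in code, the eight measures follow directly from the four confusion-matrix counts. The sketch below assumes that the label 1 encodes a defective module; apart from that encoding assumption, it simply restates Equations (3) to (10).

```python
# The eight measures computed from the confusion-matrix counts, where
# alpha_l = true positives (defective predicted defective), alpha_t = true
# negatives, beta_l = false positives, beta_t = false negatives.
from sklearn.metrics import confusion_matrix

def vesdp_measures(y_true, y_pred):
    alpha_t, beta_l, beta_t, alpha_l = confusion_matrix(
        y_true, y_pred, labels=[0, 1]).ravel()
    ppv = alpha_l / (alpha_l + beta_l)                                  # Eq. (3)
    pnv = alpha_t / (alpha_t + beta_t)                                  # Eq. (4)
    tpr = alpha_l / (alpha_l + beta_t)                                  # Eq. (5)
    tnr = alpha_t / (alpha_t + beta_l)                                  # Eq. (6)
    acc = (alpha_l + alpha_t) / (alpha_l + alpha_t + beta_l + beta_t)   # Eq. (7)
    return {"PPV": ppv, "PNV": pnv, "TPR": tpr, "TNR": tnr,
            "Accuracy": acc, "MR": 1 - acc,                             # Eq. (8)
            "FPR": 1 - tnr, "FNR": 1 - tpr}                             # Eq. (9), (10)

# Tiny usage example with hand-made labels.
print(vesdp_measures([0, 0, 1, 1, 1, 0], [0, 1, 1, 1, 0, 0]))
```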
IV. RESULTS AND DISCUSSION
In this research, an intelligent voting ensemble-based software defect prediction model (VESDP) was implemented. To perform the experiments, seven publicly accessible NASA datasets (CM1, JM1, MC2, MW1, PC1, PC3, and PC4) were extracted from the MDP repository. In the preprocessing step, the datasets were subjected to three sub-activities, namely splitting, cleaning, and normalization. The splitting activity divides the datasets into two sub-sets, namely training and testing.
FIGURE 14. Performance measures on PC4 dataset.

the CM1 dataset, eight on the MW1, JM1, and PC3 datasets, seven on the MC2 and PC1 datasets, and five on the PC4 dataset. It is clear from the results that the integration of the predictive accuracy of heterogeneous classifiers through the voting ensemble improves defect prediction performance.

V. THREATS TO VALIDITY
Validity threats refer to factors or issues that may undermine the accuracy, credibility, or generalizability of the findings in a research paper. These threats can arise at different stages of the research process and can affect the validity of the study's conclusions [66]. Some of the most crucial validity threats are listed below:
A. INTERNAL VALIDITY
This type of validity assesses the adequacy of the chosen prediction techniques for the specific datasets employed in the research or for other datasets used to tackle the same problem [67]. In this research, four supervised classification algorithms, namely RF, SVM, NB, and ANN, were used to implement the proposed VESDP model, based on the diversity in their computation mechanisms and performance. In the future, researchers can implement clustering algorithms along with feature selection techniques to analyze the performance of software defect prediction models.

B. EXTERNAL VALIDITY
This form of validity examines whether the proposed solution is equally effective when applied to other datasets associated with the same problem domain [68]. In this research, seven benchmark datasets, namely CM1, JM1, MC2, MW1, PC1, PC3, and PC4, from NASA's defect repository were employed to implement the proposed VESDP model. Hence, the conclusions of this research cannot be generalized to other defect datasets having different attributes. However, the preprocessing steps, including dataset splitting, cleaning, and normalization, along with the parameter optimization in the classification step, can be implemented by other researchers in their studies.

VI. CONCLUSION
Software defect prediction reduces costs by minimizing the resources dedicated to quality assurance activities during testing. In this research, an intelligent ensemble-based model for software defect prediction was proposed. The model was implemented using benchmark datasets extracted from the NASA defect repository. The proposed model integrated the predictive accuracy of four heterogeneous supervised classifiers using the voting ensemble classification technique. For statistical analysis, eight performance measures were implemented. To prove the effectiveness of the strategy adopted in the proposed model, a comparative analysis was conducted with state-of-the-art techniques. The proposed VESDP model outperformed modern research and proved its efficiency for the software defect prediction process.

VII. LIMITATION OF PROPOSED MODEL
The training data has a significant impact on the effectiveness of any machine-learning model, including ensemble-based models. Disparity, missing data, or noise in the training dataset may have a detrimental effect on the model's ability to predict the future. The requirements, development processes, and code used in software development are continually changing. An ensemble-based model trained on historical data may struggle to adapt when faced with unexpected changes in project dynamics or new development paradigms. The model looks for trends in historical data to produce forecasts. If the current task differs greatly from the projects in the training dataset, the model's performance can suffer.
[4] S. Goyal, ‘‘Heterogeneous stacked ensemble classifier for software [22] A. Iqbal and S. Aftab, ‘‘A classification framework for software defect
defect prediction,’’ in Proc. 6th Int. Conf. Parallel, Distrib. Grid prediction using multi-filter feature selection technique and MLP,’’ Int.
Comput. (PDGC), Waknaghat, India, Nov. 2020, pp. 126–130, doi: J. Mod. Educ. Comput. Sci., vol. 12, no. 1, pp. 18–25, Feb. 2020, doi:
10.1109/PDGC50313.2020.9315754. 10.5815/ijmecs.2020.01.03.
[5] S. Mehta and K. S. Patnaik, ‘‘Stacking based ensemble learning for [23] M. Cetiner and O. K. Sahingoz, ‘‘A comparative analysis for machine
improved software defect prediction,’’ in Proc. 5th Int. Conf. Microelec- learning based software defect prediction systems,’’ in Proc. 11th Int. Conf.
tron., Comput. Commun. Syst., vol. 748, 2021, pp. 167–178. Comput., Commun. Netw. Technol. (ICCCNT), Kharagpur, India, Jul. 2020,
[6] M. Shafiq, F. H. Alghamedy, N. Jamal, T. Kamal, Y. I. Daradkeh, pp. 1–7, doi: 10.1109/ICCCNT49239.2020.9225352.
and M. Shabaz, ‘‘Retracted: Scientific programming using optimized [24] K. Wang, L. Liu, C. Yuan, and Z. Wang, ‘‘Software defect prediction
machine learning techniques for software fault prediction to improve model based on LASSO–SVM,’’ Neural Comput. Appl., vol. 33, no. 14,
software quality,’’ IET Softw., vol. 17, no. 4, pp. 694–704, Jan. 2023, doi: pp. 8249–8259, Jul. 2021, doi: 10.1007/s00521-020-04960-1.
10.1049/sfw2.12091.
[25] M. S. Daoud, S. Aftab, M. Ahmad, M. A. Khan, A. Iqbal, S. Abbas,
[7] Y. Tang, Q. Dai, M. Yang, T. Du, and L. Chen, ‘‘Software defect prediction M. Iqbal, and B. Ihnaini, ‘‘Machine learning empowered software
ensemble learning algorithm based on adaptive variable sparrow search defect prediction system,’’ Intell. Autom. Soft Comput., vol. 31, no. 2,
algorithm,’’ Int. J. Mach. Learn. Cybern., vol. 14, no. 6, pp. 1967–1987, pp. 1287–1300, 2022, doi: 10.32604/iasc.2022.020362.
Jan. 2023, doi: 10.1007/s13042-022-01740-2.
[26] Y. N. Soe, P. I. Santosa, and R. Hartanto, ‘‘Software defect prediction
[8] S. Goyal, ‘‘3PcGE: 3-parent child-based genetic evolution for software
using random forest algorithm,’’ in Proc. 12th South East Asian Technical
defect prediction,’’ Innov. Syst. Softw. Eng., vol. 19, no. 2, pp. 197–216,
Univ. Consortium, Yogyakarta, Indonesia, Mar. 2018, pp. 1–5, doi:
Jun. 2023, doi: 10.1007/s11334-021-00427-1.
10.1109/SEATUC.2018.8788881.
[9] J. Liu, J. Ai, M. Lu, J. Wang, and H. Shi, ‘‘Semantic feature
learning for software defect prediction from source code and external [27] F. H. Alshammari, ‘‘Software defect prediction and analysis using
knowledge,’’ J. Syst. Softw., vol. 204, Oct. 2023, Art. no. 111753, doi: enhanced random forest (extRF) technique: A business process man-
10.1016/j.jss.2023.111753. agement and improvement concept in IoT-based application processing
environment,’’ Mobile Inf. Syst., vol. 2022, pp. 1–11, Sep. 2022, doi:
[10] A. K. Gangwar and S. Kumar, ‘‘Concept drift in software defect prediction:
10.1155/2022/2522202.
A method for detecting and handling the drift,’’ ACM Trans. Internet
Technol., vol. 23, no. 2, pp. 1–28, May 2023, doi: 10.1145/3589342. [28] A. Iqbal, S. Aftab, U. Ali, Z. Nawaz, L. Sana, M. Ahmad, and A. Husen,
[11] M. S. Alkhasawneh, ‘‘Software defect prediction through neural network ‘‘Performance analysis of machine learning techniques on software defect
and feature selections,’’ Appl. Comput. Intell. Soft Comput., vol. 2022, prediction using NASA datasets,’’ Int. J. Adv. Comput. Sci. Appl., vol. 10,
pp. 1–16, Sep. 2022, doi: 10.1155/2022/2581832. no. 5, 2019, doi: 10.14569/IJACSA.2019.0100538.
[12] T. F. Husin and M. R. Pribadi, ‘‘Implementation of LSSVM in [29] H. Alsghaier and M. Akour, ‘‘Software fault prediction using particle
classification of software defect prediction data with feature selection,’’ in swarm algorithm with genetic algorithm and support vector machine
Proc. 9th Int. Conf. Electr. Eng., Comput. Sci. Informat. (EECSI), Jakarta, classifier,’’ Softw., Pract. Exper., vol. 50, no. 4, pp. 407–427, Apr. 2020,
Indonesia, Oct. 2022, pp. 126–131, doi: 10.23919/EECSI56542.2022. doi: 10.1002/spe.2784.
9946611. [30] S. K. Rath, M. Sahu, S. P. Das, S. K. Bisoy, and M. Sain, ‘‘A
[13] J. A. Richards, ‘‘Supervised classification techniques,’’ in Remote Sensing comparative analysis of SVM and ELM classification on software
Digital Image Analysis. Cham, Switzerland: Springer, 2022, pp. 263–367. reliability prediction model,’’ Electronics, vol. 11, no. 17, p. 2707,
[14] B. J. Odejide, A. O. Bajeh, A. O. Balogun, Z. O. Alanamu, K. S. Adewole, Aug. 2022, doi: 10.3390/electronics11172707.
A. G. Akintola, and S. A. Salihu, ‘‘An empirical study on data sampling [31] M. Azzeh, Y. Elsheikh, A. B. Nassif, and L. Angelis, ‘‘Examining the
methods in addressing class imbalance problem in software defect performance of kernel methods for software defect prediction based on
prediction,’’ in Proc. Comput. Sci. Online Conf. Cham, Switzerland: support vector machine,’’ Sci. Comput. Program., vol. 226, Mar. 2023,
Springer, Apr. 2022, pp. 594–610. Art. no. 102916, doi: 10.1016/j.scico.2022.102916.
[15] X. Wu and J. Wang, ‘‘Application of bagging, boosting and stacking [32] A. Rahim, Z. Hayat, M. Abbas, A. Rahim, and M. A. Rahim, ‘‘Software
ensemble and EasyEnsemble methods for landslide susceptibility mapping defect prediction with Naïve Bayes classifier,’’ in Proc. Int. Bhurban
in the three Gorges reservoir area of China,’’ Int. J. Environ. Res. Public Conf. Appl. Sci. Technol. (IBCAST), Islamabad, Pakistan, Jan. 2021,
Health, vol. 20, no. 6, p. 4977, Mar. 2023, doi: 10.3390/ijerph20064977. pp. 293–297, doi: 10.1109/ibcast51254.2021.9393250.
[16] F. Jiang, X. Yu, D. Gong, and J. Du, ‘‘A random approximate reduct- [33] C. Tantithamthavorn, S. McIntosh, A. E. Hassan, and K. Matsumoto,
based ensemble learning approach and its application in software ‘‘The impact of automated parameter optimization on defect prediction
defect prediction,’’ Inf. Sci., vol. 609, pp. 1147–1168, Sep. 2022, doi: models,’’ IEEE Trans. Softw. Eng., vol. 45, no. 7, pp. 683–711, Jul. 2019,
10.1016/j.ins.2022.07.130. doi: 10.1109/TSE.2018.2794977.
[17] H. Chen, X.-Y. Jing, Y. Zhou, B. Li, and B. Xu, ‘‘Aligned metric represen- [34] S. Mehta and K. S. Patnaik, ‘‘Improved prediction of software defects using
tation based balanced multiset ensemble learning for heterogeneous defect ensemble machine learning techniques,’’ Neural Comput. Appl., vol. 33,
prediction,’’ Inf. Softw. Technol., vol. 147, Jul. 2022, Art. no. 106892, doi: no. 16, pp. 10551–10562, Aug. 2021, doi: 10.1007/s00521-021-05811-3.
10.1016/j.infsof.2022.106892. [35] A. Khalid, G. Badshah, N. Ayub, M. Shiraz, and M. Ghouse, ‘‘Software
[18] A. O. Balogun, A. O. Bajeh, V. A. Orie, and A. W. Yusuf-Asaju, ‘‘Software defect prediction analysis using machine learning techniques,’’ Sustain-
defect prediction using ensemble learning: An ANP based evaluation ability, vol. 15, no. 6, p. 5517, Mar. 2023, doi: 10.3390/su15065517.
method,’’ FUOYE J. Eng. Technol., vol. 3, no. 2, pp. 50–55, Sep. 2018,
[36] S. Abbas, S. Aftab, M. A. Khan, T. M. Ghazal, H. A. Hamadi, and
doi: 10.46792/fuoyejet.v3i2.200.
C. Y. Yeun, ‘‘Data and ensemble machine learning fusion based intelligent
[19] A. O. Balogun, F. B. Lafenwa-Balogun, H. A. Mojeed, V. E. Adeyemo, software defect prediction system,’’ Comput., Mater. Continua, vol. 75,
O. N. Akande, A. G. Akintola, A. O. Bajeh, and F. E. Usman-Hamza, no. 3, pp. 6083–6100, 2023, doi: 10.32604/cmc.2023.037933.
‘‘SMOTE-based homogeneous ensemble methods for software defect
[37] S. Y. Kim and A. Upneja, ‘‘Majority voting ensemble with a deci-
prediction,’’ in Computational Science and Its Applications—ICCSA 2020,
sion trees for business failure prediction during economic down-
vol. 12254, O. Gervasi, B. Murgante, S. Misra, C. Garau, I. B. D. Taniar,
turns,’’ J. Innov. Knowl., vol. 6, no. 2, pp. 112–123, Apr. 2021, doi:
B. O. Apduhan, A. M. A. C. Rocha, E. Tarantino, C. M. Torre, and
10.1016/j.jik.2021.01.001.
Y. Karaca, Eds. Cham, Switzerland: Springer, 2020, pp. 615–631.
[20] R. J. Jacob, R. J. Kamat, N. M. Sahithya, S. S. John, and S. P. Shankar, [38] A. Alazba and H. Aljamaan, ‘‘Software defect prediction using stacking
‘‘Voting based ensemble classification for software defect prediction,’’ generalization of optimized tree-based ensembles,’’ Appl. Sci., vol. 12,
in Proc. IEEE Mysore Sub Sect. Int. Conf. (MysuruCon), Hassan, no. 9, p. 4577, Apr. 2022, doi: 10.3390/app12094577.
India, Oct. 2021, pp. 358–365, doi: 10.1109/MysuruCon52639.2021. [39] M. A. Javed. (2021). A Framework for Software Defect Prediction Using
9641713. Nested-Ensemble Learning and Feature Selection Techniques. [Online].
[21] A. Alsaeedi and M. Z. Khan, ‘‘Software defect prediction using Available: https://vspace.vu.edu.pk/detail.aspx?id=592
supervised machine learning and ensemble techniques: A comparative [40] F. Matloob. (2020). Software Defect Prediction Model Using
study,’’ J. Softw. Eng. Appl., vol. 12, no. 5, pp. 85–100, 2019, doi: Multi-Layer Feed Forward Neural Networks. [Online]. Available:
10.4236/jsea.2019.125007. https://vspace.vu.edu.pk/detail.aspx?id=342
[41] M. Shepperd, Q. Song, Z. Sun, and C. Mair, ‘‘Data quality: Some [61] A. Balogun, R. O. Oladele, H. A. Mojeed, and B. Amin-Balogun,
comments on the NASA software defect datasets,’’ IEEE Trans. Softw. ‘‘Performance analysis of selected clustering techniques for software
Eng., vol. 39, no. 9, pp. 1208–1215, Sep. 2013, doi: 10.1109/TSE.2013.11. defects prediction,’’ Afr. J. Comput. ICT, vol. 12, no. 2, pp. 30–42, 2019.
[42] J. Shi, X. Li, L. Li, C. Ouyang, and C. Xu, ‘‘An efficient deep learning- [62] M. A. Javed, ‘‘A framework for software defect prediction using
based troposphere ZTD dataset generation method for massive GNSS nested-ensemble learning and feature selection techniques,’’ M.S. thesis,
CORS stations,’’ IEEE Trans. Geosci. Remote Sens., 2023. Virtual Univ. Pakistan, Lahore, Pakistan, 2021. [Online]. Available:
[43] W. Du, C. Wu, H. Yu, Q. Kong, Y. Xu, and W. Zhang, ‘‘Determination https://vspace.vu.edu.pk/details.aspx?id=592
of multicomponents in Rubi Fructus by near-infrared spectroscopy [63] A. O. Balogun, S. Basri, L. F. Capretz, S. Mahamad, A. A. Imam,
technique,’’ Int. J. Anal. Chem., vol. 2023, pp. 1–9, Nov. 2023, doi: M. A. Almomani, V. E. Adeyemo, A. K. Alazzawi, A. O. Bajeh, and
10.1155/2023/5575944. G. Kumar, ‘‘Software defect prediction using wrapper feature selection
[44] P. Suresh Kumar, H. S. Behera, J. Nayak, and B. Naik, ‘‘Bootstrap based on dynamic re-ranking strategy,’’ Symmetry, vol. 13, no. 11, p. 2166,
aggregation ensemble learning-based reliable approach for software defect Nov. 2021, doi: 10.3390/sym13112166.
prediction by using characterized code feature,’’ Innov. Syst. Softw. Eng., [64] S. Singh and T. U. Haider, ‘‘Selection of best feature reduction
vol. 17, no. 4, pp. 355–379, Dec. 2021, doi: 10.1007/s11334-021-00399-2. method for module-based software defect prediction,’’ J. Phys., Conf.
[45] H. Tong, S. Wang, and G. Li, ‘‘Credibility based imbalance boosting Ser., vol. 2273, no. 1, May 2022, Art. no. 012002, doi: 10.1088/1742-
method for software defect proneness prediction,’’ Appl. Sci., vol. 10, 6596/2273/1/012002.
no. 22, p. 8059, Nov. 2020, doi: 10.3390/app10228059. [65] S. Amin. (2019). Software Defect Prediction via Machine Learning Clas-
sifiers. [Online]. Available: https://vspace.vu.edu.pk/detail.aspx?id=378
[46] H. Alsawalqah, N. Hijazi, M. Eshtay, H. Faris, A. A. Radaideh, I. Aljarah,
[66] F. Yucalar, A. Ozcift, E. Borandag, and D. Kilinc, ‘‘Multiple-classifiers in
and Y. Alshamaileh, ‘‘Software defect prediction using heterogeneous
software quality engineering: Combining predictors to improve software
ensemble classification based on segmented patterns,’’ Appl. Sci., vol. 10,
fault prediction ability,’’ Eng. Sci. Technol., Int. J., vol. 23, no. 4,
no. 5, p. 1745, Mar. 2020, doi: 10.3390/app10051745.
pp. 938–950, Aug. 2020, doi: 10.1016/j.jestch.2019.10.005.
[47] A. Iqbal, S. Aftab, I. Ullah, M. S. Bashir, and M. A. Saeed, ‘‘A feature [67] U. Sharma B and R. Sadam, ‘‘Towards developing and analysing
selection based ensemble classification framework for software defect metric-based software defect severity prediction model,’’ 2022,
prediction,’’ Int. J. Modern Educ. Comput. Sci., vol. 11, no. 9, pp. 54–64, arXiv:2210.04665.
Sep. 2019, doi: 10.5815/ijmecs.2019.09.06. [68] A. Abdu, Z. Zhai, R. Algabri, H. A. Abdo, K. Hamad, and M. A. Al-antari,
[48] F. M. Tua and W. Danar Sunindyo, ‘‘Software defect prediction using ‘‘Deep learning-based software defect prediction via semantic key features
software metrics with Naïve Bayes and rule mining association methods,’’ of source code—Systematic survey,’’ Mathematics, vol. 10, no. 17, p. 3120,
in Proc. 5th Int. Conf. Sci. Technol. (ICST), Yogyakarta, Indonesia, Aug. 2022.
Jul. 2019, pp. 1–5, doi: 10.1109/icst47872.2019.9166448. [69] Z. Xu, J. Liu, X. Luo, Z. Yang, Y. Zhang, P. Yuan, Y. Tang, and T. Zhang,
[49] S. I. Ayon, ‘‘Neural network based software defect prediction using genetic ‘‘Software defect prediction based on kernel PCA and weighted extreme
algorithm and particle swarm optimization,’’ in Proc. 1st Int. Conf. Adv. learning machine,’’ Inf. Softw. Technol., vol. 106, pp. 182–200, Feb. 2019.
Sci., Eng. Robot. Technol. (ICASERT), Dhaka, Bangladesh, May 2019,
pp. 1–4, doi: 10.1109/ICASERT.2019.8934642.
[50] T. Zhang, Y. Yu, X. Mao, Y. Lu, Z. Li, and H. Wang, ‘‘FENSE: A feature-
based ensemble modeling approach to cross-project just-in-time defect
prediction,’’ Empirical Softw. Eng., vol. 27, no. 7, p. 162, Dec. 2022, doi:
10.1007/s10664-022-10185-8.
[51] [Online]. Available: https://github.com/misbah-here/VESDP_Repository
[52] A. O. Balogun, S. Basri, S. A. Jadid, S. Mahamad, M. A. Al-momani,
A. O. Bajeh, and A. K. Alazzawi, ‘‘Search-based wrapper feature selection
methods in software defect prediction: An empirical analysis,’’ in MISBAH ALI received the B.S. degree (Hons.) in
Intelligent Algorithms in Software Engineering, vol. 1224, R. Silhavy, Ed. computer science from the Punjab University Col-
Cham, Switzerland: Springer International Publishing, 2020, pp. 492–503. lege of Information Technology, Lahore, Pakistan,
[53] I. Kaur and A. Kaur, ‘‘Comparative analysis of software fault prediction in 2015. She is currently pursuing the M.S. degree
using various categories of classifiers,’’ Int. J. Syst. Assurance Eng. in computer science with the Virtual University of
Manage., vol. 12, no. 3, pp. 520–535, Jun. 2021, doi: 10.1007/s13198-021-
Pakistan, with a focus on software engineering.
01110-1.
Her research interests include machine learning,
[54] B. Mumtaz, S. Kanwal, S. Alamri, and F. Khan, ‘‘Feature selection using
data mining, and software process improvement.
artificial immune network: An approach for software defect prediction,’’
Intell. Autom. Soft Comput., vol. 29, no. 3, pp. 669–684, 2021, doi:
10.32604/iasc.2021.018405.
[55] S. Goyal, ‘‘Handling class-imbalance with KNN (neighbourhood) under-
sampling for software defect prediction,’’ Artif. Intell. Rev., vol. 55, no. 3,
pp. 2023–2064, Mar. 2022, doi: 10.1007/s10462-021-10044-w.
[56] A. O. Balogun, S. Basri, S. J. Abdulkadir, and A. S. Hashim, ‘‘Performance
analysis of feature selection methods in software defect prediction:
A search method approach,’’ Appl. Sci., vol. 9, no. 13, p. 2764, Jul. 2019.
[57] H. Aljamaan and A. Alazba, ‘‘Software defect prediction using tree-
based ensembles,’’ in Proc. 16th ACM Int. Conf. Predictive Models
Data Anal. Softw. Eng., New York, NY, USA, Nov. 2020, pp. 1–10, doi:
TEHSEEN MAZHAR received the B.Sc. degree
10.1145/3416508.3417114. in computer science from Bahauddin Zakariya
[58] M. Azam, M. Nouman, and A. R. Gill, ‘‘Comparative analysis of University, Multan, Pakistan, the M.Sc. degree in
machine learning technique to improve software defect prediction,’’ computer science from Quaid-i-Azam University,
KIET J. Comput. Inf. Sci., vol. 5, no. 2, pp. 1–11, Jul. 2022, doi: Islamabad, Pakistan, and the M.S. degree (Hons.)
10.51153/kjcis.v5i2.96. in computer science from the Virtual University
[59] S. Goyal and P. K. Bhatia, ‘‘Comparison of machine learning techniques of Pakistan, where he is currently pursuing the
for software quality prediction,’’ Int. J. Knowl. Syst. Sci., vol. 11, no. 2, Ph.D. degree. He is with SED and a Lecturer
pp. 20–40, Apr. 2020, doi: 10.4018/IJKSS.2020040102. with GCUF. He has more than 21 publications in
[60] U. S. Bhutamapuram and R. Sadam, ‘‘With-in-project defect prediction well-reputed journals, such as Electronics (MDPI),
using bootstrap aggregation based diverse ensemble learning technique,’’ Health, Applied Science, Brain Sciences, Symmetry, Future Internet, Peer
J. King Saud Univ.-Comput. Inf. Sci., vol. 34, no. 10, pp. 8675–8691, j, and Computers, Materials & Continua. His research interests include
Nov. 2022, doi: 10.1016/j.jksuci.2021.09.010. machine learning, the Internet of Things, and networks.
YASIR ARIF received the B.S. degree (Hons.) MUHAMMAD AMIR KHAN received the M.Sc.
in computer science from the Global Institute, degree in computer engineering from COMSATS
Lahore, Pakistan, in 2017. His research interests University Islamabad, Abbottabad Campus, and
include machine learning, artificial intelligence, the Ph.D. degree in information technology from
and natural language processing. Universiti Teknologi PETRONAS, Malaysia, with
a focus on cutting-edge research. He is cur-
rently with the Department of Computer Science,
Universiti Teknologi Mara, Malaysia. He is an
accomplished academician and a researcher. As an
Associate Professor, he continues to inspire and
guide the next generation of computer scientists, leaving an indelible mark
on the academic landscape. He laid the groundwork for a future dedicated to
technological advancements with COMSATS University Islamabad. With an
SHAHA AL-OTAIBI (Member, IEEE) received the M.S. degree in computer impressive record of more than 50 research papers published in ISI/Impact
science and the Ph.D. degree in artificial intelligence from King Saud Factor journals and international conferences, he stands as a prominent
University. She is currently an Associate Professor with the Department figure shaping the discourse and advancements within the realm of computer
of Information Systems, College of Computer and Information Sciences, science. His research interests include a broad spectrum, notably focusing on
Princess Nourah bint Abdulrahman University, Saudi Arabia. Her main communication protocols for the Internet of Things (IoT), wireless sensor
research interests include data science, artificial intelligence, machine networks, wireless ad hoc networks, software-defined networks (SDN), and
learning, bio-inspired computing, cybersecurity, and information security. medical imaging. His contributions to these areas underscore his visionary
She is a Senior Fellow of the U.K. Higher Education Academy (SFHEA). approach to technology and his dedication to addressing contemporary
She is a reviewer of some journals and an editorial board member of other challenges.
journals.