ISE 527 IEEE Access LaTeX Template
ABSTRACT Turnover is regarded as a major loss for organizations in terms of cost, time, and effort. Especially when the outflow of talented employees exceeds the organization's replacement capacity, turnover becomes dysfunctional for the organization. Thus, knowledge of the factors that affect employee turnover decisions is invaluable for managers in devising the best workforce planning strategy. This study aims to reveal the most important factors that retain employees in organizations based on feature selection and machine learning; deep learning is then used to validate the results. The results show that the best features obtained from the feature selection phase yield better predictions for the machine learning methods; in particular, when dealing with a relatively small dataset, the XGBT algorithm gives a satisfactory level of prediction.
INDEX TERMS Employee retention, employee turnover prediction, feature selection, machine learning
for decision making
(MCDM) or multiple criteria decision analysis (MCDA) for the analysis of the importance of different attributes in decision making for employee retention using machine learning and deep learning.

Voluntary turnover, marked by the willingness of a knowledgeable and talented employee to quit the organization while he/she is still needed, is a major concern for HR. It is even worse if the employee joins a competitor of the organization, since the ex-employee knows many aspects of his/her former organization and can contribute more in the new place. Retaining talented employees therefore keeps the organization at its best performance, and this can be achieved by recognizing the causes of turnover and implementing an appropriate management approach, which is widely believed to reduce the turnover rate. Accordingly, this study focuses on automated MCDA and machine learning techniques to identify the factors affecting human resource turnover, to rank the different attributes/criteria using machine learning-based feature selection techniques that help managers make wise and strategic decisions, and to validate the results of these methods by applying them to different machine and deep learning techniques.

As such, a number of machine learning techniques for the automated analysis of human resource factor importance, together with cutting-edge machine learning and deep learning techniques, are leveraged to validate the importance of turnover factors in the employee retention decision process.

Random Forest (RF) and its decision-tree variants, Extra Trees (ET) and Gradient Boosting Tree (GBT), are employed for feature importance, while Convolutional Neural Network (CNN), FastONN, XGBT, and LGBM are used for attrition prediction. The experimental results show that the models presented in this paper achieve performance comparable to that reported in the literature. Additionally, our models achieve the most promising results even when using as few as 4 or 5 features, which reduces the complexity of the decision process in HR retention. Furthermore, to the best of our knowledge, there is no prior work on FastONN for employee retention prediction. This paves the way for further research, even though FastONN did not achieve a much better performance score in our experiments.

The remainder of the paper is organized as follows. Section II discusses the related recent literature, and Section III presents the methodology. Section IV describes the datasets used in this work and the preprocessing techniques needed to prepare the data for analysis. Section V covers the experimental setup and the results obtained on every dataset with the different algorithms. In Section VI, we analyze these results and describe the factors that mainly affect the retention decision of employees. Finally, in the last two sections, we report on the problems we faced, suggest recommendations for future improvement, and conclude the work, in Sections VII and VIII, respectively.

II. LITERATURE REVIEW
Human resource management has changed significantly over time, and topics such as artificial intelligence and data mining have gained a lot of attention. Investigations of the elements that drive employee turnover and affect the visible and hidden costs of the organization are scrutinized in [3]; the hiring approach was reported to be improved after evaluating the company's historical data. Another insight from HR analysis comes from [4], which identifies strategies to influence employee performance and decision-making processes in various departments. Since human resource management and related topics form a complex problem containing many constraints of different types, managers usually use multi-criteria decision-making (MCDM) techniques in personnel selection, candidate evaluation, and employee classification. Several MCDM methods have been applied in human resource management, such as the Best-Worst Method (BWM) combined with the technique for ordering priority based on similarity to the ideal solution (TOPSIS) [5], the combination of BWM and the Decision-Making Trial and Evaluation Laboratory (DEMATEL) [6], and the fuzzy variant of the Analytic Hierarchy Process (FAHP) with TOPSIS [7]. [8] enhanced hiring and placement decisions by providing a comprehensive analytics framework that can be used as a decision support tool for HR recruiters in practical contexts. They showed that it is possible to forecast a candidate's performance in a particular job at the pre-hire stage and to use the predictions to create a global optimization model. They used a machine learning (ML) model obtained by applying the Variable-Order Bayesian Network (VOBN) model to the recruitment data. Their findings demonstrated that, despite the inherent trade-off between the two, the proposed framework can produce a balanced recruitment strategy while increasing both diversity and recruitment success rates compared to real hiring decisions. A summary of previous conventional works is provided in Table 1.

The application of machine learning in the field of human resource management has a great impact, helping managers make the right decisions based on the huge volume of data available in the company. This type of data contains many variables and thousands of records, which makes it suitable for analysis by machine learning methods. Several aspects of human resource management benefit from machine learning applications, such as the recruitment process [9], where the application of a contractive autoencoder in a recommendation system helps candidates find a better match between their qualifications and the available job openings. The author overcame the cold-start problem of the recommendation system with the proposed prototype, but the prototype still needs better scalability to analyze more realistic data. Another study, on matching job requirements with applicant resumes, is [35]. The difficulty that companies face today is finding the right candidates with sufficient skills who are willing to work for the company, specifically in manufacturing companies, which need a huge number
approximate, their learning performance differs considerably. Their homogeneous configuration, which is mainly based on the linear neuron model, is the main cause of this. As a result, they may fail entirely when the solution space is highly non-linear and complicated, even though they learn quite well when the solution space is monotonous, straightforward, and linearly separable. It is not surprising that, in many difficult problems, only deep CNNs with massive complexity and depth can achieve the required diversity and learning performance. This is also true for conventional Convolutional Neural Networks (CNNs), which share the same linear neuron model with two additional constraints (local connections and weight sharing). Operational Neural Networks (ONNs), which can be heterogeneous and encapsulate neurons with any set of operators to boost diversity and learn highly complex and multi-modal functions or spaces with minimal network complexity and training data, were proposed by [19] to address this limitation and achieve a more generalized model than convolutional neurons. ONNs, a newly developed type of neural network, offer better performance, efficiency, and scalability compared to Convolutional Neural Networks (CNNs). By including neurons with any combination of operators, ONNs can be heterogeneous, enabling them to learn extremely complex and multimodal functions or spaces with little network complexity and training data.

Operational Neural Networks (ONNs) have only recently been put forth as a potential remedy for the well-known shortcomings and restrictions of conventional CNNs, such as network homogeneity with a single linear neuron model. ONNs are heterogeneous networks with a generalized neuron model. However, because the same set of operators is applied to all neurons in each layer, the operator search approach in ONNs restricts network heterogeneity in addition to being computationally heavy. Furthermore, there is a potential for performance to decrease, since the library of operator sets being used directly influences how effectively ONNs function, particularly when the optimal operator set required for a certain operation is missing from the library [20]. As a result, the newest variation, Self-organized ONNs (Self-ONNs), which adds more non-linearity to the convolutional neuron model, has been shown to perform better than CNNs in a variety of regression tests [21]. The convolution process in CNNs is generalized by an operational neuron following this formula [19]:

\[ x_{ik}^{l} = P_{k}^{l}\Big[\psi\big(w_{ik}^{l}(r,t),\; y_{i}^{l-1}(m+r,\, n+t)\big)\Big]_{(r,t)=(0,0)}^{(K-1,\,K-1)} \]

where ψ(·) and P(·) are called the nodal and pool operators, respectively. Every neuron in a heterogeneous ONN has its own set of ψ and P operators. To identify the best operators (see [19]), Self-ONNs introduced a composite nodal function that is created and updated frequently during backpropagation using a Taylor series-based function approximation [20], [22]. An indefinitely differentiable function
FIGURE 1: High level methodological pipeline. Extra Tree (ET), Extreme Gradient Boosting Tree (XGBT), Light Gradient
Boosting Machine (LGBM), Fast Operational Neural Network (FastONN), Convolutional Neural Network (CNN), Recursive
feature elimination (RFE).
f(x) centered on a point a has the following Taylor series expansion:

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}\,(x-a)^{n} \]

From the above formula, the Q-th order truncated Taylor polynomial approximation has the following closed form:

\[ f(x)^{Q,a} = \sum_{n=0}^{Q} \frac{f^{(n)}(a)}{n!}\,(x-a)^{n} \]

With the aforementioned formulation, any function f(x) may be approximated in the vicinity of the point a. The operational neuron model's main building block extends the Generalized Operational Perceptron (GOP) to convolutional principles. While maintaining the benefits of sparse connectivity and weight sharing, an operational neuron provides the ability to apply nonlinear transformations within local receptive fields without the burden of additional trainable parameters. According to [19], segmentation, denoising, and image-to-image translation are just a few of the challenging learning tasks in which the rich non-linear operators (operator sets) powering ONNs have been shown to outperform CNNs. Given any operator set θ = (φ, ψ, f), an operational neuron with θ alters the operation as shown in Figure 3 and is expressed as:

\[ x(i,j) = \Phi_{u=0}^{m-1}\,\Phi_{v=0}^{n-1}\big(\psi(w(u,v),\, y(i-u,\, j-v))\big) \]

As a new kind of neural network with several advantages over CNNs [19], especially in restoring an image from a noisy one, we tried to implement the 1-dimensional variant of FastONN on the human resource retention problem; this attempt required a detailed process to construct the model. In our architecture, we stacked 2 FastONN layers followed by 2 Linear layers, Dropout of 0.2, and a sigmoid function. A SelfONN layer expects its inputs to be within the range [-1, 1], so a Tanh or Sigmoid activation layer must come before it to achieve that range, especially when working with high values of the q parameter, because the input is raised to the q-th power and, if not bounded within [-1, 1], can explode. In practice, for low q values and shallow networks, a ReLU activation layer can also be used.
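As a concrete illustration, the following is a minimal PyTorch sketch of this construction: a Self-ONN-style 1D layer that applies a shared convolution to the stacked powers x, x^2, ..., x^Q of its pre-squashed input, used in a small attrition network of the shape described above. The class names, channel widths, and the ReLU between the linear layers are our illustrative assumptions, not the official FastONN package API; only the two FastONN layers, the two Linear layers, the Dropout of 0.2, the sigmoid output, the Tanh squashing, and the role of q come from the text.

```python
import torch
import torch.nn as nn

class SelfONN1d(nn.Module):
    """Generalized 1D neuron: sum over q of conv(w_q, x^q), q = 1..Q."""
    def __init__(self, in_ch, out_ch, kernel_size, q=3):
        super().__init__()
        self.q = q
        # One Conv1d over the channel-wise concatenation [x, x^2, ..., x^Q]
        self.conv = nn.Conv1d(in_ch * q, out_ch, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, x):
        # Inputs are assumed pre-squashed to [-1, 1] (Tanh/Sigmoid upstream);
        # otherwise the higher powers can explode for large q.
        powers = torch.cat([x ** (i + 1) for i in range(self.q)], dim=1)
        return self.conv(powers)

class AttritionFastONN(nn.Module):
    def __init__(self, n_features, q=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Tanh(),                       # bound inputs to [-1, 1]
            SelfONN1d(1, 16, 3, q=q),        # first FastONN layer
            nn.Tanh(),
            SelfONN1d(16, 32, 3, q=q),       # second FastONN layer
            nn.Flatten(),
            nn.Linear(32 * n_features, 64),  # first of the two linear layers
            nn.ReLU(),
            nn.Dropout(0.2),                 # Dropout of 0.2, as in the text
            nn.Linear(64, 1),
            nn.Sigmoid(),                    # attrition probability
        )

    def forward(self, x):                    # x: (batch, 1, n_features)
        return self.body(x)
```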
C. EXTREME GRADIENT BOOSTING TREE (XGBT)
XGBT was initially put forward in [24] and is based on the gradient boosting approach. Due to its quicker training, faster convergence, and performance improvement, this ensemble has grown in favor in many fields of machine learning research. In addition to its performance improvement, its use of L1 and L2 regularization prevents XGBT from overfitting. We utilize the default parameters, which include a total of 100 estimators, a maximum tree depth of 3, a minimum sample requirement of 2 for splitting, and a minimum sample requirement of 1 for a leaf.
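For reference, a sketch of this configuration using the scikit-learn-style wrapper of the xgboost package is shown below. Note that the minimum-samples settings quoted above follow scikit-learn naming and have no exact xgboost counterpart; min_child_weight is only a rough analog, and X_train/y_train are placeholders for a preprocessed dataset.

```python
from xgboost import XGBClassifier

xgbt = XGBClassifier(
    n_estimators=100,     # 100 estimators, as stated above
    max_depth=3,          # maximum tree depth of 3
    min_child_weight=1,   # rough analog of the min-samples-per-leaf setting
    reg_alpha=0.0,        # L1 regularization term
    reg_lambda=1.0,       # L2 regularization term
)
xgbt.fit(X_train, y_train)
y_pred = xgbt.predict(X_test)
```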
D. LIGHT GRADIENT BOOSTING MACHINE (LGBM)
LGBM has several advantages over XGBT and was initially suggested by Microsoft [25].
FIGURE 2: The architecture of all models used in this paper. CNN1D stands for one dimensional convolutional layer, MaxP
represents MaxPooling, BN stands for Batch normalization, DO is an abbreviation of Dropout, and FONN1D represents one
dimensional FastONN.
The main difference between XGBT and LGBM lies in the method used to grow the trees. In XGBT, trees grow across the nodes level-by-level, while in LGBM they grow from one node, leaf-wise. Due to its sampling method, GOSS (Gradient-based One-Side Sampling), and its method for reducing the number of effective features, EFB (Exclusive Feature Bundling), LGBM executes more quickly and achieves higher accuracy [25]. Note that we utilize the same default values and parameter settings as for XGBT. Figures 2(c) and (d) show the high-level architectures of XGBT and LGBM, respectively.
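A companion sketch using the lightgbm package follows. Leaf-wise growth is LightGBM's default behavior, and GOSS can be requested explicitly; the exact option name varies across LightGBM versions, so that line is left commented.

```python
from lightgbm import LGBMClassifier

lgbm = LGBMClassifier(
    n_estimators=100,  # same settings as for XGBT
    max_depth=3,
    # boosting_type="goss",  # Gradient-based One-Side Sampling (version-dependent)
)
lgbm.fit(X_train, y_train)
```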
E. RECURSIVE FEATURE ELIMINATION (RFE)
High-quality features are crucial for machine learning models. In addition to lowering model performance, the inclusion of unnecessary or poorly correlated features wastes computational resources, training time, and money. Because of this, choosing highly significant features helps the machine learning model perform better. The training time of a model intended for real-time deployment is an important factor attracting serious attention from business owners and researchers alike in the context of employee retention, where a large volume of data is available owing to modern data acquisition technology. Therefore, prioritizing features or eliminating less beneficial ones is the best acceptable alternative. As such, we deploy the Recursive Feature Elimination (RFE) method in combination with three powerful ensemble estimators, Random Forest, Extra Trees, and Gradient Boosting, one at a time. RFE works as follows: with this feature selection method, we investigate
how features of retention prediction with heterogeneous nature reveal their importance, in a whole-inclusion fashion, using external classifiers. RFE's selection begins with all features and then, as the name suggests, recursively abandons the less important features, selecting a smaller set of important features in each iteration. During training, each feature is assigned an importance level, and the features with the smallest level are removed in each subsequent iteration. In this way, the process continues on the remaining feature set until a pre-defined number of features is obtained [26]. In this method, the number of features to select and the choice of estimator are the two most sensitive parameters. For the purpose of this work, we exploited three different ensembles, which are briefly discussed below; the detailed parameters are discussed in Section VI-B.
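Before detailing the three estimators, the following is a minimal sketch of one such RFE pass with scikit-learn; the choice of estimator, the target feature count, and the assumption that X is a pandas DataFrame are illustrative.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rfe = RFE(
    estimator=RandomForestClassifier(n_estimators=100),
    n_features_to_select=5,  # pre-defined size of the final feature set
    step=1,                  # drop one lowest-ranked feature per iteration
)
rfe.fit(X, y)
selected = X.columns[rfe.support_]  # surviving features
ranking = rfe.ranking_              # 1 = kept; larger = eliminated earlier
```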
1) Random Forest
The bagging approach [27] is used in combination with single-tree predictors in Random Forest. To create a decision tree for each set of samples, RF selects random records from the training set. A majority vote is taken for classification problems, or an average is calculated for regression problems, in order to determine the final output from the individual decision trees' findings.

2) Extra Trees
It is an ensemble approach in which various data subsamples are fitted using randomized decision trees. Accuracy is calculated, and over-learning is avoided, using an averaging technique. In Extra Trees, a split is formed based on a random subset of the features at each node, and the trees are constructed from random subsets of the data drawn without replacement. These two characteristics set it apart from the random forest, which builds its trees from samples drawn with replacement and splits nodes using the optimal (best) split rather than a random split [28].

3) Gradient Boosting Tree
GBT is another ensemble methodology that enhances a decision tree (DT) using a boosting mechanism, with the idea of fusing disparate weak models into a single strong consensus model. Instead of creating a new optimized tree, GBT requires each tree to reduce the error of the preceding tree. The final model combines the results from the previous stages to provide a more powerful learner [29], [28].

IV. DATASETS AND DATA PREPROCESSING
A high-quality dataset and efficient preprocessing are important preliminary steps for almost all machine or deep learning techniques to achieve the best results. In this study, we focus on the IBM Employee Dataset, which has 35 variables and 1470 records [30]. Although this dataset is popular, several different datasets are taken into account in this study; they are summarized in Table 3. As seen in the table, the chosen datasets consist of different numbers of samples and features, making the problem investigation more inclusive. First, we prepare each dataset by removing unnecessary columns and encoding non-numerical columns. Subsequently, if the samples are imbalanced, SMOTE upsampling is applied to make the number of samples for each class almost equal. For example, the IBM Employee dataset comprises 26 numerical variables and 9 categorical variables, which come from 3 different departments of the company: HR, Research & Development, and Sales. From exploratory analysis, an initial observation is that the dataset is imbalanced, with 237 employees with attrition and the other 1233 with no attrition. Therefore, data upsampling is applied; it should be noted
that before applying upsampling, data standardization is applied for data scaling.
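A minimal sketch of this preprocessing order, standardization followed by SMOTE from the imbalanced-learn package, is given below; X and y are placeholders for the encoded features and the attrition label.

```python
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE

X_scaled = StandardScaler().fit_transform(X)  # scale before upsampling
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_scaled, y)
```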
Furthermore, the correlation among the features of a particular dataset was also studied for illustration purposes. For dataset [31], the correlation values between variables were calculated, and the variable most strongly correlated with attrition was found to be the employee's satisfaction level, with a value of -0.388. Another interesting fact comes from [32], where the decision to leave a company has a positive correlation of 0.3 with the composite score the employees received in their last evaluation. The correlation table is given in Fig. 4.
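Such correlations can be read directly off the pandas correlation table; a brief sketch follows, where df and the "left" target column are illustrative placeholders for the loaded dataset and its attrition label.

```python
import pandas as pd

corr = df.corr(numeric_only=True)  # full correlation table (cf. Fig. 4)
print(corr["left"].sort_values())  # e.g. satisfaction level vs. attrition
```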
V. EXPERIMENTS AND RESULTS
All experiments are implemented in Python and PyTorch using the Scikit-learn, Keras, and TensorFlow libraries. All experiments run on (i) a MacBook Pro 2015 with a Core i5 CPU and 8 GB RAM and (ii) a free Google Colaboratory account. For the CNN model, the binary cross-entropy loss function and the Adam optimizer are used, with a batch size of 64 and 2000 epochs, for all experiments. For FastONN, a batch size of 32 and 100 epochs are used for all experiments. The corresponding results are tabulated in Table 6 and discussed in the following sections, which provide the training and testing results of four different models (XGBT, LGBM, CNN, and FastONN) on five different datasets, the metric results, and the number of samples in the training and testing datasets. The learning curves of CNN for each dataset, with all and with best features, are shown in Fig. 5. From the figure it can be observed that the learning curves of Datasets #1 and #3 (with all features) show overfitting. This may be due to the inclusion of less correlated or less representative features in training.
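For the Keras-based CNN, the stated training configuration corresponds to a compile/fit call of the following form; model stands for the 1D CNN of Fig. 2 and is not reproduced here.

```python
model.compile(optimizer="adam",            # Adam optimizer
              loss="binary_crossentropy",  # binary cross-entropy loss
              metrics=["accuracy"])
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    batch_size=64,         # batch size of 64
                    epochs=2000)           # 2000 epochs
```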
A. DATASET #1
The first dataset is a real dataset first shared by Edward Babushkin [33], named the Employee Turnover dataset, which is divided into 903 data points for training and 226 data points for testing.
On this dataset, the XGBT model performs better in all metrics. The learning curves of CNN indicate that selecting the best features gives a better result on the validation dataset. With the full 15 features, the model appears to overfit the training dataset, as seen in Figure 5, where the model performs well on training data but poorly on validation data. This is why XGBT performed better on this dataset: the regularization techniques used in it prevent overfitting and enhance generalization. It includes both L1 and L2 regularization terms in its objective function, which help control the complexity of the model and reduce the impact of individual trees on the final prediction.

B. DATASET #2
The HR Analytics dataset, the second dataset, is composed of 3750 data points for testing and 11249 data points for training.
LGBM gave the better precision on this dataset, while the other three metrics (accuracy, F1-score, and recall), as in Dataset #1, belong to XGBT. This was the case when using all features; however, when the 5 best features were used, XGBT had the better performance in precision as well.

C. DATASET #3
The third dataset, titled the US Firm Employee Turnover dataset (no name given for privacy reasons), is composed of 2385 and 7155 data points for testing and training, respectively.
Here, LGBM worked better when dealing with all 9 features, but CNN performed best on the best 4 features.

D. DATASET #4
The Employee Attrition dataset, the fourth dataset, has 25491 and 4507 data points for training and testing, respectively.
CNN performed best in all metrics except precision, where XGBT performed better. But when the 5 best features were analyzed, XGBT performed best in every metric.

E. DATASET #5
The fifth dataset, the IBM dataset, which is fictional data created by IBM data scientists, contains 1249 and 221 data points for training and testing, respectively.
CNN gave the best recall in the full-feature setting, while LGBM performed better in terms of accuracy, F1-score, and precision. For the 17 best features, XGBT and LGBM each led in two metrics: XGBT did better in accuracy and precision, while LGBM was better in F1-score and recall.

VI. DISCUSSION AND ANALYSIS
A. DECISION SUPPORT FOR ATTRITION CLASSIFICATION MODEL SELECTION
The results of the experiments on various datasets show that XGBT performs well when dealing with HR-related data. From the experiments and datasets used, it is evident that the dimensionality of the dataset has a great effect on the results obtained by the machine learning models. More complex models such as the Convolutional Neural Network and the Self-Operational Neural Network were inferior in terms of accuracy, F1-score, precision, and recall compared to XGBT and LGBM, since the datasets have only 9 to 34 input features (variables).
As an overall dataset-wise comparison, the highest accuracy of 98.87% is achieved by CNN using all features of Dataset #4. The reason is apparent: it has the largest number of samples of all the datasets. It is interesting to mention that even when the features are reduced from 9 to the best 5, the accuracy is 98.27%, which is quite impressive. In terms of F1-score as well, when all features are used, CNN has the highest score of 98.50% on Dataset #4. But when the best features are used, the highest F1-score is achieved by LGBM on the same dataset. This suggests that as few as 5 human resource factors are enough to decide employee attrition in a company. Therefore, the HR manager can take support from such a model in analyzing and planning HR resources for the organization's efficient resource management and growth.
We further analyze the classification and prediction behavior of attrition in the organizations with respect to all machine learning models. For this, F1-score is chosen as the evaluation metric, and the scores are shown in Fig. 7. From the figure, the same performance trend for both scenarios, using all features and using only the best features, can be seen for all machine learning models except FastONN. In more detail, the performance of FastONN is more stable when all features are used than when only the best features are used. For Datasets 1 and 5, the performance of CNN is inferior to LGBM and XGBT due to the lower number of training instances. For the other datasets, the performance of CNN, LGBM, and XGBT is almost the same. Therefore, decision makers and HR managers can take the support of any of the three techniques in their decision making regarding employee attrition classification. Additionally, the performance of CNN and FastONN is more sensitive to irrelevant features in the training process. The highest F1-score, of about 99%, is achieved by CNN on Dataset 4. As a final conclusion in this regard, when there are more than tens of thousands of input samples for training, CNN or FastONN is recommended for employee attrition classification; otherwise, LGBM or XGBT is the suitable choice.
B. DECISION SUPPORT FOR EMPLOYEE RETENTION FACTOR SELECTION
As discussed earlier, manually analyzing which factors deserve the most attention in order to retain highly compatible employees in the organization is laborious and costly. Therefore, we deployed three tree-based ensemble machine learning models for selecting the factors affecting employee retention. We used the RFE approach to experiment with three estimators for human resource feature ranking. Specifically, we applied an enhanced RFE approach with cross-validation (RFECV), in which RF is used as an estimator. It was trained using the stratified K-fold cross-validation method with 10 splits; 100 trees are used for estimation with sample replacement, and Gini impurity is used to determine information value. The second ensemble technique we used is Extra Trees (ET), in which the same parameters as for the Random Forest estimator are used. But in ET, bootstrap is set to false, which means that the subset of samples used to create one tree is not replaced while creating subsequent trees. The third technique we used is Gradient Boosting Tree (GBT). As in the previous methods, this estimator is trained with 100 trees using the stratified 10-fold cross-validation of RFE. The Friedman Mean Square Error (FMSE) is used to calculate the information value. The HR factor importances for each machine learning ensemble are tabulated in Fig. 4. Note that for the implementation of this RFECV we made use of the scikit-learn library.
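A sketch of this RFECV setup is shown below. The estimator settings mirror the text (RF: 100 trees, Gini, bootstrap; ET: bootstrap disabled; GBT: Friedman MSE criterion); the F1 scoring choice is our assumption, as the text does not state the scoring function.

```python
from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier)
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

estimators = {
    "RF": RandomForestClassifier(n_estimators=100, criterion="gini",
                                 bootstrap=True),
    "ET": ExtraTreesClassifier(n_estimators=100, criterion="gini",
                               bootstrap=False),
    "GBT": GradientBoostingClassifier(n_estimators=100,
                                      criterion="friedman_mse"),
}
for name, est in estimators.items():
    selector = RFECV(est, cv=StratifiedKFold(n_splits=10), scoring="f1")
    selector.fit(X, y)
    print(name, "selects", selector.n_features_, "features")
```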
If the results are analyzed, it is observed that the number of
FIGURE 6: Classification results of all models on different datasets using all features
important factors in human resources is not the same for each of the three different techniques. This fact can be exploited by HR managers to adjust their decision boundaries by excluding or including the most important HR factors. For Dataset #1, the numbers of best HR factors selected are 14, 9, and 3 for RF, ET, and GBT, respectively. It is interesting to note that all three techniques rank the same factors as important, showing that the ranking is consistent. From Fig. 4 and the first row of Fig. 8, we can see that all the ML techniques rank 'experience' and 'age' within the three most influential features. In contrast, RF ranks traffic as the sixth most important feature, ET ranks it fourth, and GBT ranks it third. These rankings are validated by training the deep learning models again with the most common features ranked for each dataset, listed in Fig. 4, by the three ensemble machine learning methods. If the attrition classification with all features versus the best 3 features is analyzed based on Table 6, it is seen that there is a significant decrease in the classification scores for CNN, XGBT, and LGBM. But FastONN has the best
F1-score, precision, and recall when trained using only the three best features. This shows the applicability of FastONN to the prediction of employee attrition.
Unlike Dataset #1, for Dataset #2 the number of best features returned by the three estimators is almost the same. Interestingly, all three methods rank the level of job satisfaction first, the last evaluation report second, and so on. It is noteworthy that the importance returned by the estimators is the same for all 5 common features that we chose for model training in employee attrition classification. If the performance of applying all features versus the best features is analyzed (see Table 6), we observe that the scores of all machine learning models are the same. In more detail, the reduction from 9 features to the best 5 features (Fig. 4 and Fig. 8) does not degrade the performance, showing that the manager can ignore the less important features (factors), such as salary, while formulating HR strategy related to employees. This in turn helps HR managers focus on specific factors for efficient decision making. Even with the best 5 features, our XGBT model is able to achieve about 99% in all metrics considered for evaluation.
For Dataset #3, the same performance trend is observed as for Dataset #2. In some cases, the performance increased when the models were trained using the best four features. For example, the accuracy and F1-score of the CNN model for determining whether an employee will stay or leave the company are 87.46% and 84.46%, respectively (Table 6), when the best 4 features (Fig. 4) are used for model training. But these scores are significantly lower when considering all features for CNN (Table 6). For all the other models except FastONN, the scores are almost the same
for both scenarios: training the model with all features and with the best features. It is also observed that FastONN performs better when only the best features are used. According to the results, decision makers can consider job satisfaction and working hours, among other factors, to be emphasized in HR strategic planning to retain the most highly competent HR resources and gain competitive advantage.

In Dataset #4, the numbers of best features returned by RF, ET, and GBT are 9, 7, and 5, respectively, with the rankings of the common features consistent across all three estimators, as in the previous datasets. The 5 best common features are job satisfaction, period of employment in the company, last evaluation report, number of projects worked on, and working hours (Fig. 4). These findings show that considering factors other than these merely wastes time and resources and complicates the decision-making process in HR resource retention. This is further validated by training the models with the best five features: the results in Table 6 show that about 99% accuracy can be achieved even with only these features. This holds for LGBM, XGBT, and CNN, whereas FastONN has lower classification accuracy when it is trained and tested using only the five best features.

For Dataset #5, from the best features (Fig. 4) returned by the different estimators, the 17 best common features are selected, and the models are trained on them to validate the importance of the HR factors. From the scores tabulated in Table 6, it is seen that for XGBT and LGBM the scores are comparable with those obtained by training on all features. For CNN and FastONN, there are significant differences. The reason may be the limited number of input samples available for training, as both CNN and FastONN are data-hungry models. However, the decision maker can exploit this importance ranking to strategize their HR resources efficiently and make quality decisions. For this dataset, the highest F1-score using the best features is 73.20%, achieved by LGBM. Although we obtained the highest accuracy of 89.59% with XGBT, we are reluctant to use accuracy for this dataset, because the dataset is imbalanced and accuracy can be biased toward the majority class.

We applied ensemble methods to filter out unnecessary features in HR retention analysis. This is an important step for HR managers and decision makers alike to reduce the laborious burden and cost of making efficient, high-quality decisions. Our experimental and validation results show that the work presented in this paper reduces the number of features needed for machine learning model training while at the same time attaining the highest performance scores in automated employee retention classification. The importance of all features can be seen in Fig. 8.

VII. LIMITATIONS AND FUTURE WORK
A number of machine learning and deep learning techniques are explored for the classification of employee attrition using different datasets. Although we tried FastONN, its accuracy is poor due to low data complexity and a low number of input instances. This can be mitigated in future work by considering a larger dataset and a fine-tuned FastONN model. Furthermore, the accuracy of the other models may change if they are applied to other datasets with different input features. Therefore, in future work, we can consider different datasets with different input features. Hyperparameter optimization is another potential future direction for this research.

VIII. CONCLUSION
Managers are often very concerned about employee turnover due to its negative impact on company capital and strategic objectives, typically resulting from inadequate recruitment and management practices. Consequently, companies that rely heavily on data can greatly benefit from identifying the factors that contribute to employee turnover. This article employs different techniques, namely feature selection, automated multi-criteria decision analysis, and deep learning, to accurately identify the key drivers of employee turnover. Although each of these methods is highly valuable and can serve as a useful tool for managers, their implementation in businesses has been largely overlooked. In our automated multi-criteria decision analysis, we employed recursive feature elimination (RFE) based on three different techniques, Random Forest, Extra Trees, and Gradient Boosting, to reveal the main factors influencing employee turnover. The results of this method show, for example on Dataset #1, that experience, gender, age, industry, profession, traffic, coach, head gender, way, extraversion, independence, self-control, anxiety, and novator are the most influential characteristics of an employee's retention decision based on the Random Forest method. The results from Extra Trees show that experience, age, industry, traffic, extraversion, independence, self-control, anxiety, and novator are the most significant factors. Gradient Boosting shows that experience, age, and traffic are the most significant factors. We then took the common features from those three methods and found that experience, age, and traffic are features that gave better testing accuracy than incorporating all the features.

In the subsequent stage, various machine learning algorithms are utilized to predict the departure of required human resources. These algorithms, XGBT (Extreme Gradient Boosting Tree), LGBM (Light Gradient Boosting Machine), CNN (Convolutional Neural Network), and FastONN (Fast Operational Neural Network), are applied and then compared as the prediction algorithms used to evaluate the results. Additionally, prediction models that use all features are implemented and compared against each other. The prediction results demonstrate that the XGBT algorithm, in general, outperforms the other techniques. Furthermore, the five different case studies reveal that, among the three methods employed to identify factors affecting employee turnover, total working years and number of hours per month are common factors across those case studies. Consequently, managers should prioritize these two factors to better manage the rate of employee turnover.