Machine Learning Models For Salary Prediction Dataset Using Python

This document discusses machine learning models for salary prediction. It applies logistic regression, random forest, and neural networks to a dataset of over 20,000 salaries in the US. The neural network model achieved the highest accuracy, at 83.2%, while logistic regression had the fastest training time, at 0.363 seconds. Keywords include linear regression, machine learning, neural networks, random forest, salary prediction, and supervised learning.

Uploaded by

Inés Margarita Bravo


2022 International Conference on Electrical and Computing Technologies and Applications (ICECTA)

Machine Learning Models for Salary Prediction Dataset using Python

Reham Kablaoui
Computer Engineering Department
Kuwait University
Khaldiya, Kuwait
reham.kablaoui@ku.edu.kw

Ayed Salman
Computer Engineering Department
Kuwait University
Khaldiya, Kuwait
ayed.salman@ku.edu.kw

978-1-6654-5600-5/22/$31.00 ©2022 IEEE | DOI: 10.1109/ICECTA57148.2022.9990316

Abstract— In today’s world, salary is the primary source of motivation for many regular employees, which makes salary prediction important for both employers and employees: it helps each side estimate the expected salary. Fortunately, technological advancements such as Data Science and Machine Learning (ML) have made salary prediction more realistic. In this paper, we exploit the benefits of data science to collect a dataset of more than 20,000 salaries in the USA. We then apply three supervised ML techniques to the obtained dataset to produce salary predictions. The learning models are logistic regression, random forest, and neural networks. The output of the three models is analyzed and compared, showing that the neural network outperforms the other ML models in accuracy, with an accuracy level of 83.2%, and that logistic regression has the fastest training time, at 0.363s.

Keywords—Linear Regression, Machine Learning, Neural Networks, Random Forest, Salary Prediction, Supervised Learning.

I. INTRODUCTION
For many people, the most common reason for resignation is salary: higher salaries motivate employees to stay longer in a company, while low or unraised ones encourage employees to move to a different company [1]. Usually, specific human traits, educational background, and work experience strongly affect one’s salary. Salary prediction is needed to make people aware of their estimated salary. At the same time, it helps a company recognize what an employee expects from it [1]. Fortunately, the fast-emerging topics of Data Science and Machine Learning have allowed us to find large salary datasets and apply prediction techniques to them.

In many disciplines, data science has increased the discovery of probabilistic outcomes. Data science is an emerging discipline that uses statistical methods and computer science knowledge to produce meaningful predictions and insights in multiple conventional scholarly fields [2]. Another rapidly emerging technology is machine learning, whose many disciplines and mechanisms allow computers to learn without being explicitly programmed. Machine learning consists of two main concepts: supervised learning and unsupervised learning. In supervised learning, the algorithm learns a function that maps inputs to given outputs; that is, the data is labeled. In unsupervised learning, the data is unlabeled, and the algorithm makes the model learn the features on its own [3].

A simple supervised machine learning technique is logistic regression, which predicts the probability of an event occurring. It takes a set of independent inputs and predicts a categorical dependent value [4]. The dependent variable is a binary variable that contains data coded as 0 or 1.

Another simple supervised machine learning algorithm is the Random Forest (RF), which builds decision trees from multiple data samples. It then uses their majority vote for classification and their average for regression [5]. Random forest is more accurate than a traditional decision tree because it reduces overfitting. It can also handle missing values effectively and produce predictions without hyperparameter tuning.

A prominent ML technique is the Neural Network (NN), a supervised machine learning algorithm widely used in many applications. It is a massively parallel distributed processor made up of simple processing units with a natural predisposition for storing experiential knowledge and making it readily available for use [6]. In brief, an NN is a network of multiple connected processors that mimics human behavior. An NN is built from three types of layers: the input layer, the hidden layer, and the output layer.

Other supervised machine learning techniques are the support vector machine (SVM), K-nearest neighbors (KNN), Naïve Bayes, and nearest centroid. SVM and KNN are ML algorithms that solve classification and regression problems. SVM transforms a low-dimensional input space into a higher-dimensional space by means of a kernel [7]. KNN uses proximity to classify a set of individual data points or to anticipate how they will be grouped [8]. Naïve Bayes is a quick and simple machine learning approach for predicting the class of a dataset [9]. Lastly, the nearest centroid method assigns each data point the label of the class whose centroid is closest to it [10].

In this paper, we build different supervised machine learning models on a large dataset to make salary predictions. The paper proposes the implementation of three ML algorithms: logistic regression, random forest, and neural networks. We implement our models using Python and run our algorithms on Google Colab. Then, a complete classification report and the accuracy of each model are reported and compared. The main aim of this paper is to exploit data science and machine learning techniques to predict salaries and then provide feedback on the most suitable ML model for this type of data.

The flow of this paper is as follows: section II includes a complete literature review, and section III has the implementation of the three models. Then, section IV shows the results, and section V concludes this paper.

II. LITERATURE REVIEW
Many researchers are interested in salary prediction, and different researchers have applied different supervised ML techniques to it. In this section, we compare multiple ML models used for salary prediction by conducting a complete literature review. This helps us understand the currently available research and its possible limitations.


Supervised ML techniques work best for labeled data, that is, data in which each input maps to a known output. Salary prediction datasets have been run through many supervised learning algorithms, including regression, random forest, support vector machine, K-nearest neighbors, Naïve Bayes, and nearest centroid.

Salary prediction models that use linear regression were implemented by Lothe et al. [1], on a small dataset and based on only four factors, and by Mukherjee & Satyasaivani [11], based only on the employee’s years of experience. The accuracy of both models is between 96% and 98%. Similarly, a regression ML model was implemented in [12], with salary predictions based only on the employee’s job position. Regression and SVM models were also applied to a huge dataset for salary prediction [13]; the results show an accuracy of 79% for regression and 83% for SVM.

Furthermore, regression, RF, SVM, KNN, and nearest centroid techniques were used to predict salary from different companies’ traits and positions on a huge dataset in the UK. The accuracy of the models is as follows: the regression model is between 73% and 74%, RF is 81%, SVM is 60%, KNN is 81%, and the nearest centroid is 65% [14]. Likewise, Martin et al. [15] apply linear regression, RF, SVM, and KNN models to a dataset of 4000 job posts in Spain, and the models have an accuracy of 58% for linear regression, 84% for RF, 66% for SVM, and 79% for KNN. Another paper used KNN and Naïve Bayes to predict salary, with an accuracy of 75.9% for KNN and 93.3% for Naïve Bayes [16].

From this review, we can say that the models with high accuracy either use a small dataset or base their predictions on a small number of parameters; the accuracy rate is higher because the dataset is simple. Nevertheless, for realistic salary prediction such models will be inaccurate, as many factors affect one’s salary. The dataset should therefore include as many parameters as possible.

III. IMPLEMENTATION
To implement our ML models, we first find the datasets, then apply preprocessing techniques, prepare the training and testing data, and start modeling the data.

A. Dataset
Our dataset comes from two sources: online research and an online survey. First, from research, we found a dataset of more than 20,000 entries on Kaggle on the topic of salary prediction in the USA. Then, we conducted an online survey with questions similar to the ones in the dataset, and we distributed this survey to around 100 people in Kuwait with different backgrounds. We combined all the data into one CSV file to work on it.

The dataset focuses on predicting whether one’s salary is above 50,000 dollars per year by looking at specific traits. The traits are age, work sector, education, marital status, occupation, gender, hours per week, and country. The dataset also has a column for the salary, which shows whether the salary is less than or equal to 50,000 dollars or greater than 50,000 dollars. The dataset sets the cut-off at 50,000 dollars, as this is fairly close to the average household income per year [17].

The dataset described above is tested with three ML algorithms to predict the output and check the accuracy of the different ML models. The models are logistic regression, random forest, and neural network models.

B. Data preprocessing
The dataset obtained is in CSV format, which made it easy to read and clean the data file. To begin with, we dropped all rows containing NaN values, as such values could affect our model’s predictions through missing information. Then, we applied the preprocessing techniques.

Our model’s learning ability is influenced by the quality of the data and the meaningful information it can extract. For this reason, preprocessing is applied. We apply three preprocessing techniques: normalization, one hot encoding, and binary encoding. Each technique is needed to preprocess a different type of data included in our dataset.

1) Normalization: is used for numeric data. In our dataset, this is the age and the number of working hours. In machine learning, normalization transforms numeric data into values within the range of 0 to 1. Normalization is needed to ensure that all numeric data is on the same scale when used for training the model. The normalization formula is shown in equation (1), where x is the original value and x_min and x_max are the minimum and maximum of that column.

\[ x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1} \]

2) One hot encoding: is used for categorical data, which is string data that may take one of more than two options. In our dataset, this is the work class, education, marital status, occupation, and country. One hot encoding formats the data in a way that eases the training process. It produces a binary vector for each categorical variable, with a value of 1 for each row that has the given option and 0 otherwise. We use the Pandas library to convert our columns into one hot encoding.

3) Binary encoding: is used for Boolean data or data that can take one of two options. In our dataset, this is the gender and salary. Binary encoding converts the data so that the new column is a statement: its value is 1 if the statement is true, and 0 otherwise. We use Python functions to convert our columns into binary encoding.

C. Preparing the data
After cleaning the data, dropping unwanted rows, and applying the preprocessing algorithms, our dataset is ready to be used for our proposed learning models. We first use the train-test split mechanism to prepare the data for training and testing.

The train-test split is a mechanism used to evaluate the performance of learning algorithms. It splits the data into two sets: a training set and a test set. The training set contains inputs with their outputs; the model needs it to learn and to generalize the learned concept to other data. The test set is a subset of the data on which the model tries to predict the output from the given input after being trained on the training set.

The Scikit-Learn (Sklearn) library in Python provides a train-test split function, which is used in our code to perform the split. The test size given is 0.2, meaning that 20% of the dataset is reserved for testing.
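The three preprocessing techniques of Section III-B can be illustrated on a toy table. The sketch below is not the authors' code: the column names (`age`, `hours_per_week`, `workclass`, `gender`, `salary`) and the four sample rows are hypothetical, min-max scaling is assumed as the normalization formula, and `pandas.get_dummies` is only one way of producing the one hot columns with Pandas.

```python
import pandas as pd

# Hypothetical toy rows; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, 40, 55, 30],
    "hours_per_week": [20, 40, 60, 40],
    "workclass": ["Private", "State-gov", "Private", "Self-emp"],
    "gender": ["Male", "Female", "Female", "Male"],
    "salary": ["<=50K", ">50K", ">50K", "<=50K"],
})

# Drop rows with missing values before any encoding (none in this toy table).
df = df.dropna()

# 1) Normalization: min-max scaling maps each numeric column into [0, 1].
for col in ["age", "hours_per_week"]:
    col_min, col_max = df[col].min(), df[col].max()
    df[col] = (df[col] - col_min) / (col_max - col_min)

# 2) One hot encoding: one 0/1 column per category of a multi-valued column.
df = pd.get_dummies(df, columns=["workclass"], dtype=int)

# 3) Binary encoding: the new column is a statement that is 1 if true, else 0.
df["is_male"] = (df["gender"] == "Male").astype(int)
df["salary_above_50k"] = (df["salary"] == ">50K").astype(int)
df = df.drop(columns=["gender", "salary"])

print(df)
```

After these steps every column is numeric, which is the form the models in the following sections expect.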

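The split described in Section III-C maps onto Scikit-Learn's `train_test_split`. The sketch below uses a small synthetic feature matrix as a stand-in for the preprocessed dataset (the array shapes and variable names are illustrative), together with the test size of 0.2 and the random state of 0 reported in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed data: 100 rows, 5 features.
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = rng.integers(0, 2, size=100)  # binary target: 1 if salary > 50,000 dollars

# test_size=0.2 holds out 20% of the rows; random_state=0 fixes the shuffle
# so that the same split is produced on every run.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
```

Each confusion matrix reported in Section IV sums to 5,521 test samples, which is consistent with a 20% hold-out of roughly 27,600 rows.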

The random state integer is also specified, as a SEED of 0, to initialize the random number generator.

D. Data modeling
After preparing the data for learning, we designed our proposed models: logistic regression, random forest, and neural network.

1) Logistic regression model: is implemented using the Sklearn library, as it already has a built-in logistic regression model. The model’s implementation is simple, so we only need to call three functions: a logistic regression function, a “fit” function for training (takes the input and output train sets as parameters), and a “predict” function for predictions (takes the input test set as a parameter). The logistic regression function is given stochastic average gradient descent (sag) as its solver; it is a variation of gradient descent and incremental aggregated gradient methods. SAG uses a random sample of previous gradient values. After model prediction, we print the model’s accuracy and the other characteristics provided in the next section.

2) Random forest model: is implemented using the Sklearn library, as it already has a built-in model for building random forest trees. The model’s implementation is simple, so we only need three functions: a random forest classifier, a “fit” function for training (takes the input and output train sets as parameters), and a “predict” function for predictions (takes the input test set as a parameter). We set the number of trees in the random forest classifier to 500. After model prediction, we print the model’s accuracy and the other characteristics provided in the next section.

3) Neural network model: is implemented using TensorFlow and Keras, which are Python libraries for machine learning. To develop a deep learning or neural network model, we use the following functions from TensorFlow and Keras: the sequential function, dense function, dropout function, compile function, fit function, and predict function.

The sequential function creates an empty linear stack of layers; this initiates our model. Then, we call multiple dense layers to fill our model. The dense function creates a fully connected layer of nodes. We provide it with the number of units, the activation function, and an input shape (for input layers only), based on the type of the layer. The activation function determines how input transforms into output. All layers implemented in our code are as follows:

• Input layer is given 20 units for higher accuracy, the activation function is the Rectified Linear Unit (relu), and the input shape is the reshaped input of the dataset.
• Hidden layers are all also given 20 units, and the activation function is relu. Our model consists of three hidden layers, as the data is quite large.
• Output layer contains only 1 unit, and the activation function is sigmoid because the predicted output is a binary class.

Then, the dropout function drops some neurons from the input or hidden layer, which helps to avoid overfitting. The dropout function takes a rate of 0.3; this is the fraction of the input units to be dropped.

Then, we compile the neural network model with a loss function, an optimizer, and metrics using the compile function. The loss function measures the error in the learning process. We set the loss function to binary cross entropy, which is used for binary classification. The optimizer adjusts the weights to reduce the loss computed from the predictions; we assign it to Adam with a learning rate of 0.001. Adam is a stochastic gradient descent method that uses adaptive estimation of first-order and second-order moments. Last, the metrics evaluate the overall performance of the model. We set the metrics to accuracy.

Then, a fit function trains the model for a fixed number of epochs. The fit function takes the input and output train data, epochs, batch size, and validation data. The number of epochs is 10; this is the number of passes over the entire training dataset. The batch size is 10; this is the number of samples per gradient update. The validation data is the test sample data.

Last, a predict function predicts the output for the input test dataset (takes the input test set as a parameter). After model prediction, we print the model’s accuracy and the other characteristics provided in the next section.

IV. RESULTS
The three models are implemented in Google Colab using Python code with the needed libraries: Scikit-Learn, Keras, and TensorFlow. All models are trained and then used to predict. In all of our models, we use the predict function to predict the output for the test inputs. The accuracy, the confusion matrix, and a complete classification report of a model are computed from the result of the predict function and the test sample output.

The classification report function provides a report of the trained model that includes the precision, recall, f1-score, and support of the predicted output. Furthermore, the report provides the macro average and weighted average of those metrics.

The confusion matrix function summarizes the model’s overall performance on the test data. The output of this function is a 2x2 matrix, shown in Table I. It shows the actual result compared to the predicted output.

TABLE I. OUTPUT OF CONFUSION MATRIX.

N = total predictions    Actual: No        Actual: Yes
Predicted: No            True Negative     False Negative
Predicted: Yes           False Positive    True Positive

The accuracy score function provides the accuracy of the predictions of the trained model. Furthermore, we calculate the training time of each model using the time library in Python.

A. Logistic regression results
In the logistic regression model, the classification report is as shown in Fig. 1. In this model, the confusion matrix is [[3838 311] [626 746]], and the accuracy score achieved is around 83.0%. The time taken to train this model is around 0.363s.
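The two Scikit-Learn models of Section III-D and the metric functions described above can be sketched together. This is a minimal illustration under stated assumptions, not the authors' code: the data is synthetic (`make_classification`), `max_iter=1000` is added so that the sag solver converges on it, the random forest's `random_state=0` is added for reproducibility, and `time.perf_counter` stands in for the unspecified timing call. The solver, the 500 trees, the test size, and the split's random state follow the paper. The last lines recompute the accuracy from the logistic regression confusion matrix reported above, read in Scikit-Learn's layout (rows are actual classes, columns are predicted classes); accuracy itself does not depend on that orientation.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for the salary dataset.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    # solver="sag" as in the paper; max_iter=1000 is an assumption.
    "logistic regression": LogisticRegression(solver="sag", max_iter=1000),
    # 500 trees as in the paper; random_state=0 is an assumption.
    "random forest": RandomForestClassifier(n_estimators=500, random_state=0),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)      # train on the input and output train sets
    train_time = time.perf_counter() - start
    y_pred = model.predict(X_test)   # predict the output of the test inputs
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion": confusion_matrix(y_test, y_pred),
        "train_time_s": train_time,
    }
    print(name, results[name]["accuracy"], f"{train_time:.3f}s")
    print(classification_report(y_test, y_pred))

# Accuracy is the diagonal of the confusion matrix divided by its total.
reported = np.array([[3838, 311], [626, 746]])
reported_accuracy = np.trace(reported) / reported.sum()
print(f"{reported_accuracy:.3f}")  # prints 0.830
```

The recomputed value, 4584 correct predictions out of 5521, agrees with the accuracy of around 83.0% stated for logistic regression.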

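The neural network of Section III-D can be sketched with the Keras API bundled with TensorFlow. The layer sizes, activations, dropout rate, loss, optimizer, learning rate, epochs, and batch size follow the paper. The synthetic data, the number of input features, the position of the single dropout layer (after the last hidden layer), and the 0.5 decision threshold are assumptions, since the paper does not state them.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in data: 200 training rows, 50 test rows, 8 features (assumed).
rng = np.random.default_rng(0)
X_train, X_test = rng.random((200, 8)), rng.random((50, 8))
y_train = (X_train.sum(axis=1) > 4).astype("float32")
y_test = (X_test.sum(axis=1) > 4).astype("float32")

# Sequential creates a linear stack of layers; Dense adds fully connected layers.
model = keras.Sequential([
    keras.Input(shape=(X_train.shape[1],)),
    layers.Dense(20, activation="relu"),    # input layer: 20 units, relu
    layers.Dense(20, activation="relu"),    # hidden layer 1
    layers.Dense(20, activation="relu"),    # hidden layer 2
    layers.Dense(20, activation="relu"),    # hidden layer 3
    layers.Dropout(0.3),                    # rate 0.3; placement is an assumption
    layers.Dense(1, activation="sigmoid"),  # output: probability of salary > 50K
])

model.compile(
    loss="binary_crossentropy",
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    metrics=["accuracy"],
)

# 10 epochs, 10 samples per gradient update, test data used for validation.
history = model.fit(
    X_train, y_train,
    epochs=10,
    batch_size=10,
    validation_data=(X_test, y_test),
    verbose=0,
)

# predict returns probabilities; the 0.5 threshold is an assumption.
y_pred = (model.predict(X_test, verbose=0) > 0.5).astype(int).ravel()
print(y_pred[:10])
```

The thresholded `y_pred` can be passed to the same accuracy, confusion matrix, and classification report functions used for the Scikit-Learn models.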

Fig. 1. Logistic regression model classification report.

B. Random forest results
In the random forest model, the classification report is as shown in Fig. 2. In this model, the confusion matrix is [[3693 456] [607 765]], and the accuracy score achieved is around 80.7%. The time taken to train this model is around 8.489s.

Fig. 2. Random forest model classification report.

C. Neural network results
In the neural network model, the classification report is as shown in Fig. 3. In this model, the confusion matrix is [[3730 419] [508 864]], and the accuracy score achieved is around 83.2%. The time taken to train this model is around 82.79s.

Fig. 3. Neural network model classification report.

D. Comparing results of the models
In terms of accuracy, from the three classification reports, confusion matrices, and accuracy scores, we can see that the neural network model has the best accuracy level, precision, recall, and f1-score. Then comes logistic regression, and last is the random forest. In terms of time, the neural network is the slowest, then comes the random forest, and the fastest is logistic regression. Table II summarizes the accuracy level and training time of the three trained models.

TABLE II. COMPARISON OF MODELS RESULTS.

Performance Metrics    Logistic regression    Random forest    Neural network
Accuracy               83.0%                  80.7%            83.2%
Time                   0.363s                 8.489s           82.79s

V. CONCLUSION
In conclusion, we have tested and compared three types of supervised machine learning models: logistic regression, random forest, and neural networks. The models are tested on a salary prediction dataset to see how one’s personal traits and educational background affect their salary. Our dataset’s output is binary: 1 if the salary is above 50,000 dollars per year, and 0 otherwise. From the obtained results, we can say that, on such a dataset, the neural network model is the most accurate, with 83.2% accuracy, but it is the slowest, as it needs 82.79s to train. Random forest is the least accurate, with 80.7% accuracy, and its training time is comparatively low, at 8.489s. Logistic regression falls between the other two models in accuracy, at 83.0%, and it is the fastest, as it needs only 0.363s to train. Therefore, we can conclude that a neural network is best when accuracy is the main factor, and random forest or logistic regression is better when time is the main factor. The three models are supervised machine learning techniques used to train computers to predict the output for a given set of inputs; this makes predictions easier for any application.

The main limitation of this paper is that the obtained result is for only one type of dataset, so results on a dataset with fewer categories may differ slightly from those reported here. However, on a large dataset similar to the one used in this paper, the ML models’ accuracy and time should remain close to those reported here. In the future, we aim to combine different ML models to see how this could affect the overall accuracy of the ML algorithms.

VI. ACKNOWLEDGMENT
We wish to acknowledge the generous financial support from the Kuwait Foundation for the Advancement of Sciences (KFAS) to present this paper at the conference under the Research Capacity Building/Scientific Missions program.

REFERENCES
[1] M. D. Lothe, P. Tiwari, N. Patil, S. Patil, and V. Patil, “Salary Prediction using Machine Learning,” International Journal of Advance Scientific Research and Engineering Trends, vol. 6, issue 5, 2021, pp. 199-202.
[2] L. M. Brodie, “What Is Data Science?” In book: Applied Data Science, 2019, pp. 101-130.
[3] R. Bansal, J. Singh, and R. Kaur, “Machine learning and its applications: A Review,” JASC: Journal of Applied Science and Computations, 2020, pp. 1076-5131.
[4] H. A. Park, “An Introduction to Logistic Regression: From Basic Concepts to Interpretation with Particular Attention to Nursing Domain,” J Korean Acad Nurs, vol. 43, issue 2, 2013, pp. 154-164.
[5] M. Azhari, A. Alaoui, Z. Acharoui, B. Ettaki, and J. Zerouaoui, “Adaptation of the Random Forest Method: Solving the problem of Pulsar Search,” SCA '19: Proceedings of the 4th International Conference on Smart City Applications, 2019, pp. 1-6.
[6] A. Sharkawy, “Principle of Neural Network and Its Main Types: Review,” Journal of Advances in Applied & Computational Mathematics, vol. 7, issue 1, 2020, pp. 8-19.
[7] D. Srivastava, and L. Bhambhu, “Data classification using support vector machine,” Journal of Theoretical and Applied Information Technology, vol. 12, issue 1, 2010, pp. 1-7.
[8] Z. Zhang, “Introduction to machine learning: K-nearest neighbors,” Annals of Translational Medicine, vol. 4, issue 11, 2016, pp. 218-218.
[9] T. N. Viet, and L. M. Hoang, “The Naïve Bayes algorithm for learning data analytics,” Indian Journal of Computer Science and Engineering, vol. 12, issue 4, 2021, pp. 1038-1043.


[10] S. Johri, S. Debnath, A. Mocherla, A. Singh, A. Prakash, J. Kim, and I. Kerenidis, “Nearest Centroid Classification on a Trapped Ion Quantum Computer,” npj Quantum Information, vol. 7, issue 1, 2021.
[11] T. Mukherjee, and B. Satyasaivani, “Employee’s Salary Prediction,” International Journal of Advance Research, Ideas and Innovations in Technology, vol. 8, issue 3, 2022, pp. 356-359.
[12] S. Das, R. Barik, and A. Mukherjee, “Salary Prediction Using Regression Techniques,” SSRN Electronic Journal, 2020.
[13] R. Voleti, and B. Jana, “Predictive Analysis of HR Salary using Machine Learning Techniques,” International Journal of Engineering Research & Technology (IJERT), vol. 10, issue 1, 2022, pp. 34-37.
[14] L. Li, X. Liu, and Y. Zhou, “Prediction of Salary in UK,” Computer Science and Engineering department of UC San Diego, 2018.
[15] I. Martin, A. Mariello, R. Battiti, and J. A. Hernández, “Salary Prediction in the IT Job Market with Few High-Dimensional Samples: A Spanish Case Study,” International Journal of Computational Intelligence Systems, vol. 11, 2018, pp. 1192-1209.
[16] K. Gopal, A. Singh, H. Kumar, and S. Sagar, “Salary Prediction Using Machine Learning,” International Journal of Innovative Research in Technology (IJIRT), vol. 8, issue 1, 2021, pp. 380-383.
[17] U.S. household income distribution 2021. Percentage distribution of household income in the United States in 2021 (in U.S. dollars)* [Graph]. In Statistics.
