9 XI November 2021

https://doi.org/10.22214/ijraset.2021.38786
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.429
Volume 9 Issue XI Nov 2021- Available at www.ijraset.com

Student Academic Performance Prediction under Various Machine Learning Classification Algorithms
M. Nirmala1, T. Seeni Selvi2, V. Saravanan3
1 Department of Computer Applications, Hindusthan College of Engineering and Technology
2 Department of Computer Science, Hindusthan College of Arts and Science
3 Department of Information Technology, Hindusthan College of Arts and Science
Abstract: Data mining in educational systems has grown tremendously and continues to grow. This study takes an academic
standpoint: student performance is evaluated using Scholastic, Demographic and Emotional features. Various machine learning
methodologies are adopted to extract hidden knowledge from the educational data set, which helps identify the features that
most affect student academic performance; knowing these features in turn gives deeper insight into how students perform
academically. The full machine learning workflow, from problem definition to model prediction, is carried out in this study.
A supervised learning methodology is adopted, and various feature engineering methods are applied to make the data suitable
for model training and evaluation. The task is a prediction problem, and the classification algorithms Logistic Regression,
Random Forest, SVM, KNN, XGBOOST and Decision Tree are fitted to the student data.
Keywords: Scholastic, Demographic, Emotional, Logistic Regression, Random Forest, SVM, KNN, XGBOOST, Decision Tree.

I. INTRODUCTION
Machine Learning [1] commonly deals with big data, where the volume of data is massive and the data may be structured or
unstructured. It gives computers the ability to learn from data and make sensible decisions. The main focus of this research
is to carry out the Machine Learning approach step by step, from problem definition to prediction. The educational sector is
a domain in which a very large amount of data is generated every day. The existing data and the data yet to be received, if
analysed properly, can bring tremendous changes to the scholastic field. Machine Learning techniques can analyse this data and
support improvements in the scholastic performance of students. Other features, such as demographic and behavioural ones, can
also affect the academic performance of students.

II. LITERATURE SURVEY / RELATED WORK

Numerous data mining tasks [2] were used to create qualitative predictive models to predict students’ grades from a collected
training dataset. First, a survey targeting university students collected personal, social and academic data about them.
Second, the collected data were pre-processed to make them suitable for data mining tasks. Third, classification models were
tested on the pre-processed data. On the whole, that study encouraged universities to run data mining tasks on their students’
data regularly to find interesting results and patterns that can help both the university and the students in many ways. In
similar research on Educational Data Mining [3], students’ performance was predicted from academic records and forum
participation. Data from two undergraduate courses were collected. Three classification models, Naive Bayes, Neural Networks
and Decision Trees, were used to predict student performance. The results show that the Naive Bayes model performed better
than the other two models.
Another comparative study was done by [4]. They compared six algorithms: J48, Random Forest, Naive Bayes, Naive Bayes
Multinomial, K-Star and IBK. The data set contained 480 records, and the Weka tool was used for implementation. The study was
based on seven attributes and found that the Random Forest algorithm was more accurate than the other algorithms.
In [5], a survey was conducted over 200 college students, and classification algorithms were applied to the student dataset to
predict students’ learning behaviour. Slow learners were identified so that corrective action could be taken to reduce the
failure count and support weaker students. The J48, Naive Bayes and Random Forest algorithms were compared. The researchers
obtained good accuracy with the Random Forest algorithm when the data set was very large.

©IJRASET: All Rights are Reserved 221


The study of students’ educational behaviour in [6] proposed a framework that introduces a category of features called
“Behavioral features”, focusing on students’ behavioural features and their relationship with academic success. The same
framework was used to examine students’ progress with ensemble techniques, which enhanced the overall accuracy of the results.
A classification task on a student database to predict academic performance was carried out by [7], using Bayesian Network
classifiers. Information such as previous semester marks, internal assessment marks, performance during seminars, assignments,
attendance and co-curricular activities was collected to predict end-semester marks. That study aims to help students improve
their performance: students who require special attention can be identified effectively, and the failure rate can be decreased
considerably.
A study of student performance was done by [8]. The sample contained 300 students, of whom 225 were male and 75 female. It
found that the performance of students in class is affected by parameters such as attendance, hours spent in class, family
income, and the mother’s age and education.
Educational Data Mining was described by [9] as an emerging research area that deals with computational methods for exploring
educational data. The review also explains the types of educational environments, educational data and the different groups of
people in the education field. It helps to explore educational phenomena better and to gain enhanced insight into them, and it
also describes the state of the art in the EDM field.

III. RESEARCH METHODOLOGY

This section describes the methods adopted during the research process. This is a descriptive research problem in which a
student data set is explored. It predicts the academic performance of students of an educational body by applying various
Machine Learning methodologies.

A. Research Data
The data, collected from a secondary data source, are described in Table 1.
Table 1 : Data Source Details
Data source                xAPI-Edu-Data.csv
Dataset characteristics    Multivariate
Number of Instances        480
Number of Attributes       17
Attribute Type             Categorical and Numerical
Dataset Owner              Ibrahim Aljarah, Professor (Assistant) at The University of Jordan; Fargo, North Dakota, United States
Link                       https://www.kaggle.com/aljarah/xAPI-Edu-Data/metadata

B. Proposed System Method Of Analysis

The proposed system predicts the academic performance of students using the features listed in Table 2, which are classified
as Demographic, Scholastic and Emotional.
Table 2 : Students Features
Demographic Features       Scholastic Features                          Emotional Features
(Related to Population)
gender                     Educational Stages                           Raised Hands
Nationality                Grade Levels                                 Visited Resources
Place of Birth             Section ID                                   Viewing Announcements
                           Semester                                     Discussion Groups
                           Topic                                        Parents Answering Survey
                           Parents responsible for student              Student Absence Days
                           Class (L,M,H) based on the total grade       Parents School Satisfaction
                           marks classified into 3 classes


The Machine Learning workflow consists of a series of steps, from problem definition to model prediction. The steps to be
followed before fitting the model are shown in Figure 1.

Figure 1 : Machine Learning Process Pipeline

C. Machine Learning Pipeline

Machine learning is adopted for problems where traditional programming is impractical, where the system itself needs to learn
the solution from data rather than having a programmer specify it, and where the data is very large.
Steps to be followed for the Machine Learning process:

Define Problem: Be clear about what the model is expected to do. Ensure that all the inputs are available during prediction.
In this system, the academic performance of students needs to be predicted from various features.

Collect Data: The data is collected from the xAPI-Edu-Data.csv data repository. It contains 480 rows and 17 columns, with both
categorical and numerical data. The data collected is in the format shown in Figure 2.

Figure 2 : Data Format for Supervised Learning

Table 3 : Students Features and its Descriptions

Feature                      Datatype      Description
gender                       Categorical   Male or Female
NationalITy                  Categorical   Student nationality
PlaceofBirth                 Categorical   Place of birth of the student
StageID                      Categorical   Educational stage: Primary, Middle or High School
GradeID                      Categorical   Grade category, from G-01 to G-12
SectionID                    Categorical   Classroom section: A, B or C
Topic                        Categorical   Course topic, such as Math, Quran etc.
Semester                     Categorical   First semester or Second semester
Relation                     Categorical   Parent responsible for the student: Father or Mum
raisedhands                  Numerical     Number of times the student raised a hand in class
VisiTedResources             Numerical     Number of times the student visited the course content
AnnouncementsView            Numerical     Number of times the student checked new announcements
Discussion                   Numerical     Number of times the student participated in discussion groups
ParentsAnsweringSurvey       Categorical   Whether the parent answered the survey provided by the school
ParentsschoolSatisfaction    Categorical   Degree of parent satisfaction with the school
StudentAbsenceDays           Categorical   Nominal: above 7 or under 7
Class                        Categorical   Based on the total grade / marks: Low Level, Middle Level or High Level
Exploratory Data Analysis (EDA) is an approach to data analysis that employs a variety of techniques (both graphical and
quantitative) to better understand data. The data set contains 4 numerical columns and 13 categorical columns; each feature,
its datatype and its description are given in Table 3.


D. Exploratory Data Analysis

1) Univariate Analysis – Individual Features / Variables

Analyze Data: Identify the null values present in each column; the analysis shows that the given data set contains no null
values. Data visualization is the graphical representation of data in the form of charts, diagrams etc. Visualization helps to
understand the data much more quickly than quantitative methods, and two kinds of visual analysis are performed:
UNIVARIATE ANALYSIS – Individual Features / Variables
BIVARIATE ANALYSIS – Relationship of a feature with Target Variable
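The null-value check described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the three-row `SAMPLE` text and the helper `count_missing` are hypothetical, and one cell is deliberately left blank so that the check has something to find (the paper reports no nulls in the real file). With pandas, the usual equivalent is `df.isnull().sum()`.

```python
import csv
import io

# Hypothetical three-row excerpt; the real data set is xAPI-Edu-Data.csv.
# The raisedhands cell of the second row is left blank on purpose.
SAMPLE = """gender,SectionID,raisedhands,Class
M,A,15,M
F,B,,H
M,C,70,L
"""


def count_missing(csv_text):
    """Return {column name: number of empty cells} for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            # DictReader yields None for cells missing at the end of a row.
            if value is None or value.strip() == "":
                missing[name] += 1
    return missing


missing_counts = count_missing(SAMPLE)
print(missing_counts)
```

A column whose count is zero needs no imputation; any non-zero count would have to be handled before encoding and model fitting.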

Univariate analysis examines a single variable at a time; it does not infer any relationship with other variables. In general,
a count plot can be used for this analysis. It portrays the data and its patterns so that the user gains better insight into
the single variable, and the graphical representation shows maximum, minimum and mean values etc. The univariate analysis and
its visual inferences are described in the charts below.
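The numbers behind a count plot are simply per-category counts and their shares. A minimal sketch, assuming a hypothetical ten-value excerpt of the SectionID column (the helper name `value_shares` is also an assumption):

```python
from collections import Counter


def value_shares(values):
    """Return {category: (count, percentage)}, most frequent category first."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in counts.most_common()
    }


# Hypothetical excerpt of the SectionID column, not the real 480 rows.
section_ids = ["A", "A", "B", "A", "C", "B", "A", "A", "B", "A"]
section_shares = value_shares(section_ids)
print(section_shares)
```

Applied to a full column, the same function would give the kind of percentages listed in the univariate report.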

Figure 3 : Univariate Analysis – gender
Figure 4 : Univariate Analysis – Stage ID
Figure 5 : Univariate Analysis – PlaceofBirth
Figure 6 : Univariate Analysis – Nationality
Figure 7 : Univariate Analysis – Class
Figure 8 : Univariate Analysis – Grade ID
Figure 9 : Univariate Analysis – Section ID
Figure 10 : Univariate Analysis – Topic
Figure 11 : Univariate Analysis – Semester
Figure 12 : Univariate Analysis – Relation
Figure 13 : Univariate Analysis – ParentAnsweringSurvey
Figure 14 : Univariate Analysis – ParentschoolSatisfaction
Figure 15 : Univariate Analysis – StudentAbsenceDays

2) Univariate Analysis – Report

Gender: Male is 63.5% and Female is 36.4%. The gender feature shows that the majority of students in the data set are male.
Nationality: Under the Nationality feature, KW has 37.3% and Jordan has 35.8%; Venezuela has the smallest share at 0.2%.
PlaceofBirth: The percentage distributions of Nationality and PlaceofBirth are almost the same, so as per the analysis either
one of the two columns could be dropped.
StageID: Of the total, 51.7% of students are in MiddleSchool, 41.5% are in Lowerlevel and only 6.9% are in High School.
GradeID: Of the total, G-02 is 30.6%, G-08 is 24.2%, G-07 is 21%, G-04 is 10%, G-06 is 6.7%, G-11 is 2.7%, G-12 is 2.3%,
G-09 is 1.04%, G-10 is 0.83% and G-05 is 0.63%.
SectionID: Of the total, 59% are studying in Section A, 34.8% in Section B and 6.25% in Section C.
Topic: Of the total students, the topic is IT for 19.8%, French for 13.5%, Arabic for 12.3%, Science for 10.6%, English for
9.8%, Biology for 6.25%, Spanish for 5.2%, Geology and Chemistry for 5% each, Quran for 4.58%, Mathematics for 4.37% and
History for 3.95%.
Semester: 51% of students are in the First Semester and 48.95% are in the Second Semester.
Relation: The parent responsible for the student can be either Father or Mum. Of the total, 58.9% is Father and 41.04% is Mum.
ParentAnsweringSurvey: Parents answering the survey towards school improvement is an important factor; 56.25% answered ‘YES’
and 43.75% answered ‘NO’.
ParentschoolSatisfaction: This is also an important factor, as it helps to identify whether the student will continue in the
same school or not. Of the total, 61% of opinions towards the school were Good and the remaining 39% were Bad.
StudentAbsenceDays: Of the total, 60% of students are regular and 40% have taken more than 7 days of leave. Female students
have better attendance than male students.
StudentAbsenceDays with respect to gender:
StudentAbsenceDays / Gender    Male    Female
Under 7                        160     129
Above 7                        145     46
Class: Of the total, the Low Level score is attained by 26.5%, the Medium Level score by 44% and the High Level score by 30%
of students.
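The StudentAbsenceDays and gender counts in the report above are enough to re-derive several of the reported percentages. The sketch below copies only those four counts from the report; the variable names are illustrative assumptions.

```python
# Counts copied from the StudentAbsenceDays / gender cross-tabulation above.
absence_by_gender = {
    "Under 7": {"Male": 160, "Female": 129},
    "Above 7": {"Male": 145, "Female": 46},
}

total = sum(sum(row.values()) for row in absence_by_gender.values())

# Column totals give the gender split; row totals give the absence split.
gender_totals = {
    g: sum(row[g] for row in absence_by_gender.values())
    for g in ("Male", "Female")
}
gender_pct = {g: round(100 * n / total, 1) for g, n in gender_totals.items()}
absence_pct = {
    level: round(100 * sum(row.values()) / total, 1)
    for level, row in absence_by_gender.items()
}

# Share of each gender that was absent for more than seven days.
above7_rate = {
    g: round(100 * absence_by_gender["Above 7"][g] / gender_totals[g], 1)
    for g in gender_totals
}

print(total, gender_pct, absence_pct, above7_rate)
```

The four counts sum to the 480 instances of Table 1, and the marginals agree with the reported gender and absence splits to within rounding. The last dictionary backs the attendance remark: a smaller share of female students than of male students falls in the Above 7 group.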


3) Bivariate Analysis – Relationship of a Feature with Target Variable

Bivariate analysis is performed to find the association between each variable in the data set and the target variable (Class
in this system). It checks for an association and its strength, and whether there are differences between the two variables
and how significant those differences are.
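For a categorical feature, the association with the target can be inspected through a cross-tabulation of counts and its row percentages. A minimal sketch, assuming a hypothetical eight-student excerpt (the values below are invented for illustration and are not taken from the data set); `crosstab` and `row_percentages` are illustrative helper names:

```python
from collections import Counter


def crosstab(feature, target):
    """Count co-occurrences of feature values (rows) and target classes (columns)."""
    counts = Counter(zip(feature, target))
    rows = sorted(set(feature))
    cols = sorted(set(target))
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


def row_percentages(table):
    """Express each row of a cross-tab as percentages of that row's total."""
    result = {}
    for r, row in table.items():
        row_total = sum(row.values())
        result[r] = {c: round(100 * n / row_total, 1) for c, n in row.items()}
    return result


# Hypothetical excerpt: Relation feature against the Class target (L, M, H).
relation = ["Father", "Mum", "Mum", "Father", "Father", "Mum", "Father", "Mum"]
klass = ["L", "H", "H", "M", "L", "M", "M", "H"]

table = crosstab(relation, klass)
print(table)
print(row_percentages(table))
```

Row percentages make groups of different sizes comparable, which matters here because the feature categories are unevenly populated.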

Figure 16 : Bivariate Analysis – Gender & Class
Figure 17 : Bivariate Analysis – Stage ID & Class
Figure 18 : Bivariate Analysis – Section ID & Class
Figure 19 : Bivariate Analysis – Semester & Class
Figure 20 : Bivariate Analysis – Relation & Class
Figure 21 : Bivariate Analysis – ParentAnsweringSurvey & Class
Figure 22 : Bivariate Analysis – ParentSchoolSatisfaction & Class
Figure 23 : Bivariate Analysis – StudentAbsenceDays & Class
Figure 24 : Bivariate Analysis – raisedhands & Class
Figure 25 : Bivariate Analysis – Visited Resources & Class
Figure 26 : Bivariate Analysis – Discussion & Class
Figure 27 : Bivariate Analysis – Announcements View & Class
Figure 28 : Bivariate Analysis – Nationality & Class
Figure 29 : Bivariate Analysis – Place of Birth & Class
Figure 30 : Bivariate Analysis – Grade ID & Class
Figure 31 : Bivariate Analysis – Topic & Class

4) Bivariate Analysis – Report – Target Variable = Class

Gender (Table 4 : Gender & Class Score): Compared with Class, female students score highest in the High level and male
students score highest in the Low level. Female academic performance is higher than male.

Nationality (Table 5 : Nationality & Class Score): Compared with Class, Jordan and Egypt have the highest percentage compared
to other countries.

PlaceofBirth (Table 6 : PlaceofBirth & Class Score): Compared with Class, Jordan and Egypt have the highest count compared to
other countries.

StageID (Table 7 : Stage ID & Class Score): Middle School and Lower Level have a high level of scores with respect to Class.

GradeID (Table 8 : Grade ID & Class Score): G-02, G-08 and G-09 have the highest scores compared to other grades.

SectionID (Table 9 : Section ID & Class Score): Compared with Class, Section A ranks highest in all 3 class categories.

Topic (Table 10 : Topic & Class Score)

Semester (Table 11 : Semester & Class Score): In the second semester, the count is lower in the Low Level and higher in the
other levels.

Relation (Table 12 : Relation & Class Score): Compared with Class, the high-level students are greatly supported and motivated
by mothers.

ParentAnsweringSurvey (Table 13 : ParentAnsweringSurvey & Class Score): Compared with Class, there were more ‘yes’ answers for
H and M and fewer for L.

ParentschoolSatisfaction (Table 14 : ParentSchoolSatisfaction & Class Score): Compared with Class, a large majority of parents
are satisfied with the education received. The count of least satisfied parents is comparatively low.

StudentAbsenceDays (Table 15 : StudentAbsenceDays & Class Score): The biggest visual trend is how frequently the student was
absent. Over 90% of the students who did poorly were absent more than seven times, while almost none of the students who did
well were absent more than seven times.

Raisedhands

AnnouncementsView: Female students participated more in viewing announcements.

visitedResources: Female students visited the resources more often.

Discussion: Female students participated more in Discussion.

5) Correlation: Correlation [10] is a bivariate analysis that measures the strength of association between two variables and
the direction of the relationship. The correlation value lies between +1 and -1.
Types of correlation are:
Variable types                               Method
Numeric vs Numeric                           Pearson
Categorical (Binary Feature) vs Numerical    Pointbiserialr
Ordinal with Ordinal                         Spearman Rho
Categorical vs Categorical                   Cross Tab
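The three coefficient-based methods listed above are closely related, which a short standard-library sketch can make concrete: Spearman's rho is the Pearson correlation of the ranks, and the point-biserial correlation is the Pearson correlation with the binary feature coded 0/1. The six-student values and the function names below are hypothetical. In practice SciPy provides `pearsonr`, `pointbiserialr` and `spearmanr` in `scipy.stats`; whether the authors used them is an assumption.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def point_biserial(binary, numeric):
    """Point-biserial correlation: Pearson with the binary feature coded 0/1."""
    return pearson(binary, numeric)


# Hypothetical values for six students (not taken from the data set).
raised_hands = [10, 20, 30, 50, 70, 80]
visited = [12, 25, 20, 60, 75, 90]
absent_over7 = [1, 1, 1, 0, 0, 0]  # 1 = absent more than seven days
class_level = [0, 0, 1, 1, 2, 2]   # ordinal coding: L=0, M=1, H=2

r_numeric = pearson(raised_hands, visited)
r_binary = point_biserial(absent_over7, raised_hands)
rho = spearman(class_level, raised_hands)
print(r_numeric, r_binary, rho)
```

The fourth method, the cross-tab, produces a table of counts rather than a single coefficient, so it is not covered by this sketch.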

Different types of correlation have been applied depending on the type of variable. For the given data set, the correlation
methods adopted are depicted in Table 16.
Table 16 : Correlation Methods Applied for the Dataset

The following inferences have been drawn from Table 17. The correlations among features, computed with the crosstab function,
Spearman Rho, Pearson and pointbiserialr, show that the following features are correlated and could be included for modelling:
Nationality, Place of Birth, Stage ID, Grade ID, Section ID, Topic, Semester, Relation, Class, Parent Answering Survey, Parent
School Satisfaction and Student Absence Days, along with the numerical features. Other features could be included later, if
required, on the basis of feature importance.
Table 17 : Correlation Methods Tabulated Values

E. Feature Engineering Concepts [11]

Feature engineering is the process of converting data into features that act as inputs to machine learning models. Variable
transformation is applied in this study: most of the columns in the given data set are categorical and need to be converted to
numerical values. The conversion is done with the Label Encoding method [12]. The output of the label encoding is shown in
Figure 34 and the formula applied for the label encoding is shown in the Figure 32

Figure 33 : Label Encoding Code

Figure 34 : Label Encoder: Categorical to Numeric Converted Values
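Label encoding replaces each distinct category with an integer code. The sketch below is a standard-library illustration of the idea, not the code from Figure 33: the helper names are hypothetical, and the codes are assigned in sorted order of the labels, which is also what scikit-learn's `LabelEncoder` does.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted label order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}


def transform(values, mapping):
    """Replace every category with its integer code."""
    return [mapping[v] for v in values]


# Hypothetical excerpt of the Class column (L, M, H).
class_labels = ["M", "L", "H", "M", "L"]

mapping = fit_label_encoder(class_labels)
encoded = transform(class_labels, mapping)
print(mapping)
print(encoded)
```

Because the codes follow alphabetical order, H, L and M receive 0, 1 and 2, which does not match the natural order Low, Middle, High. That is harmless for a target label, but for ordinal input features an explicit mapping may be preferable.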

The classification algorithms [13] used in this paper are:
1) Logistic Regression
2) Random Forest
3) K Nearest Neighbors Algorithm
4) Decision Tree
5) XG Boost
6) Support Vector Machine

IV. EXPERIMENTAL RESULTS

The transformed data set is partitioned into a training set and a test set: the training data is 70% of the whole data set and
the remaining 30% is used as the test set. The random state is set to 0. The parameters applied for the various algorithms are
depicted in Table 18. The experimental results before feature engineering are depicted in Table 19. Sample code for Logistic
Regression and its classification report are shown in Table 20 and Figure 35.
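The 70/30 partition with a fixed random state can be sketched without any library as a seeded shuffle of row indices. The function name is hypothetical, and the permutation produced here is not the one scikit-learn's `train_test_split` would produce for the same seed; only the split sizes and the reproducibility carry over.

```python
import random


def train_test_split_indices(n_rows, test_fraction, seed):
    """Shuffle row indices reproducibly and split them into (train, test)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


# 480 instances (Table 1), 30% held out, random state 0 as stated above.
train_idx, test_idx = train_test_split_indices(480, 0.30, seed=0)
print(len(train_idx), len(test_idx))
```

With 480 instances this gives 336 training rows and 144 test rows. A test set of 144 rows appears consistent with the testing scores in Table 19: for example, 75.69% corresponds to 109 correct predictions out of 144.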
Table 18 : Parameters For Model Fitting
Model Type                 Parameters for Fitting the Model
Logistic Regression        LogisticRegression(solver='lbfgs', multi_class='auto', max_iter=2000)
Random Forest              RandomForestClassifier(n_jobs=-1, random_state=123, criterion='gini', max_depth=3)
KNN                        KNeighborsClassifier(n_neighbors=7)
SVM                        svm.SVC(kernel='rbf', gamma='auto')  # RBF kernel
XGBOOST                    xgb.XGBClassifier(max_depth=10, learning_rate=0.1, n_estimators=100, seed=10)
DECISION TREE – Gini       DecisionTreeClassifier(criterion="gini", random_state=100, max_depth=7, min_samples_leaf=5)
DECISION TREE - Entropy    DecisionTreeClassifier(criterion="entropy", random_state=100, max_depth=7, min_samples_leaf=5)

Table 19 : Experimented Results – Before Feature Engineering

Model Type Training Score Testing Score
Logistic Regression 79.16 75.0
Random Forest 82.44 75.69
KNN 75.0 61.1
SVM 99.70 50.0
XGBOOST 100.0 74.30
DECISION TREE – Gini 86.90 70.83
DECISION TREE - Entropy 85.11 67.36
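A quick way to read Table 19 is to look at the gap between training and testing scores, since a large gap suggests overfitting. The sketch below copies the scores from Table 19; the variable names are illustrative.

```python
# (training score, testing score) in percent, copied from Table 19.
scores_before = {
    "Logistic Regression": (79.16, 75.0),
    "Random Forest": (82.44, 75.69),
    "KNN": (75.0, 61.1),
    "SVM": (99.70, 50.0),
    "XGBOOST": (100.0, 74.30),
    "DECISION TREE - Gini": (86.90, 70.83),
    "DECISION TREE - Entropy": (85.11, 67.36),
}

# Train-minus-test gap per model, in percentage points.
gaps = {
    model: round(train - test, 2)
    for model, (train, test) in scores_before.items()
}
largest_gap = max(gaps, key=gaps.get)

for model, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{model:<24} {gap:6.2f}")
```

SVM and XGBOOST fit the training data almost perfectly but generalise far less well, while Logistic Regression shows the smallest gap.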

Table 20 : Training & Testing Code – Logistic Regression Algorithm

Training Score Code:
from sklearn.linear_model import LogisticRegression
# Fit logistic regression on the training split
Logit_Model = LogisticRegression(solver='lbfgs', multi_class='auto', max_iter=2000)
Logit_Model.fit(X_train, Y_train)
# Mean accuracy on the training data
Logit_Model.score(X_train, Y_train)

Testing Score Code:
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
# Predict the held-out test split and evaluate the predictions
prediction = Logit_Model.predict(X_test)
score = accuracy_score(Y_test, prediction)
report = classification_report(Y_test, prediction)

Figure 35 : Logistic Regression – Classification Report
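For readers unfamiliar with the classification report, the sketch below computes by hand the quantities that scikit-learn's `classification_report` prints per class (precision, recall, F1-score and support), together with overall accuracy. The eight labels and predictions are hypothetical and do not reproduce Figure 35.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_report(y_true, y_pred):
    """Precision, recall, F1 and support for every class present in y_true."""
    report = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        support = sum(t == label for t in y_true)
        precision = tp / predicted if predicted else 0.0
        recall = tp / support
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return report


# Hypothetical true and predicted Class labels for eight students.
y_true = ["L", "L", "M", "M", "M", "H", "H", "H"]
y_pred = ["L", "M", "M", "M", "H", "H", "H", "M"]

acc = accuracy(y_true, y_pred)
report = per_class_report(y_true, y_pred)
print(acc)
for label, row in report.items():
    print(label, row)
```

Per-class figures matter here because the three classes are not equally frequent, so accuracy alone can hide weak performance on the smaller classes.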

A. Feature Importance
1) Random Forest Feature Importance [14]: Random forests are among the most popular machine learning methods thanks to
their relatively good accuracy, robustness and ease of use. They also provide two straightforward methods for feature selection:
mean decrease impurity and mean decrease accuracy.
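Mean decrease impurity is built from the impurity reduction achieved at individual splits; roughly, the reductions of all splits that use a feature are weighted by the number of samples reaching them and averaged over the trees. The sketch below shows that building block for a single Gini split. The node contents and the split are hypothetical, and the helper names are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini(left) + len(right) / n * gini(right)
    )
    return gini(parent) - weighted_children


# Hypothetical node of eight students, split on StudentAbsenceDays.
parent = ["L", "L", "L", "M", "M", "H", "H", "H"]
left = ["L", "L", "L", "M"]    # absent above 7 days
right = ["M", "H", "H", "H"]   # absent under 7 days

decrease = impurity_decrease(parent, left, right)
print(gini(parent), decrease)
```

A feature whose splits produce large decreases across many trees ends up with a high importance score, which is how such a ranking can guide feature selection.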
2) Experimental Results after Feature Engineering: After the feature engineering process is applied, the data set is divided
into a training set and a test set: the training data is 70% of the whole data set and the remaining 30% is used as the test
set. The random state is set to 50 here, whereas in the previous phase it was set to 0.

Table 21 : Experimented Results – After Feature Engineering

Model Type                 Training Score    Testing Score    Remarks
Logistic Regression        87.20             86.81            Good
Random Forest              94.05             90.97            Fair
KNN                        81.54             82.63            Good
SVM                        97.91             83.33            Needs more Testing Effort
XGBOOST                    97.02             90.27            Needs more Testing Effort
DECISION TREE – Gini       81.25             76.38            Needs more Testing Effort
DECISION TREE - Entropy    80.65             81.25            Good

V. CONCLUSION
Machine learning is being adopted rapidly: a model can learn to predict the outcome of a system, be trained on data over a
period of time, and be tested on a different set of data to show that it works efficiently and effectively. In this study,
Logistic Regression obtained a training score of 87.20 and a testing score of 86.81; the closeness of the two scores indicates
that the model works effectively without a pronounced bias or variance problem. KNN and Decision Tree (Entropy) also work
well, while the other algorithms implemented in this study need further feature engineering and more thorough data analysis.
Model deployment has been done for all algorithms, and the sample input given for evaluation was classified correctly by all
of them.
VI. FUTURE SCOPE
The present study, which predicts the academic performance of students from various features, has produced positive results.
This research work makes the prediction of student performance more effective. In future, this work can be extended by using
other feature(s) as the target variable.
A. Other features, such as financial factors, physical health factors and food habits, can also be included in upcoming
research, since these factors can also affect the academic performance of the student directly or indirectly.
B. Since the present study focused on predicting the academic performance [5] of the student, the additional factors can also
be used to predict student performance from a behavioural perspective, not only from an academic point of view.

REFERENCES
[1] Smola, Alex, and S.V.N. Vishwanathan. Introduction to Machine Learning. Cambridge University Press, 2008.
[2] Amjad Abu Saa. (2016) “Educational Data Mining & Students’ Performance Prediction” International Journal of Advanced Computer Science and
Applications, Vol. 7, No. 5, 2016.
[3] Ahmed Mueen, Bassam Zafar and Umar Manzoor. (2016) “Modeling and Predicting Students’ Academic Performance Using Data Mining Techniques” I.J.
Modern Education and Computer Science, 2016, 11, 36-42.
[4] Bhrigu Kapur, Nakin Ahluwalia and Sathyaraj R, “Comparative Study on Marks Prediction using Data Mining and Classification Algorithms”, International
Journal of Advanced Research in Computer Science, 8 (3), March-April 2017,632-636
[5] Prasada Rao, K. , M. V.P. Chandra Sekhara, and B. Ramesh. "Predicting Learning Behavior of Students using Classification Techniques." International
Journal of Computer Applications (0975 – 8887) Volume 139 – No.7, April 2016.

[6] Amrieh, E. A., Hamtini, T. & Aljarah, I. (2016). Mining educational data to predict Student’s academic performance using ensemble methods. International
Journal of Database Theory and Application, 9(8), pp. 119–136. doi: 2016.9.8.13.
[7] Sundar PVP. A Comparative Study For Predicting Students Academic Performance using Bayesian Network Classifiers. IOSR Journal of Engineering. 2013
Feb; 3(2):37–42.
[8] S. T. Hijazi, and R. S. M. M. Naqvi, “Factors affecting student’s performance: A Case of Private Colleges”, Bangladesh e-Journal of Sociology, Vol. 3, No. 1,
2006
[9] C. Romero, “Educational Data Mining: A Review of the State of the Art”, IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and
Reviews, Vol. 40, 2010.
[10] https://www.statisticssolutions.com/correlation-pearson-kendall-spearman/
[11] https://www.kdnuggets.com/2018/12/feature-engineering-explained.html
[12] https://towardsdatascience.com/encoding-categorical-features-21a2651a065c
[13] https://www.cs.princeton.edu/~schapire/talks/picasso-minicourse.pdf
[14] https://blog.datadive.net/selecting-good-features-part-iii-random-forests/

Heart Prediction
No ratings yet
Heart Prediction
15 pages
Design and Implementing Heart Disease Prediction Using Naives Bayesian Dept. of Cse
No ratings yet
Design and Implementing Heart Disease Prediction Using Naives Bayesian Dept. of Cse
16 pages
Project Report Hate
100% (1)
Project Report Hate
24 pages
Predicting Cardiovascular Disease Using Logistic Regression Research Paper
No ratings yet
Predicting Cardiovascular Disease Using Logistic Regression Research Paper
4 pages
Face Recognition Attendance System
No ratings yet
Face Recognition Attendance System
18 pages
Final Project Report Crime Data
No ratings yet
Final Project Report Crime Data
65 pages
Classification of Cancerous Profiles Using Machine Learning
No ratings yet
Classification of Cancerous Profiles Using Machine Learning
6 pages
Introduction to Data Science
No ratings yet
Introduction to Data Science
25 pages
Time Series
No ratings yet
Time Series
29 pages
Childhood Asthma Prediction Model Using SVM
No ratings yet
Childhood Asthma Prediction Model Using SVM
9 pages
Explain Machine Learning Model Using SHAP
No ratings yet
Explain Machine Learning Model Using SHAP
28 pages
Build A Machine Learning Portfolio
No ratings yet
Build A Machine Learning Portfolio
18 pages
Logistic+Regression - Done
100% (1)
Logistic+Regression - Done
41 pages
Heart Disease Prediction Using Machine Learning-1
No ratings yet
Heart Disease Prediction Using Machine Learning-1
6 pages
Data Ethics Framework 2
No ratings yet
Data Ethics Framework 2
23 pages
Random Forest
No ratings yet
Random Forest
18 pages
Logistic Regression Example
100% (1)
Logistic Regression Example
22 pages
Duda Solutions PDF
No ratings yet
Duda Solutions PDF
77 pages
Madhan-1
No ratings yet
Madhan-1
90 pages