Class Result Prediction Using Machine Learning
Abstract— More than 2.5 quintillion bytes of data is being generated across the globe. In fact, as much as 90% of the data in the world today has been created in the last two years alone. Big data describes the large volume of data that inundates a business on a day-to-day basis. Huge amounts of data are being generated by everything around us at all times, produced by every digital process and social media exchange, through systems, sensors, mobile devices, etc. Big data analytics examines large amounts of data to uncover hidden patterns, correlations, and other insights. To extract meaningful value from big data, one needs optimal processing power, analytics capabilities, and skills. Using the concept of machine learning, a number of algorithms are explored in order to predict the results of a class of students. Based on the performance of the students in the previous semester and the scores of the internal examinations of the current semester, the final result, whether the student passes or fails the current semester, is computed before the final examination actually takes place.

I. INTRODUCTION

Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can change when exposed to new data [4]. Its applications include pedestrian detection [12], multiple object recognition with visual attention [14], the neural networks behind Google Voice and Google web search [11], deep neural networks trained with CPU and GPU math expression compilers [16], and many more. Using the concept of machine learning, a number of algorithms are explored in order to predict the results of a class of students. In machine learning terms, this is a classification problem. Hence, various supervised learning algorithms, namely Support Vector Machine, Naïve Bayes Classifier, Random Forest Classifier, and Gradient Boosting, are used. The accuracy obtained by each of the algorithms is then compared in order to identify the algorithm that is most suitable for the problem.
…and make predictions. The prescription stage of machine learning, ushering in a new era of man-machine collaboration, will require the biggest change in the way we work.

Tom M. Mitchell describes applications of machine learning that are routinely used in various areas of computer science, such as algorithms for speech recognition, computer vision, bio-surveillance, and robot control, and notes that machine learning has been used to discover hidden regularities in the continuously growing volumes of online data [10].
Learning and evaluating such models, in particular, presents a variety of challenges, such as acquiring machine learning skills in the domain area, collecting the data, and handling algorithmic complexity. In this paper we have made an attempt to predict the results of a batch of students based on their previous performance.
III. METHODOLOGY

The application of machine learning is portrayed here. The general pipeline used for essentially all machine learning problems consists of [2]:
1. Define the problem.
2. Collect data.
3. Design features.
4. Train the model.
5. Test the model.
The problem considered here is to predict the fourth semester results of the students based on the results of the students in their third semester and their internal examination scores in the current semester.

The data collected from BMSIT & M for the 2014-18 batch of students of Information Science and Engineering is used as the sample.

The data consists of three features for each subject: the internal score, the external score, and the total score. The final feature shows the status of the result, i.e. whether the student passed or failed the semester. The criterion for this is for the student to score a minimum of 35 in the external examination and a minimum total of 50 in every subject. The model is built to predict the results based on this criterion.
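As a concrete illustration, the pass/fail label implied by this criterion could be derived as in the following sketch. The per-subject column names (sub1_internal, sub1_external, sub1_total, and so on) and the sample values are hypothetical, since the paper does not describe the dataset layout:

import pandas as pd

# Hypothetical layout: three columns per subject, as described above.
df = pd.DataFrame({
    "sub1_internal": [22, 18], "sub1_external": [40, 30], "sub1_total": [62, 48],
    "sub2_internal": [25, 20], "sub2_external": [45, 38], "sub2_total": [70, 58],
})
subjects = ["sub1", "sub2"]  # placeholder subject codes

def passed(row):
    # Criterion from the paper: a minimum of 35 in the external
    # examination and a minimum total of 50 in every subject.
    return all(row[f"{s}_external"] >= 35 and row[f"{s}_total"] >= 50
               for s in subjects)

# 1 = pass, 0 = fail; this becomes the target label for training.
df["result"] = df.apply(passed, axis=1).astype(int)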
The third semester scores and results are used as the sample for training the model. The internal scores are provided as the input data for the training model, and the result is given as the output. Hence, the training model is built on the dependence of the final result on the internal scores of the students.

The trained model is tested on the fourth semester results. The internal scores of the students are given as the input to the machine, and the final result is the output predicted by the machine.

The result predicted by the trained model with each of the algorithms is then compared with the actual results to compute the accuracy of the models for the selected problem.
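In code, this arrangement amounts to fitting on one semester and scoring on the next. The sketch below uses scikit-learn and synthetic stand-in arrays (the real features would be the per-subject internal scores described above); any of the four classifiers discussed next can be passed as the estimator:

import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Stand-in data: rows are students, columns are per-subject internal scores.
X_train = rng.integers(0, 51, size=(100, 8)).astype(float)  # 3rd-sem internals
y_train = (X_train.mean(axis=1) > 25).astype(int)           # 3rd-sem results
X_test = rng.integers(0, 51, size=(100, 8)).astype(float)   # 4th-sem internals
y_test = (X_test.mean(axis=1) > 25).astype(int)             # actual 4th-sem results

def evaluate(estimator):
    # Train on the third semester, predict the fourth, score against actuals.
    estimator.fit(X_train, y_train)
    return accuracy_score(y_test, estimator.predict(X_test))

print(evaluate(SVC()))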
A. Support Vector Machine

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier [3].

In this algorithm, each data item is plotted as a point in n-dimensional space (where n is the number of features) with the value of each feature being the value of a particular coordinate. Classification is then performed by finding the hyperplane that differentiates the two classes.

When two classes of data are linearly separable, infinitely many hyperplanes could be drawn to separate the two classes. All of these hyperplanes can classify the data into two classes, and the best among them is selected by the SVM classifier for the prediction model. One reasonable standard for judging the quality of these hyperplanes is their margin lengths [2].

Since the output of the problem considered is one of two values, i.e. pass or fail, the SVM algorithm can be used to predict the required results.
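The margin criterion can be seen on toy data. The sketch below (an illustration, not the paper's experiment) fits a linear SVM with scikit-learn and recovers the separating hyperplane and its margin width 2/||w||:

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2-D.
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [5.0, 5.0], [5.5, 6.0], [6.0, 5.5]])
y = np.array([0, 0, 0, 1, 1, 1])

svm = SVC(kernel="linear", C=1e6).fit(X, y)  # large C approximates a hard margin
w, b = svm.coef_[0], svm.intercept_[0]
print(f"hyperplane: {w[0]:.2f}*x1 + {w[1]:.2f}*x2 + {b:.2f} = 0")
print("margin width:", 2 / np.linalg.norm(w))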
B. Naïve Bayes Classifier

In machine learning, Naïve Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features. Naïve Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. It is not a single algorithm for training such classifiers, but a family of algorithms based on a common principle: all Naïve Bayes classifiers assume that the value of a particular feature is independent of the value of any other feature, given the class variable [5].

One of the main assumptions of the Naïve Bayes algorithm is that each feature is independent, which holds well for the problem considered, since the score of a student in each subject is largely independent, though it may be correlated with similar subjects. Because of this assumption, the classifier is very effective for this problem.
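A sketch with scikit-learn's GaussianNB, which treats each subject's score as an independent (given the class) normally distributed feature; the data here is synthetic, and the toy labelling rule is only a stand-in for the real results:

import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
X = rng.integers(0, 51, size=(100, 8)).astype(float)  # synthetic internal scores
y = (X.mean(axis=1) > 25).astype(int)                 # toy pass/fail labels

nb = GaussianNB().fit(X, y)
# The independence assumption lets the model combine one likelihood
# per subject; predict_proba exposes the resulting class probabilities.
print(nb.predict_proba(X[:1]))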
C. Random Forest Classifier

Random forests grow many classification trees. To classify a new object from an input vector, the input vector is put down each of the trees in the forest. Each tree gives a classification, and we say the tree "votes" for that class. The forest chooses the classification having the most votes over all the trees in the forest [6].

Random forests are very fast, and they do not overfit. The user can also choose how many trees to grow in the classifier. Since the computation aggregates a number of tree classifiers, Random Forest classifiers are, theoretically, more accurate and efficient than individual Naïve Bayes classifiers.

The output obtained by varying the number of trees is tabulated, and the optimal value for the number of trees is chosen. The accuracy at this value is taken as the accuracy of the Random Forest classifier.
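The sweep over the number of trees could be reproduced along these lines (a sketch with synthetic data; the paper's actual sweep is reported in Fig. 1):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(2)
X_train = rng.integers(0, 51, size=(100, 8)).astype(float)
y_train = (X_train.mean(axis=1) > 25).astype(int)
X_test = rng.integers(0, 51, size=(100, 8)).astype(float)
y_test = (X_test.mean(axis=1) > 25).astype(int)

for n_trees in (1, 5, 10, 25, 50):
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    rf.fit(X_train, y_train)  # each tree votes; the majority wins
    acc = accuracy_score(y_test, rf.predict(X_test))
    print(f"{n_trees:3d} trees -> accuracy {acc:.4f}")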
D. Gradient Boosting

Gradient boosting was developed by Stanford professor Jerome Friedman. Gradient boosting develops an ensemble of tree-based models by training each of the trees in the ensemble on different labels and then combining the trees [7].

Boosting is an ensemble learning algorithm which combines the predictions of several base estimators in order to improve robustness over a single estimator. It combines multiple weak or average predictors to build a strong predictor [8].
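A corresponding sketch with scikit-learn's GradientBoostingClassifier, which builds shallow trees stage by stage, each new tree fitted to the gradient of the loss (roughly, the current errors) of the ensemble so far; the parameters shown are illustrative defaults, not the paper's settings:

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(3)
X = rng.integers(0, 51, size=(100, 8)).astype(float)  # synthetic internal scores
y = (X.mean(axis=1) > 25).astype(int)                 # toy pass/fail labels

# n_estimators weak trees are added one at a time; learning_rate shrinks
# each tree's contribution to the combined (strong) predictor.
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                max_depth=3).fit(X, y)
print(gb.score(X, y))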
IV. RESULTS

The models created by each of the algorithms are used to produce an output for each student. This output is then compared with the actual results of that semester, and the accuracy is determined.

First, the number of trees and the corresponding output accuracy are tabulated for the Random Forest classifier, and the optimal number is selected.

Fig. 1. Number of trees used in the RF classifier and the corresponding accuracy.
From Fig. 1, the optimal value for the number of trees is 10. Hence, the corresponding accuracy is selected for the Random Forest classifier.

Next, a barplot representing the actual result of each student is presented. The students are represented on the X-axis and the result on the Y-axis. These plots show the clear distinction in the inaccuracies of each of the predicted outputs of the models.

Fig. 6. Predicted result by the Gradient Boosting Algorithm.

The results predicted by each of these algorithms, when compared with the actual result of that semester, give the accuracy of the predictions. The predictive accuracy of each of these algorithms is computed and tabulated below.

TABLE II. PREDICTIVE ACCURACY OF THE ALGORITHMS

Serial No.   Algorithm used             Accuracy (%)
1            Support Vector Machine     87.5
2            Naïve Bayes Classifier     87.5
3            Random Forest Classifier   89.0625
4            Gradient Boosting          82.8125

Table representing the various algorithms used and the corresponding accuracy.
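The comparison in Table II could be reproduced along the lines of the sketch below; the data is a synthetic stand-in, so the printed numbers will not match the table, which comes from the BMSIT & M dataset:

import numpy as np
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(4)
X_train = rng.integers(0, 51, size=(128, 8)).astype(float)
y_train = (X_train.mean(axis=1) > 25).astype(int)
X_test = rng.integers(0, 51, size=(128, 8)).astype(float)
y_test = (X_test.mean(axis=1) > 25).astype(int)

models = {
    "Support Vector Machine": SVC(),
    "Naive Bayes Classifier": GaussianNB(),
    "Random Forest Classifier": RandomForestClassifier(n_estimators=10,
                                                       random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = 100 * accuracy_score(y_test, model.predict(X_test))
    print(f"{name:26s} {acc:7.4f} %")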
V. CONCLUSION
ACKNOWLEDGMENT
We thank the Department of Information Science and Engineering, BMS Institute of Technology, for providing the data required.
REFERENCES

[1] Andreas C. Müller and Sarah Guido, Introduction to Machine Learning with Python. O'Reilly Media, September 2016.
[2] Aggelos Konstantinos Katsaggelos, Jeremy Watt, and Reza Borhani, Machine Learning Refined: Foundations, Algorithms, and Applications.
[3] "Support vector machine," Wikipedia, the free encyclopedia. https://en.wikipedia.org/wiki/Support_vector_machine
[4] "Definition of machine learning," TechTarget. http://whatis.techtarget.com/definition/machine-learning
[5] "Naive Bayes classifier," Wikipedia, the free encyclopedia. https://en.wikipedia.org/wiki/Naive_Bayes_classifier
[6] Leo Breiman and Adele Cutler, "Random Forests." https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm
[7] Michael Bowles, Machine Learning in Python: Essential Techniques for Predictive Analysis. John Wiley & Sons, Inc., 2015.
[8] Sunil Ray, "Common Machine Learning Algorithms," Analytics Vidhya. https://www.analyticsvidhya.com/blog/2015/08/common-machine-learning-algorithms/
[9] McKinsey Quarterly, "An executive's guide to machine learning."
[10] Tom M. Mitchell, "The Discipline of Machine Learning," July 2006.
[11] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng, "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015. Software available from tensorflow.org.
[12] Anelia Angelova, Alex Krizhevsky, and Vincent Vanhoucke, "Pedestrian detection with a large-field-of-view deep network," in 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 704–711. IEEE, 2015.
[13] Arvind and Rishiyur S. Nikhil, "Executing a program on the MIT tagged-token dataflow architecture," IEEE Trans. Comput., 39(3):300–318, 1990. dl.acm.org/citation.cfm?id=78583
[14] Jimmy Ba, Volodymyr Mnih, and Koray Kavukcuoglu, "Multiple object recognition with visual attention," arXiv preprint arXiv:1412.7755, 2014. arxiv.org/abs/1412.7755
[15] Françoise Beaufays, "The neural networks behind Google Voice transcription," 2015. googleresearch.blogspot.com/2015/08/the-neural-networks-behind-google-voice.html
[16] James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio, "Theano: A CPU and GPU math expression compiler," in Proceedings of the Python for Scientific Computing Conference (SciPy), volume 4, page 3. Austin, TX, 2010.
[17] Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, and Evan Shelhamer, "cuDNN: Efficient primitives for deep learning," arXiv preprint arXiv:1410.0759, 2014. arxiv.org/abs/1410.0759
[18] Trishul Chilimbi, Yutaka Suzue, Johnson Apacible, and Karthik Kalyanaraman, "Project Adam: Building an efficient and scalable deep learning training system," in 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pages 571–582, 2014. www.usenix.org/system/files/conference/osdi14/osdi14-paper-chilimbi.pdf
[19] Jack Clark, "Google turning its lucrative web search over to AI machines," 2015. www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines