Machine Learning Models in Stock Market Prediction
Article in International Journal of Innovative Technology and Exploring Engineering · January 2022
DOI: 10.35940/ijitee.C9733.0111322
Author:
Gurjeet Singh
Abstract: The paper focuses on predicting the Nifty 50 Index using eight Supervised Machine Learning models. The techniques used for the empirical study are Adaptive Boost (AdaBoost), k-Nearest Neighbors (kNN), Linear Regression (LR), Artificial Neural Network (ANN), Random Forest (RF), Stochastic Gradient Descent (SGD), Support Vector Machine (SVM) and Decision Tree (DT). The experiments are based on historical data of the Nifty 50 Index of the Indian Stock Market from 22nd April, 1996 to 16th April, 2021, a time series of around 25 years. The period covers 6220 trading days, excluding all non-trading days. The entire trading dataset was divided into four subsets of different sizes: 25%, 50%, 75% and 100% of the data. Each subset was further divided into two parts, training data and testing data. After applying three tests (Test on Training Data, Test on Testing Data and Cross Validation Test) on each subset, the prediction performance of the models was compared, with very interesting results. The evaluation results indicate that Adaptive Boost, k-Nearest Neighbors, Random Forest and Decision Tree underperformed as the size of the data set increased. Linear Regression and the Artificial Neural Network showed almost similar prediction results, the best among all the models, but the Artificial Neural Network took more time in training and validating. Thereafter the Support Vector Machine performed best among the remaining models, but with increasing data set size Stochastic Gradient Descent performed better than the Support Vector Machine.
Keywords: Artificial Neural Network, Stock Market Prediction, Supervised Machine Learning Models, Time Series Data

Manuscript received on January 15, 2022. Revised Manuscript received on January 24, 2022. Manuscript published on February 28, 2022.
* Correspondence Author: Dr. Gurjeet Singh, Associate Professor & Dean, Lords School of Computer Applications & IT, Lords University, Alwar, Rajasthan, India. E-mail: research.gurjeet@gmail.com
© The Authors. Published by Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

I. INTRODUCTION

Stock markets are the most popular financial market instrument. Perfect prediction of stock market indices and stock prices is very difficult due to their highly dynamic nature. According to Abu-Mostafa & Atiya (1996), stock market prediction is considered a very difficult task in financial time series prediction. According to Tan, Quek, & See (2007), the stock market is affected by many macroeconomic factors such as political events, firms' policies, general economic conditions, investors' expectations, institutional investors' decisions, other stock market movements, and investor psychology. Many empirical studies have dealt with predicting stock index movements in the developed financial markets, but very little research is found in the literature on predicting index movements in the Indian Stock Market, especially the Nifty 50 Index.

My previous research works in the area of the stock market (Singh & Nagar, 2012; Nagar & Singh, 2012; Nagar & Issar, 2013) motivated me to study further, as research in this area is still very limited.

The NIFTY 50 Index [39] consists of 50 stocks from 13 different sectors of the Indian economy. The ANN, LR, SGD, SVM, AdaBoost, RF, kNN and DT machine learning techniques are used for modeling and predicting the movements of the index. The main objective of this empirical study is to explain and validate the predictability of stock index movements using the above models and to compare the performance of the techniques used.

This research paper is divided into six sections. Section II provides the literature review of stock market prediction using machine learning. Section III explains the research data. Section IV describes the prediction models used in this empirical study. Section V describes the experiments and the empirical results of the comparative analysis. The last section, Section VI, contains the concluding remarks.

II. LITERATURE REVIEW

Using Artificial Neural Networks and Support Vector Machines, Y. Kara et al. (2011) attempted to predict the movements of stock prices on the Istanbul Stock Exchange. They prepared two prediction models and compared their performances: the average prediction performance of the ANN model was 75.74%, and that of the SVM model was 71.52%. Alaa F. et al. (2015) explored the use of Multiple Linear Regression and the Support Vector Machine (SVM) to develop models for predicting the S&P 500 stock market index. Twenty-seven potential financial and economic variables that impact stock movement were adopted to build a relationship between the stock index and these variables. The constructed SVM model with a Radial Basis Function (RBF) kernel provided good prediction capabilities with respect to the regression and ANN models. Madge and Bhatt (2015) predicted the movements of stock prices using SVM; 34 technology stocks and 4 parameters (technical indicators) were considered in their study. The work concluded that short-term predictions had very low accuracy, whereas long-term prediction accuracy stood between 55% and 60%. Aparna Nayak et al. (2016) used historical and social media data.
Published By:
Retrieval Number: 100.1/ijitee.C97330111322 Blue Eyes Intelligence Engineering
DOI: 10.35940/ijitee.C9733.0111322 and Sciences Publication (BEIESP)
Journal Website: www.ijitee.org 18 ©Copyright: All rights reserved.
International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075 (Online), Volume-11 Issue-3, January 2022
The performance comparison was performed on Data Set-1, Data Set-2, Data Set-3 and Data Set-4 (the entire data set) using the ANN, LR, SGD, SVM, AdaBoost, RF, kNN and DT models. The results for each model and data set were compared against each other.

IV. METHODOLOGY

A. Adaptive Boosting (AdaBoost)
Boosting is an ensemble method that is mostly used for prediction. It is a family of algorithms that converts weak learners into a strong learner: weak learners are trained sequentially, each one correcting the mistakes of its predecessors. According to R. E. Schapire (2003), AdaBoost was the first boosting algorithm successfully applied to binary classification. The AdaBoost algorithm is used to raise the performance of any machine learning algorithm and is most often combined with weak learners.

B. k-Nearest Neighbors (kNN)
The k-Nearest Neighbors technique is a non-parametric technique that was already in use in statistical applications before the 1970s (Duda & Hart, 1973; Franco-Lopez et al., 2001). The basic idea behind kNN is that, within the calibration dataset, kNN finds the group of k samples closest to an unknown sample; the label of the unknown sample is then determined by averaging the response variables of these k samples (Akbulut et al., 2017; Wei et al., 2017). The value of k plays an important role in the performance of kNN because it is its key tuning parameter (Qian et al., 2015); it can be determined using a bootstrap procedure.

C. Linear Regression (LR)
Linear Regression models a linear relationship between the input data and the output data:
y = c0 + c1x1 + ... + cnxn
In the equation above there are n input variables, known as predictors or regressors, and one output variable y (the variable to be predicted). The constants c0, c1, ..., cn are called the regression coefficients and are computed by the least squares method. With more than one predictor this is known as Multiple Linear Regression (Margaret H. Dunham, 2005). Regression analysis is a statistical methodology that is mostly used for numeric prediction (Jiawei Han and Micheline Kamber, 2010).

D. Artificial Neural Network (ANN)
According to T. Subramaniam et al. (2010), the Artificial Neural Network (ANN) was first introduced by McCulloch and Pitts in 1943, and early ANNs were widely used for word classification. An ANN emulates the capabilities of the human mind, in which neurons (also called nerve cells) communicate with each other by sending messages. The ANN is a highly capable and efficient data-driven model that is widely used to capture complex non-linear behavior in data. ANNs are applied in handwriting recognition, pattern recognition, face identification, text translation, medical diagnosis, speech recognition, credit card fraud detection, stock market prediction, etc. This study uses the Neural Network widget of the Orange Data Mining Tool (Demsar, J. et al., 2013), a Multi-Layer Perceptron (MLP) trained with back-propagation that can learn non-linear as well as linear models. The MLP was chosen because it is well suited to stock market prediction.

E. Random Forest (RF)
According to Breiman (2001), Random Forest requires two parameters: (i) ntree, the number of trees, and (ii) mtry, the number of features considered at each split. Many studies have reported that suitable results can be obtained with the default parameters (Duro et al., 2012; Immitzer et al., 2012; Liaw et al., 2002; Zhang et al., 2017). According to Liaw et al. (2002), a larger number of trees gives more stable results. According to Breiman (2001), using more than the required number of trees may be unnecessary, but it is not harmful to the model. According to Feng et al. (2015), accurate results can be obtained with ntree = 200. Duro et al. (2012) state the default value mtry = √p, where p denotes the number of predictors.

F. Stochastic Gradient Descent (SGD)
According to L. Bottou (2010), for convex loss functions Stochastic Gradient Descent is a very efficient approach to learning linear classifiers such as the SVM and logistic regression. SGD has been used in the machine learning community for a long time and has recently become widely used in the context of large-scale learning. Beyond large-scale learning, SGD is also applied to sparse machine learning problems, mostly in text classification and natural language processing.

G. Support Vector Machine (SVM)
The Support Vector Machine algorithm works especially well on classification tasks. SVM is used mainly for its high generalization performance (Liu et al., 2017; Mohri et al., 2018). Its learning time, however, is long, although more optimized results can be obtained by reducing the number of learning steps. SVM is successful on large-volume datasets and is used in prediction, image classification, medical diagnosis, text analytics, outlier detection, etc.

H. Decision Tree (DT)
According to J.-J. Sheu et al. (2016), the Decision Tree is a data mining approach based on a tree-like structure. DT is a predictive modelling approach used in statistics, data mining and machine learning, mostly for classification and regression problems.

V. EXPERIMENTS AND RESULTS

In this section, we conduct experiments to evaluate the performance of the Machine Learning models for Nifty 50 Index prediction on four different datasets.
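As an illustration of the boosting idea described above, the following is a minimal sketch using scikit-learn's AdaBoostRegressor with shallow decision trees as the weak learners. The synthetic data and all parameter choices here are illustrative assumptions, not the study's configuration:

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Toy price series: predict the next value from the previous three.
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, 400)) + 100.0
X = np.column_stack([prices[i:i + 397] for i in range(3)])  # lagged features
y = prices[3:]

# AdaBoost trains weak learners (shallow trees) sequentially, giving more
# weight to the samples that earlier learners predicted poorly.
model = AdaBoostRegressor(
    DecisionTreeRegressor(max_depth=3),  # the weak learner
    n_estimators=50,
    random_state=0,
)
model.fit(X[:300], y[:300])
rmse = float(np.sqrt(np.mean((model.predict(X[300:]) - y[300:]) ** 2)))
print(f"test RMSE: {rmse:.3f}")
```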
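The ntree and mtry parameters cited above map directly onto scikit-learn's n_estimators and max_features. The sketch below, on synthetic data, uses ntree = 200 (Feng et al., 2015) and mtry = √p (Duro et al., 2012); it is an illustration, not the study's setup:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 9))                       # p = 9 predictors
y = X @ rng.normal(size=9) + rng.normal(0, 0.1, 500)

rf = RandomForestRegressor(
    n_estimators=200,      # ntree = 200
    max_features="sqrt",   # mtry = sqrt(p) features tried at each split
    random_state=1,
)
rf.fit(X[:400], y[:400])
r2 = rf.score(X[400:], y[400:])
print(f"R^2 on held-out data: {r2:.3f}")
```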
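A minimal sketch of SGD fitting a linear model, assuming scikit-learn's SGDRegressor on synthetic data (inputs are standardized, since stochastic updates are sensitive to feature scale):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 4))
y = X @ np.array([1.5, -2.0, 0.5, 3.0]) + rng.normal(0, 0.1, 1000)

# SGD updates the coefficients one sample (or mini-batch) at a time,
# which is what makes it attractive for large-scale learning.
sgd = make_pipeline(StandardScaler(), SGDRegressor(random_state=2))
sgd.fit(X[:800], y[:800])
r2_sgd = sgd.score(X[800:], y[800:])
print(f"R^2: {r2_sgd:.3f}")
```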
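Since k is the key tuning parameter, a common way to choose it is to score several candidate values on resampled data. The sketch below uses cross-validation as a simpler stand-in for the bootstrap procedure mentioned above; data and candidate values are illustrative:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * np.cos(X[:, 1]) + rng.normal(0, 0.05, 300)

# kNN predicts by averaging the responses of the k nearest samples;
# try several k values and keep the one with the best CV score.
scores = {}
for k in (1, 3, 5, 9, 15):
    knn = KNeighborsRegressor(n_neighbors=k)
    scores[k] = cross_val_score(knn, X, y, cv=5).mean()
best_k = max(scores, key=scores.get)
print("best k:", best_k)
```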
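A minimal SVM regression sketch, using the RBF kernel that the literature review reports as effective (Alaa F. et al., 2015). Data and hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(4)
X = rng.uniform(-2, 2, size=(400, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(0, 0.1, 400)

# Support Vector Regression with a Radial Basis Function kernel.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
svr.fit(X[:300], y[:300])
r2_svr = svr.score(X[300:], y[300:])
print(f"R^2: {r2_svr:.3f}")
```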
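The multiple linear regression equation y = c0 + c1x1 + ... + cnxn can be fitted by least squares directly with NumPy; prepending a column of ones lets the intercept c0 be estimated together with the other coefficients. The data below are synthetic, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 3))                 # n = 3 predictors x1, x2, x3
true_c = np.array([2.0, 1.5, -0.5, 3.0])      # c0, c1, c2, c3
y = true_c[0] + X @ true_c[1:] + rng.normal(0, 0.01, 200)

# Design matrix with a leading column of ones for the intercept c0.
A = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coeffs, 2))  # close to [2.0, 1.5, -0.5, 3.0]
```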
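A minimal regression-tree sketch: the tree recursively splits the feature space and predicts the mean response within each leaf. The step-shaped synthetic data make the splits easy to see; all choices are illustrative:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(6)
X = rng.uniform(0, 10, size=(300, 1))
y = np.where(X[:, 0] < 5, 1.0, 3.0) + rng.normal(0, 0.1, 300)

# A shallow tree is enough to recover the step at x = 5.
tree = DecisionTreeRegressor(max_depth=3, random_state=6)
tree.fit(X, y)
p_low = tree.predict([[2.0]])[0]   # near 1.0
p_high = tree.predict([[8.0]])[0]  # near 3.0
print(f"x=2 -> {p_low:.2f}, x=8 -> {p_high:.2f}")
```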
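As a rough analogue of the Orange Neural Network widget described above, the sketch below trains scikit-learn's MLPRegressor (a multi-layer perceptron with back-propagation) on a synthetic non-linear target. Layer sizes and data are illustrative assumptions, not the study's settings:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.tanh(X[:, 0]) * X[:, 1]  # a non-linear target

# Two hidden layers trained with back-propagation; scaling the inputs
# helps the optimizer converge.
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=7),
)
mlp.fit(X[:400], y[:400])
r2_mlp = mlp.score(X[400:], y[400:])
print(f"R^2: {r2_mlp:.3f}")
```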
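The overall experimental loop, training each of the eight models on a chronological split of lagged close prices and ranking them by test RMSE, can be sketched as follows. This is an illustrative reconstruction on a synthetic random-walk series; the study itself used the Nifty 50 data and the Orange toolbox:

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a daily close-price series.
rng = np.random.default_rng(42)
close = np.cumsum(rng.normal(0.05, 1.0, 1200)) + 1000.0

# Lag features: predict today's close from the previous five closes.
lags = 5
X = np.column_stack([close[i:len(close) - lags + i] for i in range(lags)])
y = close[lags:]

# Chronological split, as required for time series data.
split = int(len(X) * 0.8)
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

models = {
    "LR": LinearRegression(),
    "ANN": make_pipeline(StandardScaler(), MLPRegressor(max_iter=2000, random_state=0)),
    "SGD": make_pipeline(StandardScaler(), SGDRegressor(random_state=0)),
    "SVM": make_pipeline(StandardScaler(), SVR()),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "kNN": KNeighborsRegressor(),
    "DT": DecisionTreeRegressor(random_state=0),
}

# Fit every model and rank by test RMSE, lowest (most accurate) first.
rmse = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    err = model.predict(X_te) - y_te
    rmse[name] = float(np.sqrt(np.mean(err ** 2)))
for rank, (name, value) in enumerate(sorted(rmse.items(), key=lambda kv: kv[1]), 1):
    print(f"{rank}. {name}: RMSE {value:.2f}")
```

Tree-based models cannot extrapolate beyond the price range seen in training, which is one plausible mechanism for the constant-prediction behavior reported for AdaBoost, RF, kNN and DT later in this section.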
E. Discussion of Experiments and Results

Three tests were applied to each data set: the Cross Validation Test, the Test on Training Data and the Test on Testing Data. The results are shown in Tables 2 to 13 in section V(C): Tables 2, 3 and 4 show the testing results for Data Set-1; Tables 5, 6 and 7 for Data Set-2; Tables 8, 9 and 10 for Data Set-3; and Tables 11, 12 and 13 for Data Set-4.

The testing results also include the time taken to train the model and the time taken to test the model (in seconds), so that the execution times of the models can be compared.

The next measurements in the testing results are MSE, RMSE, MAE, R2 and CVRMSE, which stand for Mean Square Error, Root Mean Square Error, Mean Absolute Error, R-Square and Coefficient of Variation of the Root Mean Square Error respectively. Lower values of MSE, RMSE, MAE and CVRMSE imply higher accuracy of a model, whereas a higher value of R2 is considered desirable.

Overfitting is identified by checking validation metrics such as accuracy and loss. These validation metrics generally improve up to a certain point and then start declining as the model begins to overfit. As a result, an overfitted model fails to fit additional data, which affects the accuracy of the predictive model. If the errors on the testing or validation dataset are higher than the errors on the training dataset, the model is said to be overfitted. Underfitting is identified when the model neither performs well on the training data nor generalizes to the test data. When a model performs well on the training data as well as on the test data, the model is a good fit.

Before the obtained testing results are compared, each table is sorted on the RMSE variable in ascending order, so that models with lower RMSE values appear higher in the tables than models with higher RMSE values. After sorting, each model is ranked in increasing order, since a lower RMSE value implies higher accuracy.

In Data Set-1, from Tables 2 and 4, it is observed that the LR and ANN results are almost similar, but ANN takes more time in training and testing, so rank 1 is given to LR and rank 2 to ANN. Thereafter SVM and SGD performed well respectively among the rest of the models, but SVM takes more time in training and testing than SGD. DT performed worst among all the models. Table 3 shows that AdaBoost, RF, DT and kNN performed well respectively in training, even better than LR, ANN, SVM and SGD, but Table 4 shows that these models did not perform as well in the Validation Test.

In Data Set-2, from Tables 5 and 7, it is observed that once again the LR and ANN results are almost similar and ranked 1st and 2nd respectively. SVM and SGD were ranked 3rd and 4th respectively. Table 5 shows that DT performed worst in the Cross Validation Test, and Table 7 shows that kNN performed worst in the Validation Test. Table 6 shows that AdaBoost, RF, kNN and DT performed well respectively in training, even better than LR, ANN, SVM and SGD, but Table 7 shows that these models did not perform as well in the Validation Test.

In Data Set-3, from Tables 8 and 10, it is observed that this time also the LR and ANN results are almost similar and ranked 1st and 2nd respectively. This time SGD performed better than SVM, and SVM stood last in the Cross Validation Test. Table 9 shows that AdaBoost, RF, kNN and DT performed well respectively in training, even better than LR, ANN, SGD and SVM, but Table 10 shows that these models did not perform as well in the Validation Test and obtained negative values of R2. A negative value of R2 on the test data, as shown in Table 10, indicates that the AdaBoost, RF, kNN and DT models are overfitted.

In Data Set-4, from Tables 11 and 13, it is observed that the LR and ANN results are almost similar and obtained ranks 1 and 2 respectively. Thereafter SGD performed well compared to SVM. Table 12 shows that AdaBoost, RF, kNN and DT performed well respectively in training, even better than LR, ANN, SGD and SVM, but Table 13 shows that these models did not perform as well in the Validation Test and obtained negative values of R2, which again indicates that these models are overfitted.

The above comparison among all the models can also be understood from Figs. 12 to 19.

Fig. 12 shows the actual and predicted close price of the NIFTY 50 Index using Linear Regression (LR) on the testing data of Data Set-4. The lines of the actual and predicted close price almost overlap each other, which means the prediction model is a good fit for LR.

Fig. 13 shows the actual and predicted close price of the NIFTY 50 Index using the Artificial Neural Network (ANN) on the testing data of Data Set-4. The lines of the actual and predicted close price almost overlap each other, which means the prediction model is a good fit for ANN.

Fig. 14 shows the actual and predicted close price of the NIFTY 50 Index using Stochastic Gradient Descent (SGD) on the testing data of Data Set-4. The lines of the actual and predicted close price almost overlap each other, which means the prediction model is a good fit for SGD.

Fig. 15 shows the actual and predicted close price of the NIFTY 50 Index using the Support Vector Machine (SVM) on the testing data of Data Set-4. The lines of the actual and predicted close price almost overlap each other, which means the prediction model is a good fit for SVM. Comparing the Validation Test results of LR, ANN, SGD and SVM, however, LR and ANN performed best among all.
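The metric definitions used in this comparison translate directly into code. The following is a minimal sketch (function and variable names are illustrative, not from the paper; note that CVRMSE is sometimes reported as a percentage):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return the error metrics used to compare the models."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)         # Mean Square Error
    rmse = np.sqrt(mse)                           # Root Mean Square Error
    mae = np.mean(np.abs(y_true - y_pred))        # Mean Absolute Error
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                    # R-Square (negative when worse than the mean)
    cvrmse = rmse / y_true.mean()                 # Coefficient of Variation of the RMSE
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2, "CVRMSE": cvrmse}

m = evaluate([100, 102, 104, 106], [101, 101, 105, 107])
print(m["RMSE"])  # 1.0, since every error is +/-1
```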
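The fitting criteria above (test error far above training error, or negative R2 on test data, versus poor performance even on training data) can be expressed as a simple check. This helper is an illustrative assumption, not part of the paper; the baseline and ratio thresholds are arbitrary:

```python
def fit_verdict(train_rmse, test_rmse, test_r2, baseline_rmse, ratio=1.5):
    """Classify fit quality from train/test errors.

    baseline_rmse is the error of a naive reference model; a model that
    cannot beat it even on its own training data is underfitting.
    """
    if test_r2 < 0 or test_rmse > ratio * train_rmse:
        return "overfit"      # fails to generalize to unseen data
    if train_rmse >= baseline_rmse:
        return "underfit"     # poor even on the data it was trained on
    return "good fit"

print(fit_verdict(train_rmse=1.0, test_rmse=1.1, test_r2=0.95, baseline_rmse=5.0))  # good fit
print(fit_verdict(train_rmse=0.1, test_rmse=4.0, test_r2=-0.2, baseline_rmse=5.0))  # overfit
```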
Thereafter, SVM performed better than SGD in Data Set-1 and Data Set-2, but SGD performed better than SVM in Data Set-3 and Data Set-4. This indicates that SVM performance decreases as the size of the data set increases.

Fig. 16 shows the actual and predicted close price of the NIFTY 50 Index using Adaptive Boosting (AdaBoost) on the testing data of Data Set-4. The line of the predicted close price overlaps the line of the actual close price only up to a certain point (231 days out of 1244 days); thereafter it remains constant, because the predicted close price in the output was also constant. This means the prediction model is overfitted for AdaBoost.

Fig. 17 shows the same comparison for Random Forest (RF): the predicted close price overlaps the actual close price up to 231 days out of 1244 days and then remains constant, so the prediction model is overfitted for RF as well.

Fig. 18 shows the same comparison for k-Nearest Neighbors (kNN): the predicted close price overlaps the actual close price up to 231 days out of 1244 days and then remains constant, so the prediction model is overfitted for kNN.

Fig. 19 shows the same comparison for the Decision Tree (DT): the predicted close price overlaps the actual close price up to 218 days out of 1244 days and then remains constant, so the prediction model is overfitted for DT.

VI. CONCLUSION

Prediction of the movements of a stock market index is very important for developing effective market trading strategies. By choosing an effective predictive model, traders can make the financial decision to buy or sell an instrument, and successful prediction of index movements may be beneficial for investors. The task of predicting the movements of a stock market index is, however, highly complicated and very difficult. This empirical study attempted to predict the direction of Nifty 50 Index movement in the Indian Stock Market. Eight prediction models were constructed and their performances were compared on historical data from April 22, 1996 to April 16, 2021. Based on the experimental results obtained, some important conclusions can be drawn. Linear Regression and the Artificial Neural Network showed almost equal performance on all the data segments of different size. The reason Linear Regression performed as well as the Artificial Neural Network is that the regression method deals better with linear dependencies, whereas neural networks deal better with non-linear dependencies; if the data contain non-linear dependencies, neural networks should perform better than regression. After LR and ANN, the Support Vector Machine performed well, but with increasing data size Stochastic Gradient Descent performed better than SVM. The ensemble learning methods built on Decision Trees, AdaBoost and Random Forest, performed better than kNN and the Decision Tree as the size of the data increased.

In this empirical study eight Supervised Machine Learning models were used; in a future empirical study, more ensemble methods for Supervised Machine Learning models can be considered. This study used around 25 years of historical data, which is good for machine learning because such a long period includes many bull and bear phases of the stock market.

REFERENCES
1. Abu-Mostafa, Y. S., & Atiya, A. F. (1996). Introduction to financial forecasting. Applied Intelligence, 6(3), 205–213.
2. Akbulut, Y., Sengur, A., Guo, Y., & Smarandache, F. (2017). NS-k-NN: Neutrosophic Set-Based k-Nearest Neighbors classifier. Symmetry, 9, 179. doi: 10.3390/sym9090179.
3. Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. doi: 10.1023/A:1010933404324.
4. Dash, R., & Dash, P. K. (2016). A hybrid stock trading framework integrating technical analysis with machine learning techniques. Journal of Finance and Data Science, 2(1), 42–57. https://doi.org/10.1016/j.jfds.2016.03.002
5. Demsar, J. et al. (2013). Orange: data mining toolbox in Python. Journal of Machine Learning Research, 14, 2349–2353.
6. Duda, R., & Hart, P. (1973). Pattern Classification and Scene Analysis. John Wiley & Sons, New York, NY, USA.
7. Duro, D. C., Franklin, S. E., & Dubé, M. G. (2012). A comparison of pixel-based and object-based image analysis with selected machine learning algorithms for the classification of agricultural landscapes using SPOT-5 HRG imagery. Remote Sensing of Environment, 118, 259–272. doi: 10.1016/j.rse.2011.11.020.
8. F., A., Elsir, S., & Faris, H. (2015). A Comparison between Regression, Artificial Neural Networks and Support Vector Machines for Predicting Stock Market Index. International Journal of Advanced Research in Artificial Intelligence, 4(7), 55–63. https://doi.org/10.14569/ijarai.2015.040710
9. Feng, Q., Liu, J., & Gong, J. (2015). UAV remote sensing for urban vegetation mapping using random forest and texture analysis. Remote Sensing, 7, 1074–1094. doi: 10.3390/rs70101074.
10. Franco-Lopez, H., Ek, A. R., & Bauer, M. E. (2001). Estimation and mapping of forest stand density, volume and cover type using the k-Nearest Neighbors method. Remote Sensing of Environment, 77, 251–274. doi: 10.1016/S0034-4257(01)00209-7.
11. Henrique, B. M., Sobreiro, V. A., & Kimura, H. (2018). Stock price prediction using support vector regression on daily and up to the minute prices. Journal of Finance and Data Science, 4(3), 183–201. https://doi.org/10.1016/j.jfds.2018.04.003
12. Idrees, S. M., Alam, M. A., & Agarwal, P. (2019). A Prediction Approach for Stock Market Volatility Based on Time Series Data. IEEE Access, 7, 17287–17298. https://doi.org/10.1109/ACCESS.2019.2895252
13. Immitzer, M., Atzberger, C., & Koukal, T. (2012). Tree species classification with random forest using very high spatial resolution 8-band WorldView-2 satellite data. Remote Sensing, 4, 2661–2693. doi: 10.3390/rs4092661.
14. Sheu, J.-J., Chen, Y.-K., Chu, K.-T., Tang, J.-H., & Yang, W.-P. (2016). An intelligent three-phase spam filtering method based on decision tree data mining. Security and Communication Networks, 9(17), 4013–4026.
15. Jiawei Han and Micheline Kamber (2010). Data Mining: Concepts and Techniques (2nd ed.). Morgan Kaufmann, an imprint of Elsevier, San Francisco.
16. Kara, Y., Acar Boyacioglu, M., & Baykan, Ö. K. (2011). Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert Systems with Applications, 38(5), 5311–5319. https://doi.org/10.1016/j.eswa.2010.10.027
17. Kumar, I., Dogra, K., Utreja, C., & Yadav, P. (2018). A Comparative Study of Supervised Machine Learning Algorithms for Stock Market Trend Prediction. Proceedings of the International Conference on Inventive Communication and Computational Technologies, ICICCT 2018, 1003–1007. https://doi.org/10.1109/ICICCT.2018.8473214
18. Bottou, L. (2010). Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010 (pp. 177–186). Springer.
19. Liaw A., Wiener M. Classification and regression by randomForest. R
News. 2002;2:18–22.
20. Liu Y, Bi JW, Fan ZP. Multi-class sentiment classification: The
experimental comparisons of feature selection and machine learning
algorithms. Expert Syst Appl 2017.
https://doi.org/10.1016/j.eswa.2017.03.042.
21. Madge, S., & Bhatt, S. (2015). Predicting Stock Price Direction using
Support Vector Machines.
22. Margaret H. Dunham (2005), “Data Mining, Introductory and
Advanced Topics”, Pearson Education(Singapore).
23. Mohri M, Rostamizadeh A, Talwalkar A. Foundations of Machine
Learning. MIT Press; 2018.
24. Nagar, P., & Issar, G. S. (2013). DETECTION OF OUTLIERS IN
STOCK MARKET USING REGRESSION ANALYSIS,
International Journal of Emerging Technologies in Computational and
Applied Sciences, 3(2), 176–181.
25. Nagar, P., & Singh, G. (2012). An Analysis of Outliers for Fraud
Detection in Indian Stock Market, Researchers World : Journal of
Arts, Science and Commerce, Vol.-III, Issue 4(4), 10-15.
https://www.researchgate.net/publication/351415274_An_Analysis_o
f_Outliers_for_Fraud_Detection_in_Indian_Stock_Market
26. Nayak, A., Pai, M. M. M., & Pai, R. M. (2016). Prediction Models for
Indian Stock Market. Procedia Computer Science, 89, 441–449.
https://doi.org/10.1016/j.procs.2016.06.096
27. Parray, I. R., Khurana, S. S., Kumar, M., & Altalbe, A. A. (2020).
Time series data analysis of stock price movement using machine
learning techniques. Soft Computing, 24(21), 16509–16517.
https://doi.org/10.1007/s00500-020-04957-x
28. Picasso, A., Merello, S., Ma, Y., Oneto, L., & Cambria, E. (2019).
Technical analysis and sentiment embeddings for market trend
prediction. Expert Systems with Applications, 135, 60–70.
https://doi.org/10.1016/j.eswa.2019.06.014
29. Pyo S, Lee J, Cha M, Jang H (2017) Predictability of machine
learning techniques to forecast the trends of market index prices:
Hypothesis testing for the Korean stock markets. PLoS ONE 12(11):
e0188107. https://doi.org/10.1371/journal.pone.0188107
30. Qian Y., Zhou W., Yan J., Li W., Han L. Comparing machine
learning classifiers for object-based land cover classification using
very high resolution imagery. Remote Sens. 2015;7:153–168. doi:
10.3390/rs70100153.
31. R.E.Schapire, “The boosting approach to machine learning: An
overview,” in Nonlinear estimation and classification. Springer, 2003,
pp. 149-171.
32. Singh, G., & Nagar, P. (2012). A Case Study on Nutek India Limited,
regarding Deep Fall in Share Price, Researchers World - Journal of
Arts, Science and Commerce, Vol.– III, Issue 2(3), 64–68.
33. T.Subramaniam, H.A.Jalab, and A.Y.Taqa, “Overview of textual
antispam filtering techniques,” International Journal of Physical
Sciences, vol.5, no. 12, pp. 1869-1882, 2010.
34. Tan, T. Z., Quek, C., & See, Ng. G. (2007). Biological brain-inspired
genetic complementary learning for stock market and bank failure
prediction. Computational Intelligence, 23(2), 236–261.
35. Wei C., Huang J., Mansaray L.R., Li Z., Liu W., Han J. Estimation
and mapping of winter oilseed rape LAI from high spatial resolution
satellite data based on a hybrid method. Remote Sens. 2017;9:488.
doi: 10.3390/rs9050488.
36. Zhang H.K., Roy D.P. Using the 500 m MODIS land cover product to
derive a consistent continental scale 30 m Landsat land cover
classification. Remote Sens. Environ. 2017;197:15–34. doi:
10.1016/j.rse.2017.05.024.
Web References
37. https://download.biolab.si/download/files/
38. https://www.niftyindices.com/reports/historical-data
39. https://www.nseindia.com/products-services/indices-nifty50-index

AUTHOR PROFILE
Dr. Gurjeet Singh is working as Associate Professor & Dean in the Lords School of Computer Applications & IT, Lords University, Alwar, Rajasthan. He has 22 years of rich teaching experience in the field of Computer Applications. He has been awarded a Ph.D. degree in Computer Science; his area of specialization is Data Mining. His areas of research interest are Outlier Detection, Data Management, Knowledge Discovery and Data Mining, Artificial Intelligence, Machine Learning and Deep Learning. He is also a reviewer for reputed journals such as IEEE Access. Email: research.gurjeet@gmail.com, gurjeet.singh@lordsuni.edu.in