Stock Market Prediction Using Machine Learning
Qingyi Chen
Abstract. Stock market analysis and prediction have always been challenging
problems for finance experts because the market is volatile and susceptible to
external factors that deeply affect investor sentiment. Machine learning, which
produces forecasts of stock market indices by training on their prior values, is
a recent trend in stock market prediction and shows great promise. However, the
prediction methods and algorithms are still developing, and their results remain
volatile and unstable. Fast and accurate predictions can greatly affect financial
markets, so it is necessary to develop the models further and expand the scope of
machine learning approaches. This paper compares the use and effectiveness of
several stock market prediction models, including Support Vector Machine (SVM),
Convolutional Neural Network (CNN), Regression-based model and Long Short-Term
Memory (LSTM), by qualitatively summarizing the results obtained from several
existing sources and experiments. According to the results reviewed in this
paper, SVM and the combination of CNN and LSTM performed well in making accurate
stock market predictions.
1 Introduction
The financial market is one of the greatest inventions of all time and plays an essential
role in the economic system. Stock market prediction is the process of forecasting the
prospective values of a company's stock. The stock market is a key pivot of a prosperous
and growing economy in every country. Accurate forecasts are of great interest to
investors and financial experts because they have strong implications for trading
strategies and help generate significant profits for both the seller and the broker, who
can predict market behavior and decide correctly whether to sell or hold the stocks they
possess. Yet accurate prediction is challenging because of the dynamic and chaotic
nature of the stock market [1]. Financial market forecasting has been explored
extensively in the past, and most stockbrokers tend to use fundamental and technical
analysis to forecast stock prices, but these traditional methods cannot be trusted fully
because of the nature of stock market data. Many factors need to be taken into
consideration when forecasting stock prices, and things like politics and the economic
environment can affect the actions and sentiments of investors, hence leading to stock
market movements [9].
In the last few decades, many innovative machine learning approaches have been
trained and tested for forecasting stock prices. They have proven much more efficient
because predictions can be made by carefully analyzing historical data, a task that
machine learning methods perform well [1]. Although many studies have focused on
building and testing the effectiveness of machine learning models, the purpose of this
article is to summarize some of the machine learning prediction results obtained. It can
help people form an overall understanding of various machine learning methods and
provide a constructive reference for the potential improvement of machine learning
models in stock market forecasting.
as using a machine learning algorithm that is computationally efficient and produces the
error gradient in a stable way [10]. However, the study result seems somewhat shallow
because the model is not structured and trained in a sophisticated way, only one
confidence score was given for reference, and the number of simulations was not
specified. Drawing a solid conclusion needs more evidence and simulations. In general,
this is an understandable method that gives quite a high confidence score. It can also be
seen that statistical methods remain strong supporting pillars for machine learning
models because of their performance.
To conclude, among all the models, the linear regression model is widely used in various
fields due to its simplicity and robustness [3]. However, its estimation accuracy varies
with the variables considered, the association among variables and the random error
component.
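The dependence of regression accuracy on the chosen variables can be illustrated with a
minimal sketch. It fits a one-variable least-squares line that predicts the next closing
price from the previous one and reports the coefficient of determination (R^2). Treating
the "confidence score" as R^2 is an assumption made here for illustration, and the price
series is invented, not taken from any of the reviewed papers.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical closing prices, invented for illustration only.
closes = [100.0, 101.5, 101.0, 103.0, 104.5, 104.0, 106.0, 107.5]
prev_close, next_close = closes[:-1], closes[1:]

slope, intercept = fit_line(prev_close, next_close)
score = r_squared(prev_close, next_close, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={score:.3f}")
```

Swapping the single feature for a different one (for example the trading-day index)
changes the score, which is the sensitivity to variable choice described above.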
Comparing the confidence score of the regression model with the resulting MSE score of
the LSTM model, LSTM proves to be more accurate at stock market prediction. It is worth
mentioning that larger datasets and more training trials do influence the accuracy of the
prediction results. The use of stacking makes the model deeper and therefore more
accurate in making complicated predictions such as those on the stock market. Another
thing that Parmar did well is setting a dropout value, which increased the speed and
combats the overfitting issue. However, the model was not trained thoroughly and
systematically, and the dataset was not large and comprehensive enough in this case.
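To make the role of the dropout value concrete, the following is a minimal sketch of
inverted dropout, the variant commonly used in neural network libraries. It is not
Parmar's implementation: the rate of 0.2, the function name and the all-ones activations
are hypothetical choices for illustration.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout.

    During training each unit is zeroed with probability `rate`, and the
    surviving units are scaled by 1 / (1 - rate) so that the expected
    activation is unchanged. At inference time the input passes through.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)      # fixed seed so the sketch is repeatable
hidden = [1.0] * 10000      # hypothetical layer output
dropped = dropout(hidden, rate=0.2, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
mean_activation = sum(dropped) / len(dropped)
print(f"zeroed: {zero_fraction:.3f}, mean activation: {mean_activation:.3f}")
```

Because a different random subset of units is silenced on every training step, the
network cannot rely on any single unit, which is the usual explanation for why dropout
reduces overfitting.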
2.2 LSTM, Convolutional Neural Network (CNN) and Support Vector Regression
(SVR) on Time Dependent Data
2.2.5 Results
As Table 1 shows, Chen summarizes the performance of LSTM, CNN and SVR in predicting
future market prices on the four stock datasets using the mean absolute percentage error
(MAPE). SVR is the most accurate of the three in this case.

Table 1. MAPE of each model's predictions on the four stock datasets

Model   Trained on    MAPE of the model's predictions on
        data from     AAPL     MAST     FORD     EXON
LSTM    AAPL          13.75    6.67     8.68     9.71
LSTM    MAST          -        19.64    -        -
LSTM    FORD          -        -        2.66     -
LSTM    EXON          -        -        -        1.34
CNN     AAPL          2.18     4.78     6.77     7.93
CNN     MAST          -        4.17     -        -
CNN     FORD          -        -        0.55     -
CNN     EXON          -        -        -        0.77
SVR     AAPL          0.67     0.11     0.51     0.21
SVR     MAST          -        0.86     -        -
SVR     FORD          -        -        0.097    -
SVR     EXON          -        -        -        0.085
(Data source: http://www.xpublication.com/index.php/jmo/article/view/411)

It is also interesting that the combination of LSTM and CNN is more efficient than LSTM
alone. Forecasting other stock prices with a model trained on only one dataset is an
effective method that Chen used. It shows that stock market data have striking
similarities, and applying this idea to stocks that have positive relationships could be
both efficient and accurate. However, as the data span almost 20 years, many external
circumstances may have changed and influenced the current data. Therefore, it is
recommended that Chen examine the data of different time intervals separately and analyze
further how the predictions vary across intervals and the reasons behind that variation.
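The metric and the suggested interval-by-interval analysis can be sketched as follows.
This is an illustrative helper, not Chen's code: the function names, the window length
and the price series are hypothetical, and expressing MAPE in percent is an assumption
about the scale used in Table 1.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and no actual value is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mape_by_interval(actual, predicted, interval):
    """MAPE computed separately over consecutive windows of `interval` points."""
    return [
        mape(actual[i:i + interval], predicted[i:i + interval])
        for i in range(0, len(actual), interval)
    ]


# Hypothetical prices and forecasts, invented for illustration only.
actual = [100.0, 102.0, 104.0, 110.0, 120.0, 125.0]
predicted = [101.0, 101.0, 105.0, 121.0, 108.0, 150.0]

overall = mape(actual, predicted)
per_interval = mape_by_interval(actual, predicted, interval=3)
print(f"overall MAPE: {overall:.2f}")
print("per-interval MAPE:", [round(m, 2) for m in per_interval])
```

In this toy series the second window is far less accurate than the first, a difference
that a single overall MAPE would hide. The same helper can score a model trained on one
stock against the prices of another by passing that stock's series as `actual`.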
3 Discussion
Based on the holistic overview and the detailed analysis of the results of the two papers,
the LSTM model proves more accurate than the Regression-based model in the paper written
by Parmar when predicting the stock market price for one specific company throughout the
year. Choosing the correct datasets for building different models and a computationally
efficient algorithm is what the author did well, and both are important in the stock
market prediction process. Though LSTM performs better than the Regression-based model in
this case, it is undeniable that statistics-based methods are the building blocks of many
machine learning models and are widely applied in other fields as well.
The second paper compares and analyzes the performance of SVR, LSTM and CNN in stock
market prediction on four main datasets from Apple, Mastercard, Ford and ExxonMobil. It
finds that SVM makes better predictions when trained on larger datasets and that the
combination of LSTM and CNN predicts better than the LSTM model alone. It also finds that
a model trained on one stock can be applied to predict stocks of similar types that have
a positive relationship with the training dataset.
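One simple way to check whether two stocks have a positive relationship before reusing a
model across them is to correlate their daily returns. The sketch below reads "positive
relationship" as positive Pearson correlation, which is an assumption and not a criterion
stated in the reviewed paper. The two price series are invented.

```python
from math import sqrt


def daily_returns(prices):
    """Simple returns between consecutive prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical closing prices of two stocks, invented for illustration only.
stock_a = [50.0, 51.0, 50.5, 52.0, 53.0, 52.5]
stock_b = [20.0, 20.5, 20.2, 20.9, 21.4, 21.1]

corr = pearson(daily_returns(stock_a), daily_returns(stock_b))
print(f"correlation of daily returns: {corr:.3f}")
```

Returns are correlated rather than raw prices because two trending price series can look
related even when their day-to-day movements are not.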
A number of things were not considered or not done properly in these two papers:
1. datasets that were too large or too small; 2. lack of further analysis of the datasets
based on the time span; 3. a limited number of simulations; 4. overfitting and efficiency
problems; 5. the influence of external factors like politics, social media posts and
financial news.
Hence, it is recommended that people: 1. choose proper datasets of different time
spans based on the models they use; 2. try to use larger datasets and more simulations;
3. train the models in a more complete and deeper way; 4. take care of the overfitting
and efficiency problem in the training process; 5. assess the role of external factors in
stock market prediction and machine learning algorithms.
These recommendations could be a strong boost in making the whole stock market
prediction process more complete and accurate.
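Recommendations 1, 2 and 4 can be combined in a walk-forward evaluation, in which a model
is repeatedly trained on an expanding window of past data and tested on the period that
follows. The sketch below only generates the index ranges. The function name and
parameters are hypothetical, and neither reviewed paper is known to have used this scheme.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Training data always precedes the test data, so no future information
    leaks into training. The training window grows by one fold each step.
    """
    fold = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold
        yield range(0, train_end), range(train_end, train_end + fold)


# Example: 100 daily observations, 4 test periods, at least 20 days of training.
splits = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=20))
for train, test in splits:
    print(f"train on days 0-{train[-1]}, test on days {test[0]}-{test[-1]}")
```

A large gap between the training error and the test error on any fold is a sign of
overfitting, and comparing test errors across folds shows how accuracy varies with the
time span.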
4 Conclusion
Financial markets provide an essential platform for financial transactions and
investments, and they give people the opportunity to make their investments grow.
Therefore, stock market prediction is essential in helping investors make correct and
profitable decisions. This paper attempts to give a systematic review of the modern
machine learning methods commonly used in stock market prediction by analyzing two
academic papers that build, test and analyze LSTM, Regression-based, CNN and SVM models.
LSTM performed better than the Regression-based model when trained on a relatively small
dataset, SVM performed the best among SVM, CNN and LSTM in predicting future prices based
on large stock datasets, and the combination of CNN and LSTM also produced better
results. It is recommended that people choose suitable datasets with different time spans
and investigate the characteristics of the data before building a model. At the same
time, a computationally efficient method should be used when training the model, and it
is recommended to apply the model to both current data and training experiments, so as to
avoid the over-fitting problem in the market forecasting process and to improve process
efficiency and forecasting accuracy.
Examining the role of external factors in stock market prediction can be an important
aspect of future study, and machine learning definitely plays an important role in it.
References
1. Parmar et al., “Stock Market Prediction Using Machine Learning,” 2018 First International
Conference on Secure Cyber Computing and Communication (ICSCCC), 2018, pp. 574–576,
doi: https://doi.org/10.1109/ICSCCC.2018.8703332.
2. H. L. Siew and M. J. Nordin, “Regression techniques for the prediction of stock price
trend”, 2012 International Conference on Statistics in Science Business and Engineering
(ICSSBE), pp. 1–5, 2012.
3. Klein, M.D.; Datta, G.S. Statistical disclosure control via sufficiency under the multiple linear
regression model. J. Stat. Theory Pract. 2018, 12, 100–110.
4. S. O. Ojo, P. A. Owolawi, M. Mphahlele and J. A. Adisa, “Stock Market Behaviour Prediction
using Stacked LSTM Networks,” 2019 International Multidisciplinary Information
Technology and Engineering Conference (IMITEC), 2019, pp. 1–5,
doi: https://doi.org/10.1109/IMITEC45504.2019.9015840.
5. D. Wei, “Prediction of Stock Price Based on LSTM Neural Network,” 2019 International
Conference on Artificial Intelligence and Advanced Manufacturing (AIAM), 2019,
pp. 544–547, doi: https://doi.org/10.1109/AIAM48774.2019.00113.
6. Shen, W.; Zhang, Y.; Ma, X. Stock return forecast with LS-SVM and particle swarm
optimization. In Proceedings of the International Conference on Business Intelligence and
Financial Engineering (BIFE’09), Beijing, China, 24–26 July 2009; IEEE: Piscataway, NJ,
USA, 2009.
7. S. Madge, Predicting Stock Price Direction using Support Vector Machines, Independent
Work Report Spring, 2015.
8. S. Liu, G. Liao and Y. Ding, “Stock transaction prediction modelling and analysis based
on LSTM”, 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA),
pp. 2787–2790, 2018.
9. Chen, L. (2020, December 15). Using Machine Learning Algorithms on Prediction of Stock
Price | Journal of Modeling and Optimization. Xpublication.
http://www.xpublication.com/index.php/jmo/article/view/411
10. Donges, N. (2021, August 1). Gradient Descent: An Introduction to 1 of Machine Learning’s
Most Popular Algorithms. Built In. https://builtin.com/data-science/gradient-descent
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-
NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/),
which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any
medium or format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative
Commons license, unless indicated otherwise in a credit line to the material. If material is not
included in the chapter’s Creative Commons license and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder.