Abstract
User reviews on social networks are attracting rapidly growing interest as a target for sentiment analysis, because they serve as feedback to governments and to public and private companies. Text mining has a wide variety of applications, such as sentiment analysis, spam detection, sarcasm detection, and news classification. Classifying reviews by user sentiment is an important and collaborative task for many organizations. In recent years, text classification has mostly been studied with machine learning models and hand-crafted features, which do not give promising results on short-text classification. In this research, a deep neural network model based on Long Short-Term Memory (LSTM) with word embedding features is proposed. The study classifies the sentiment of reviews written by hotel guests. The proposed model is evaluated on a large dataset of hotel reviews in terms of accuracy, precision, recall, and F1-score. The results show that combining word embeddings with LSTM outperforms the existing state-of-the-art models, achieving an accuracy of 97%, a precision of 83%, a recall of 71%, and an F1-score of 76.53%. These promising results indicate the effectiveness of the proposed model for review classification tasks in general.
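The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard binary definitions computed from confusion-matrix counts; the function names and the example counts are hypothetical and are not taken from the paper. The final lines check that the reported precision (83%) and recall (71%) are consistent with the reported F1-score, since F1 is the harmonic mean of the two.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Zero denominators yield 0.0 instead of raising.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Consistency check on the figures reported in the abstract.
reported_f1 = f1_score(0.83, 0.71)
print(f"{reported_f1:.4f}")  # 0.7653, i.e. the reported 76.53%

# Hypothetical counts, for illustration only (not data from the paper).
example = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

The gap between the 97% accuracy and the lower precision and recall is the kind of pattern that can arise when one class dominates the dataset, which is why the sketch reports all four measures side by side rather than accuracy alone.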
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2019R1A2C1006159); by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2016-0-00313) supervised by the IITP (Institute for Information & communications Technology Promotion); by the Brain Korea 21 Plus Program (No. 22A20130012814), funded by the National Research Foundation of Korea (NRF); and in part by the Fareed Computing Research Center, Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology (KFUEIT), Rahim Yar Khan, Punjab, Pakistan.
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest. The funding agency had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Cite this article
Ishaq, A., Umer, M., Mushtaq, M.F. et al. Extensive hotel reviews classification using long short term memory. J Ambient Intell Human Comput 12, 9375–9385 (2021). https://doi.org/10.1007/s12652-020-02654-z
DOI: https://doi.org/10.1007/s12652-020-02654-z