Extensive hotel reviews classification using long short term memory

  • Original Research
  • Journal of Ambient Intelligence and Humanized Computing

Abstract

User reviews on social networks are attracting rapidly growing interest as input for sentiment analysis, where they serve as feedback to governments and to public and private companies. Text mining has a wide variety of applications, such as sentiment analysis, spam detection, sarcasm detection, and news classification. Classifying reviews by user sentiment is an important and collaborative task for many organizations. In recent years, text classification has mostly been studied with machine learning models and hand-crafted features, which do not give promising results on short-text classification. In this research, a deep neural network model based on Long Short-Term Memory (LSTM) with word embedding features is proposed. The study classifies the sentiment of reviews written by hotel guests, and the proposed model is evaluated on a large dataset of hotel reviews in terms of accuracy, precision, recall, and F1-score. The results reveal that the proposed model, which combines word embeddings with LSTM, performs better than existing state-of-the-art models, with an accuracy of 97%, a precision of 83%, a recall of 71%, and an F1-score of 76.53%. These promising results reveal the effectiveness of the proposed model on any type of review classification task.
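As a rough illustration of the pipeline the abstract describes (word embeddings fed to an LSTM, then scored with precision, recall, and F1), the following is a minimal NumPy sketch. It is not the authors' implementation: the vocabulary size, embedding and hidden dimensions, the random untrained weights, and the function names `init_params`, `lstm_classify`, `precision_recall_f1`, and `f1_score` are all assumptions made for this example. The abstract does not state the architecture or hyperparameters.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(vocab_size, embed_dim, hidden_dim, seed=0):
    """Random, untrained weights (hypothetical sizes; for illustration only)."""
    rng = np.random.default_rng(seed)
    E = rng.normal(0.0, 0.1, (vocab_size, embed_dim))      # embedding table
    W = rng.normal(0.0, 0.1, (embed_dim, 4 * hidden_dim))  # input-to-gates
    U = rng.normal(0.0, 0.1, (hidden_dim, 4 * hidden_dim)) # hidden-to-gates
    b = np.zeros(4 * hidden_dim)                           # gate biases
    w_out = rng.normal(0.0, 0.1, hidden_dim)               # output weights
    b_out = 0.0                                            # output bias
    return E, W, U, b, w_out, b_out


def lstm_classify(token_ids, params):
    """Embedding lookup -> one LSTM layer -> sigmoid; returns P(positive)."""
    E, W, U, b, w_out, b_out = params
    hidden_dim = U.shape[0]
    h = np.zeros(hidden_dim)  # hidden state
    c = np.zeros(hidden_dim)  # cell state
    for t in token_ids:
        x = E[t]                          # word embedding of the current token
        z = x @ W + h @ U + b             # pre-activations for all four gates
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output
        g = np.tanh(g)                    # candidate cell update
        c = f * c + i * g                 # new cell state
        h = o * np.tanh(c)                # new hidden state
    return float(sigmoid(h @ w_out + b_out))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 from 0/1 label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    params = init_params(vocab_size=50, embed_dim=8, hidden_dim=16)
    # Token ids stand in for a tokenized review; the values are arbitrary.
    print(lstm_classify([3, 7, 1, 42], params))
    # F1 implied by the precision and recall reported in the abstract.
    print(round(100 * f1_score(0.83, 0.71), 2))
```

The last line checks the reported figures against each other: with precision 0.83 and recall 0.71, the harmonic mean is 2 × 0.83 × 0.71 / (0.83 + 0.71) ≈ 0.7653, which matches the F1-score of 76.53% stated in the abstract. A practical model would learn the weights by backpropagation, typically through a deep learning framework, rather than use random values as here.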

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019R1A2C1006159); by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2016-0-00313) supervised by the IITP (Institute for Information & communications Technology Promotion); by the Brain Korea 21 Plus Program (No. 22A20130012814) funded by the National Research Foundation of Korea (NRF); and in part by the Fareed Computing Research Center, Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology (KFUEIT), Punjab, Rahim Yar Khan, Pakistan.

Author information

Corresponding authors

Correspondence to Muhammad Umer, Muhammad Faheem Mushtaq, Arif Mehmood or Gyu Sang Choi.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest. The funding agency had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ishaq, A., Umer, M., Mushtaq, M.F. et al. Extensive hotel reviews classification using long short term memory. J Ambient Intell Human Comput 12, 9375–9385 (2021). https://doi.org/10.1007/s12652-020-02654-z

  • DOI: https://doi.org/10.1007/s12652-020-02654-z
