research-article

Detection of spam reviews through a hierarchical attention architecture with N-gram CNN and Bi-LSTM

Authors:

Jinyan Li

Volume 103, Issue C

https://doi.org/10.1016/j.is.2021.101865

Published: 01 January 2022

Abstract

Spam reviews mislead consumers' decision making and may seriously undermine fair trading in online markets. Existing methods for detecting spam reviews mainly focus on designing features from linguistic and psychological clues, but they hardly reveal the underlying semantics. Recent work applies deep learning to capture semantic features, but these models neither extract multi-granularity information from the text structure nor consider the mutual influence among sentences. We propose a hierarchical attention network in which distinct attention mechanisms are deliberately used at the two layers to capture important, comprehensive, and multi-granularity semantic information. At the first layer, we use an N-gram CNN to extract the multi-granularity semantics of the sentences. At the second layer, we use a combination of a convolution structure and Bi-LSTM to extract important and comprehensive semantics in a document. Extensive experiments on public datasets demonstrate that our model has superior detection performance over the state-of-the-art baselines, improving the F1 score in the mixed-domain setting to 89.3% (a 4.8-point absolute improvement), in the Doctor domain to 92.8% (9.9 points), in the Hotel domain to 86.1% (2.4 points), and in the cross-domain setting to 84.7% (10.4 points).
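The reported gains can be turned back into the baseline scores they imply. The short sketch below assumes that "absolute improvement" means a percentage-point difference against the best competing baseline in each setting; the abstract does not list the baseline values themselves, so the results are derived figures, not numbers quoted from the paper.

```python
# (F1 reached in %, absolute gain in points), as reported in the abstract.
reported = {
    "mixed-domain": (89.3, 4.8),
    "Doctor": (92.8, 9.9),
    "Hotel": (86.1, 2.4),
    "cross-domain": (84.7, 10.4),
}

# Assumption: baseline F1 = reported F1 minus the absolute gain.
implied_baseline = {
    setting: round(f1 - gain, 1) for setting, (f1, gain) in reported.items()
}

for setting, baseline in implied_baseline.items():
    f1, gain = reported[setting]
    print(f"{setting}: baseline ~{baseline}% -> {f1}% (+{gain} points)")
```

Under that assumption the largest gaps are in the Doctor and cross-domain settings, which is where the abstract reports gains of roughly ten points.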

Highlights

• We proposed a novel hierarchical attention architecture for spam review detection.
• The Word2Sent-level captures multi-granularity and informative information.
• The Sent2Doc-level extracts comprehensive and important information.
• Extensive experiments demonstrate that our model has superior detection performance.
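To make the two levels concrete, here is a minimal pure-Python sketch of n-gram convolution followed by softmax attention pooling, applied first over words and then over sentences. It is an illustration under stated assumptions, not the authors' model: the weights are random, the dimensions `d` and `k` and all function names are hypothetical, and the convolution plus Bi-LSTM that the paper describes at the Sent2Doc level is left out, with attention pooling alone standing in for it.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ngram_features(word_vecs, n, filt):
    """Slide width-n filters over the word vectors (a 1-D convolution).

    word_vecs: list of d-dimensional vectors, one per word.
    filt: list of k filters, each a flat list of length n * d.
    Returns one k-dimensional ReLU feature vector per n-gram position.
    """
    feats = []
    for i in range(len(word_vecs) - n + 1):
        window = [x for vec in word_vecs[i:i + n] for x in vec]  # concat n words
        feats.append([max(0.0, dot(f, window)) for f in filt])
    return feats

def attention_pool(vectors, context):
    """Weighted sum of vectors, with weights softmax(vector . context)."""
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]
    return pooled, weights

def encode_sentence(word_vecs, filters_by_n, context):
    """Word2Sent-style step: pool each n-gram granularity, then concatenate."""
    sent = []
    for n, filt in sorted(filters_by_n.items()):
        pooled, _ = attention_pool(ngram_features(word_vecs, n, filt), context)
        sent.extend(pooled)
    return sent

rng = random.Random(0)
d, k = 4, 3  # hypothetical embedding size and number of filters per width

def rand_vec(size):
    return [rng.uniform(-1, 1) for _ in range(size)]

# Random stand-ins for learned parameters (unigram, bigram, trigram filters).
filters_by_n = {n: [rand_vec(n * d) for _ in range(k)] for n in (1, 2, 3)}
word_context = rand_vec(k)
sent_context = rand_vec(3 * k)

# A toy "document": two sentences of 5 and 6 random word vectors.
document = [[rand_vec(d) for _ in range(5)], [rand_vec(d) for _ in range(6)]]
sentence_vecs = [encode_sentence(s, filters_by_n, word_context) for s in document]

# Sent2Doc-style step, simplified to attention pooling over sentence vectors.
doc_vec, sent_weights = attention_pool(sentence_vecs, sent_context)
```

Each sentence needs at least as many words as the widest filter (three here). In a trained model `doc_vec` would feed a classifier, and `sent_weights` would indicate how much each sentence contributes to the decision.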

Index Terms

Detection of spam reviews through a hierarchical attention architecture with N-gram CNN and Bi-LSTM
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Information extraction

Index terms have been assigned to the content through auto-classification.

Published In

Information Systems Volume 103, Issue C

Jan 2022

247 pages

ISSN:0306-4379

Elsevier Ltd.

Publisher

Elsevier Science Ltd.

United Kingdom
