Data Augmentation Using BERT-Based Models for Aspect-Based Sentiment Analysis

Hollander, Bron; Frasincar, Flavius; van der Knaap, Finn

doi:10.1007/978-3-031-62362-2_8

Bron Hollander, Flavius Frasincar (ORCID: 0000-0002-8031-758X) & Finn van der Knaap

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14629)

Included in the following conference series:

International Conference on Web Engineering

Abstract

In today’s digital world, there is an overwhelming amount of opinionated data on the Web. However, effectively analyzing all available data proves to be a resource-intensive endeavor, requiring substantial time and financial investments to curate high-quality training datasets. To mitigate such problems, this paper compares data augmentation models for aspect-based sentiment analysis. Specifically, we analyze the effect of several BERT-based data augmentation methods on the performance of the state-of-the-art HAABSA++ model. We consider the following data augmentation models: EDA-adjusted (baseline), BERT, Conditional-BERT, BERT$_{\textrm{prepend}}$, and BERT$_{\textrm{expand}}$. Our findings show that incorporating data augmentation techniques can significantly improve the out-of-sample accuracy of the HAABSA++ model. Specifically, our results highlight the effectiveness of BERT$_{\textrm{prepend}}$ and BERT$_{\textrm{expand}}$, increasing the test accuracy from 78.56% to 79.23% and from 82.62% to 84.47% for the SemEval 2015 and SemEval 2016 datasets, respectively.
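To make the general idea concrete, the following is a minimal sketch of label-conditioned masked-word substitution, the family of techniques that the BERT-based augmenters named in the abstract belong to. It is not the authors' implementation. The function names (`augment`, `toy_fill_mask`), the `mask_prob` parameter, the rule of leaving aspect tokens untouched, and the way the sentiment label is prepended to the input are all assumptions made for illustration. The masked language model is replaced by a toy stand-in so that the block runs without any external library; in practice a real masked language model would take its place.

```python
import random
from typing import Callable, List, Sequence, Set

MASK = "[MASK]"

# Hypothetical interface for a masked language model:
# given the full token sequence and the index of the [MASK] token,
# return one replacement token.
FillMask = Callable[[Sequence[str], int], str]


def toy_fill_mask(tokens: Sequence[str], index: int) -> str:
    """Toy stand-in, NOT a real BERT: picks a word from the prepended label."""
    label = tokens[0]  # assumption: the sentiment label is the first token
    return {"positive": "great", "negative": "awful"}.get(label, "okay")


def augment(
    tokens: Sequence[str],
    aspect_idx: Set[int],
    label: str,
    fill_mask: FillMask,
    rng: random.Random,
    mask_prob: float = 0.15,
) -> List[str]:
    """Return a copy of `tokens` with some non-aspect words substituted.

    Each non-aspect token is masked with probability `mask_prob`; the label
    is prepended to the masked sentence so the predictor can condition on it.
    Aspect tokens are never replaced (an illustrative assumption).
    """
    out = list(tokens)
    for i in range(len(tokens)):
        if i in aspect_idx:
            continue
        if rng.random() < mask_prob:
            masked = [label] + out[:i] + [MASK] + out[i + 1:]
            # The mask sits at i + 1 because of the prepended label.
            out[i] = fill_mask(masked, i + 1)
    return out


if __name__ == "__main__":
    sentence = ["the", "pizza", "was", "tasty"]
    new_sentence = augment(
        sentence, {1}, "positive", toy_fill_mask, random.Random(0), mask_prob=1.0
    )
    print(new_sentence)  # ['great', 'pizza', 'great', 'great']
```

With `mask_prob=1.0` every non-aspect token is replaced, which makes the behaviour easy to inspect; a small value keeps most of the original sentence intact so that the augmented example stays close to the original.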

Author information

Authors and Affiliations

Erasmus University Rotterdam, Burgemeester Oudlaan 50, 3062 PA, Rotterdam, The Netherlands
Bron Hollander, Flavius Frasincar & Finn van der Knaap

Corresponding author

Correspondence to Flavius Frasincar.

Editor information

Editors and Affiliations

Tampere University, Tampere, Finland
Kostas Stefanidis
Tampere University, Tampere, Finland
Kari Systä
Politecnico di Milano, Milano, Italy
Maristella Matera
Chemnitz University of Technology, Chemnitz, Germany
Sebastian Heil
University of Crete, Heraklion, Greece
Haridimos Kondylakis
University of Verona, Verona, Italy
Elisa Quintarelli

About this paper

Cite this paper

Hollander, B., Frasincar, F., van der Knaap, F. (2024). Data Augmentation Using BERT-Based Models for Aspect-Based Sentiment Analysis. In: Stefanidis, K., Systä, K., Matera, M., Heil, S., Kondylakis, H., Quintarelli, E. (eds) Web Engineering. ICWE 2024. Lecture Notes in Computer Science, vol 14629. Springer, Cham. https://doi.org/10.1007/978-3-031-62362-2_8

DOI: https://doi.org/10.1007/978-3-031-62362-2_8
Published: 16 June 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62361-5
Online ISBN: 978-3-031-62362-2
eBook Packages: Computer Science, Computer Science (R0)
