Abstract
The Web contains an overwhelming amount of opinionated data. Analyzing all of it is resource-intensive, however, because curating high-quality training datasets requires substantial time and money. To mitigate this problem, this paper compares data augmentation models for aspect-based sentiment analysis. Specifically, we analyze the effect of several BERT-based data augmentation methods on the performance of the state-of-the-art HAABSA++ model. We consider the following data augmentation models: EDA-adjusted (baseline), BERT, Conditional-BERT, BERT\(_{\textrm{prepend}}\), and BERT\(_{\textrm{expand}}\). Our findings show that data augmentation can significantly improve the out-of-sample accuracy of the HAABSA++ model. In particular, BERT\(_{\textrm{prepend}}\) and BERT\(_{\textrm{expand}}\) prove effective, increasing the test accuracy from 78.56% to 79.23% on the SemEval 2015 dataset and from 82.62% to 84.47% on the SemEval 2016 dataset.
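To put the reported accuracies in perspective, the following minimal sketch computes the absolute gain (in percentage points) and the relative gain for each dataset. Only the four accuracy figures come from the abstract; the names `reported`, `accuracy_gain`, and `gains` are illustrative assumptions and are not part of the paper.

```python
# Test accuracies (%) as reported in the abstract: (before, after augmentation).
reported = {
    "SemEval 2015": (78.56, 79.23),
    "SemEval 2016": (82.62, 84.47),
}


def accuracy_gain(before, after):
    """Return (absolute gain in percentage points, relative gain in %)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return round(absolute, 2), round(relative, 2)


# Hypothetical helper structure: dataset name -> (absolute, relative) gain.
gains = {name: accuracy_gain(*pair) for name, pair in reported.items()}

for name, (abs_pp, rel_pct) in gains.items():
    print(f"{name}: +{abs_pp} pp ({rel_pct}% relative)")
```

Under these figures, the improvement is 0.67 percentage points on SemEval 2015 and 1.85 percentage points on SemEval 2016.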
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hollander, B., Frasincar, F., van der Knaap, F. (2024). Data Augmentation Using BERT-Based Models for Aspect-Based Sentiment Analysis. In: Stefanidis, K., Systä, K., Matera, M., Heil, S., Kondylakis, H., Quintarelli, E. (eds) Web Engineering. ICWE 2024. Lecture Notes in Computer Science, vol 14629. Springer, Cham. https://doi.org/10.1007/978-3-031-62362-2_8
DOI: https://doi.org/10.1007/978-3-031-62362-2_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62361-5
Online ISBN: 978-3-031-62362-2
eBook Packages: Computer Science, Computer Science (R0)