Stacking of BERT and CNN Models for Arabic Word Sense Disambiguation

Published: 18 November 2023

Abstract

We propose a new approach to Arabic Word Sense Disambiguation (AWSD) that combines a single-layer Convolutional Neural Network (CNN) with contextual representations from BERT. WSD is the task of automatically identifying the correct meaning of a word used in a given context. It can be framed as a classification task in which the context is generally a short sentence. Kim [26] showed that a simple CNN on top of pre-trained word vectors gives good results for sentence classification. Here, we use a concatenation of BERT models as a word embedding to obtain the target and context representations simultaneously. Our approach improves WSD performance for Arabic. The experimental results show that our model outperforms state-of-the-art approaches, reaching an accuracy of 96.42% on the Arabic WordNet dataset.
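The pipeline described in the abstract (concatenated embeddings fed to a single convolutional layer, followed by a classifier over senses) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, filter widths, or training details, so every dimension below is an assumption, the two embedding sequences are random stand-ins for the outputs of two BERT models, and the weights are untrained. Max-over-time pooling and the softmax output layer follow the usual single-layer CNN text classifier design and are likewise assumed here.

```python
import math
import random


def concat_embeddings(emb_a, emb_b):
    """Concatenate two per-token embedding sequences along the feature axis."""
    return [a + b for a, b in zip(emb_a, emb_b)]


def conv1d_relu(tokens, filt, bias):
    """Slide one filter (width x dim) over the token sequence, then apply ReLU."""
    width = len(filt)
    out = []
    for start in range(len(tokens) - width + 1):
        total = bias
        for k in range(width):
            total += sum(w * x for w, x in zip(filt[k], tokens[start + k]))
        out.append(max(0.0, total))
    return out


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict_sense(tokens, filters, biases, out_weights, out_bias):
    """Single conv layer -> max-over-time pooling -> linear layer -> softmax."""
    pooled = [max(conv1d_relu(tokens, f, b)) for f, b in zip(filters, biases)]
    scores = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(out_weights, out_bias)]
    return softmax(scores)


rng = random.Random(0)


def rand_matrix(rows, cols):
    """Random matrix used as a placeholder for embeddings and untrained weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes chosen only to keep the example small.
SEQ_LEN, DIM_A, DIM_B = 6, 4, 4
N_FILTERS, WIDTH, N_SENSES = 3, 2, 3

# Random stand-ins for the token embeddings of two BERT models (assumption).
emb_a = rand_matrix(SEQ_LEN, DIM_A)
emb_b = rand_matrix(SEQ_LEN, DIM_B)
tokens = concat_embeddings(emb_a, emb_b)

filters = [rand_matrix(WIDTH, DIM_A + DIM_B) for _ in range(N_FILTERS)]
biases = [0.0] * N_FILTERS
out_weights = rand_matrix(N_SENSES, N_FILTERS)
out_bias = [0.0] * N_SENSES

probs = predict_sense(tokens, filters, biases, out_weights, out_bias)
predicted = max(range(N_SENSES), key=probs.__getitem__)
print("sense probabilities:", probs)
print("predicted sense index:", predicted)
```

In a real system the random matrices would be replaced by embeddings from pre-trained Arabic BERT models and the filter and output weights would be learned from sense-annotated examples.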

References

[1]
M. Alaeddine Abderrahim and M. El Amine Abderrahim. 2018. Arabic word sense disambiguation with conceptual density for information retrieval. Models and Optimisation and Mathematical Analysis Journal 6, 1 (2018), 5–9.
[2]
Mohammed Alaeddine Abderrahim and Mohammed El-Amine Abderrahim. 2022. Arabic word sense disambiguation for information retrieval. Transactions on Asian and Low-Resource Language Information Processing 21, 4 (2022), 1–19.
[3]
Muhammad Abdul-Mageed, AbdelRahim Elmadany, and El Moatez Billah Nagoudi. 2020. ARBERT & MARBERT: Deep bidirectional transformers for Arabic. arXiv preprint arXiv:2101.01785 (2020).
[4]
Rehab Hasan Abood and Sabrina Tiun. 2017. A comparative study of open-domain and specific-domain word sense disambiguation based on Quranic information retrieval. In MATEC Web of Conferences, Vol. 135. EDP Sciences, 00071.
[5]
Moustafa Al-Hajj and Mustafa Jarrar. 2021. ArabGlossBERT: Fine-tuning BERT on context-gloss pairs for WSD. (2021).
[6]
Mohammad Khaled A. Al-Maghasbeh and M. P. Bin Hamzah. 2015. Extract the semantic meaning of prepositions at Arabic texts: An exploratory study. Int. J. Comput. Trends Technol. 30, 3 (2015), 116–120.
[7]
Israa Alghanmi, Luis Espinosa-Anke, and Steven Schockaert. 2020. Combining BERT with static word embeddings for categorizing social media. (2020).
[8]
Marwah Alian, Arafat Awajan, and Akram Al-Kouz. 2016. Word sense disambiguation for Arabic text using Wikipedia and vector space model. International Journal of Speech Technology 19, 4 (2016), 857–867.
[9]
Ali Alkhatlan, Jugal Kalita, and Ahmed Alhaddad. 2018. Word sense disambiguation for Arabic exploiting Arabic WordNet and word embedding. Procedia Computer Science 142 (2018), 50–60.
[10]
Wissam Antoun, Fady Baly, and Hazem Hajj. 2020. AraBERT: Transformer-based model for Arabic language understanding. arXiv preprint arXiv:2003.00104 (2020).
[11]
Abdelaali Bakhouche, Tlili Yamina, Didier Schwab, and Andon Tchechmedjiev. 2015. Ant colony algorithm for Arabic word sense disambiguation through English lexical information. International Journal of Metadata, Semantics and Ontologies 10, 3 (2015), 202–211.
[12]
Nadia Bouhriz, Faouzia Benabbou, and E. H. Ben Lahmar. 2016. Word sense disambiguation approach for Arabic text. International Journal of Advanced Computer Science and Applications 7, 4 (2016), 381–385.
[13]
Ibrahim Bounhas, Bilel Elayeb, Fabrice Evrard, and Yahya Slimani. 2011. Organizing contextual knowledge for Arabic text disambiguation and terminology extraction. KO Knowledge Organization 38, 6 (2011), 473–490.
[14]
Yahui Chen. 2015. Convolutional Neural Network for Sentence Classification. Master’s thesis. University of Waterloo.
[15]
Hasna Chouikhi, Hamza Chniter, and Fethi Jarray. 2021. Arabic sentiment analysis using BERT model. In Advances in Computational Collective Intelligence: 13th International Conference, ICCCI 2021. Springer, 621–632.
[16]
Hasna Chouikhi, Hamza Chniter, and Fethi Jarray. 2021. Stacking BERT based models for Arabic sentiment analysis. (2021).
[17]
Fathi Debili and Hadhemi Achour. 1998. Voyellation automatique de l’arabe. In Computational Approaches to Semitic Languages.
[18]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
[19]
M. Soha Eid, Almoataz B. Al-Said, Nayer M. Wanas, Mohsen A. Rashwan, and Nadia H. Hegazy. 2010. Comparative study of Rocchio classifier applied to supervised WSD using Arabic lexical samples. In Proceedings of the Tenth Conference of Language Engineering (SEOLEC’2010), Cairo, Egypt.
[20]
Mohamed M. El-Gamml, M. Waleed Fakhr, Mohsen A. Rashwan, and Almoataz B. Al-Said. 2011. A comparative study for Arabic word sense disambiguation using document preprocessing and machine learning techniques. ALTIC, Alexandria, Egypt (2011).
[21]
Madeeh Nayer El-Gedawy. 2013. Using fuzzifiers to solve word sense ambiguation in Arabic language. International Journal of Computer Applications 79, 2 (2013).
[22]
Mohammed El-Razzaz, Mohamed Waleed Fakhr, and Fahima A. Maghraby. 2021. Arabic gloss WSD using BERT. Applied Sciences 11, 6 (2021), 2567.
[23]
Samir Elmougy, H. Taher, and H. Noaman. 2008. Naïve Bayes classifier for Arabic word sense disambiguation. In Proceeding of the 6th International Conference on Informatics and Systems. Citeseer, 16–21.
[24]
Meryeme Hadni, Saïd El Alaoui Ouatik, and Abdelmonaime Lachkar. 2016. Word sense disambiguation for Arabic text categorization. Int. Arab J. Inf. Technol. 13, 1A (2016), 215–222.
[25]
Rohit Kumar Kaliyar, Anurag Goswami, and Pratik Narang. 2021. FakeBERT: Fake news detection in social media with a BERT-based deep learning approach. Multimedia Tools and Applications 80, 8 (2021), 11765–11788.
[26]
Yoon Kim. 2014. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, 1746–1751.
[27]
Rim Laatar, Chafik Aloulou, and Lamia Hadrich Belguith. 2018. Word2vec for Arabic word sense disambiguation. In International Conference on Applications of Natural Language to Information Systems. Springer, 308–311.
[28]
Jindřich Libovický, Rudolf Rosa, and Alexander Fraser. 2019. How language-neutral is multilingual BERT? arXiv preprint arXiv:1911.03310 (2019).
[29]
Rui Mao, Chenghua Lin, and Frank Guerin. 2021. Combining pre-trained word embeddings and linguistic features for sequential metaphor identification. arXiv preprint arXiv:2104.03285 (2021).
[30]
Mohamed El Bachir Menai. 2014. Word sense disambiguation using evolutionary algorithms–application to Arabic language. Computers in Human Behavior 41 (2014), 92–103.
[31]
Mohamed El Bachir Menai and Wojdan Alsaeedan. 2012. Genetic algorithm for Arabic word sense disambiguation. In 2012 13th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. IEEE, 195–200.
[32]
Laroussi Merhbene, Anis Zouaghi, and Mounir Zrigui. 2013. An experimental study for some supervised lexical disambiguation methods of Arabic language. In Fourth International Conference on Information and Communication Technology and Accessibility (ICTA). IEEE, 1–6.
[33]
Laroussi Merhbene, Anis Zouaghi, and Mounir Zrigui. 2013. A semi-supervised method for Arabic word sense disambiguation using a weighted directed graph. In Proceedings of the Sixth International Joint Conference on Natural Language Processing. 1027–1031.
[34]
David Pinto, Paolo Rosso, Yassine Benajiba, Anas Ahachad, and Héctor Jiménez-Salazar. 2007. Word sense induction in the Arabic language: A self-term expansion based approach. In Proc. 7th Conference on Language Engineering of the Egyptian Society of Language Engineering-ESOLE. Citeseer, 235–245.
[35]
Ali Safaya, Moutasem Abdullatif, and Deniz Yuret. 2020. KUISAIL at SemEval-2020 task 12: BERT-CNN for offensive speech identification in social media. In Proceedings of the Fourteenth Workshop on Semantic Evaluation. 2054–2059.
[36]
Rakia Saidi and Fethi Jarray. 2022. Combining Bert representation and POS tagger for Arabic word sense disambiguation. In International Conference on Intelligent Systems Design and Applications. Springer, 676–685.
[37]
Nadia Soudani, Ibrahim Bounhas, Bilel ElAyeb, and Yahya Slimani. 2014. Toward an Arabic ontology for Arabic word sense disambiguation based on normalized dictionaries. In OTM Confederated International Conferences “On the Move to Meaningful Internet Systems”. Springer, 655–658.
[38]
Robyn Speer and Joanna Lowry-Duda. 2017. ConceptNet at SemEval-2017 task 2: Extending word embeddings with multilingual relational knowledge. arXiv preprint arXiv:1704.03560 (2017).
[39]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
[40]
A. Zouaghi, L. Merhbene, and M. Zrigui. 2011. Word sense disambiguation for Arabic language using the variants of the Lesk algorithm. WORLDCOMP 11 (2011), 561–567.
[41]
Anis Zouaghi, Laroussi Merhbene, and Mounir Zrigui. 2012. A hybrid approach for Arabic word sense disambiguation. International Journal of Computer Processing of Languages 24, 02 (2012), 133–151.

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 11
November 2023
255 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3633309

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 November 2023
Online AM: 07 September 2023
Accepted: 21 August 2023
Revised: 03 November 2022
Received: 29 May 2022
Published in TALLIP Volume 22, Issue 11

Author Tags

  1. Word sense disambiguation
  2. Arabic text
  3. supervised approach
  4. transformer
  5. BERT
  6. convolutional neural network

Qualifiers

  • Research-article
