Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation

Avram, Andrei-Marius; Mititelu, Verginica Barbu; Păiş, Vasile; Cercel, Dumitru-Clementin; Trăuşan-Matu, Ştefan

arXiv:2306.10419 (cs)

[Submitted on 17 Jun 2023]

Abstract: Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems since their misidentification can result in ambiguity and misunderstanding of the underlying text. In this work, we evaluate the performance of the mBERT model for MWE identification in a multilingual context by training it on all 14 languages available in version 1.2 of the PARSEME corpus. We also incorporate lateral inhibition and language adversarial training into our methodology to create language-independent embeddings and improve its capabilities in identifying multiword expressions. The evaluation of our models shows that the approach employed in this work achieves better results compared to the best system of the PARSEME 1.2 competition, MTLB-STRUCT, on 11 out of 14 languages for global MWE identification and on 12 out of 14 languages for unseen MWE identification. Additionally, averaged across all languages, our best approach outperforms the MTLB-STRUCT system by 1.23% on global MWE identification and by 4.73% on unseen global MWE identification.

Comments:	Accepted at Mathematics 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2306.10419 [cs.CL]
	(or arXiv:2306.10419v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.10419

Submission history

From: Andrei-Marius Avram
[v1] Sat, 17 Jun 2023 20:28:32 UTC (2,425 KB)
