Unifying Parsing and Tree-Structured Models for Generating Sentence Semantic Representations

Antoine Simoulin; Benoît Crabbé

doi:10.18653/v1/2022.naacl-srw.33

Conference paper. Year: 2022

Antoine Simoulin

Role: Author
PersonId : 1102658
IdHAL : antoine-simoulin

Quantmetry

Laboratoire de Linguistique Formelle

Benoît Crabbé

Role: Author
PersonId : 6726
IdHAL : benoit-crabbe
ORCID : 0000-0002-0821-0913
IdRef : 168451107

Laboratoire de Linguistique Formelle

Abstract

We introduce a novel tree-based model that learns its composition function together with its structure. The architecture produces sentence embeddings by composing words according to an induced syntactic tree. The parsing and composition functions are explicitly connected and are therefore learned jointly. As a result, the sentence embedding is computed according to an interpretable linguistic pattern and may be used on any downstream task. We evaluate our encoder on downstream tasks and observe that it outperforms tree-based models relying on external parsers. In some configurations, it is even competitive with the BERT base model. Our model supports multiple parser architectures. We exploit this property to conduct an ablation study comparing different parser initializations. We explore to what extent the trees produced by our model compare with linguistic structures and how the initialization affects downstream performance. We empirically observe that downstream supervision hinders producing stable parses and preserving linguistically relevant structures.
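To illustrate what "composing words according to a syntactic tree" means, here is a minimal sketch in Python. It is not the authors' architecture: the tree is given as fixed head indices rather than induced by a jointly trained parser, and the composition function (sum of children followed by `tanh`) is a toy stand-in chosen for this example. The names `compose`, `heads`, and `vectors` are hypothetical.

```python
import math


def compose(heads, vectors):
    """Compose word vectors bottom-up along a dependency tree.

    heads[i] is the index of token i's head, or -1 for the root.
    vectors[i] is the (toy) embedding of token i.
    """
    children = {i: [] for i in range(len(heads))}
    root = None
    for i, h in enumerate(heads):
        if h == -1:
            root = i
        else:
            children[h].append(i)

    def encode(node):
        # Toy composition (an assumption, not the paper's function):
        # add the children's encodings to the word vector, then squash.
        total = list(vectors[node])
        for child in children[node]:
            for k, value in enumerate(encode(child)):
                total[k] += value
        return [math.tanh(x) for x in total]

    # The encoding of the root serves as the sentence embedding.
    return encode(root)


# "the cat sleeps": "the" attaches to "cat", "cat" attaches to "sleeps" (root).
heads = [1, 2, -1]
vectors = [[0.1, 0.0], [0.0, 0.2], [0.3, 0.1]]
sentence_embedding = compose(heads, vectors)
```

Because the embedding depends on the tree, a different set of head indices for the same words generally yields a different vector, which is why the choice of parse matters for the downstream representation.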

Domains

Computation and Language [cs.CL]

Main file

2022.naacl-srw.33.pdf (626.52 KB)

Origin: Publisher files allowed on an open archive

Contributor: Alexandre Roulois

https://cnrs.hal.science/hal-03992330

Submitted on: Friday, February 17, 2023, 16:54:04

Last modified on: Tuesday, February 20, 2024, 09:00:04

Long-term archiving on: Thursday, May 18, 2023, 19:36:49

Dates and versions

hal-03992330 , version 1 (17-02-2023)

Identifiers

HAL Id : hal-03992330 , version 1
DOI : 10.18653/v1/2022.naacl-srw.33

Cite

Antoine Simoulin, Benoît Crabbé. Unifying Parsing and Tree-Structured Models for Generating Sentence Semantic Representations. 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics "Human Language Technologies", Jul 2022, Seattle, WA, United States. pp.267-276, ⟨10.18653/v1/2022.naacl-srw.33⟩. ⟨hal-03992330⟩

Collections

CNRS LLF CAMPUS-AAR AAI UP-SOCIETES-HUMANITES

56 views

47 downloads
