Contrastive Learning-based Sentence Encoders Implicitly Weight Informative Words

Kurita, Hiroto; Kobayashi, Goro; Yokoi, Sho; Inui, Kentaro

arXiv:2310.15921 (cs)

[Submitted on 24 Oct 2023]

Abstract: The performance of sentence encoders can be significantly improved through the simple practice of fine-tuning with a contrastive loss. A natural question arises: what characteristics do models acquire during contrastive learning? This paper shows, theoretically and experimentally, that contrastive learning-based sentence encoders implicitly weight words according to information-theoretic quantities; that is, more informative words receive greater weight, while others receive less. The theory states that, in the lower bound of the optimal value of the contrastive learning objective, the norm of a word embedding reflects the information gain associated with the distribution of surrounding words. We also conduct comprehensive experiments using various models, multiple datasets, two methods for measuring the implicit weighting of models (Integrated Gradients and SHAP), and two information-theoretic quantities (information gain and self-information). The results provide empirical evidence that contrastive fine-tuning emphasizes informative words.
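The two information-theoretic quantities named in the abstract can be illustrated with a small sketch. It assumes self-information is -log2 P(w) under a unigram estimate, and that information gain is the KL divergence between the distribution of words surrounding w and the unigram distribution. The function names, the window size, and the toy corpus are hypothetical and chosen only for illustration; the paper's exact estimators may differ.

```python
import math
from collections import Counter


def self_information(corpus):
    """Return -log2 P(w) per word, with P(w) estimated from unigram counts."""
    counts = Counter(w for sent in corpus for w in sent)
    total = sum(counts.values())
    return {w: -math.log2(c / total) for w, c in counts.items()}


def information_gain(corpus, window=2):
    """Return KL(P(c | w) || P(c)) in bits per word w.

    Assumption: P(c | w) is the empirical distribution of words within
    `window` positions of w, and P(c) is the unigram distribution.
    """
    unigram = Counter(w for sent in corpus for w in sent)
    total = sum(unigram.values())

    # Collect co-occurrence counts of context words for every target word.
    context = {}
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            ctx = context.setdefault(w, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[sent[j]] += 1

    gains = {}
    for w, ctx in context.items():
        n = sum(ctx.values())
        if n == 0:
            continue  # word only ever appeared alone; no context to compare
        gains[w] = sum(
            (k / n) * math.log2((k / n) / (unigram[c] / total))
            for c, k in ctx.items()
        )
    return gains


# Toy corpus (hypothetical): rarer words get higher self-information.
toy_corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]
si = self_information(toy_corpus)
ig = information_gain(toy_corpus)
```

In this sketch, a frequent word such as "the" gets lower self-information than a rare word such as "dog". Either quantity could then be compared against per-word attribution scores from a sentence encoder, which is the kind of comparison the abstract describes.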

Comments:	16 pages, 6 figures, accepted to EMNLP 2023 Findings (short paper)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2310.15921 [cs.CL]
	(or arXiv:2310.15921v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.15921

Submission history

From: Hiroto Kurita
[v1] Tue, 24 Oct 2023 15:22:04 UTC (2,143 KB)
