Emergent Properties of Finetuned Language Representation Models

Matton, Alexandre; de Oliveira, Luke

arXiv:1910.10832 (cs)

[Submitted on 23 Oct 2019]

Abstract: Large, self-supervised, transformer-based language representation models have recently received significant attention and have produced state-of-the-art results across a variety of tasks simply by scaling up pre-training on larger and larger corpora. Such models usually produce high-dimensional vectors, on top of which additional task-specific layers and architectural modifications are added to adapt them to specific downstream tasks. Though there is ample evidence that such models work well, we aim to understand what happens when they work well. We analyze the redundancy and location of information contained in the output vectors of one such language representation model, BERT. We show empirical evidence that the [CLS] embedding in BERT contains highly redundant information and can be compressed with minimal loss of accuracy, especially for finetuned models, dovetailing into open threads in the field about the role of over-parameterization in learning. We also shed light on the existence of specific output dimensions that, taken alone, give very competitive results compared to using all dimensions of the output vectors.
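
The last claim in the abstract, that a single output dimension can be competitive on its own, can be illustrated with a small per-dimension probe. The sketch below is not the authors' procedure and uses no BERT outputs: it builds synthetic vectors as a stand-in for [CLS] embeddings, plants one informative dimension by construction, and fits a one-threshold classifier on each dimension in turn. All names (`make_synthetic_embeddings`, `fit_threshold`, `PLANTED_DIM`) and all sizes are assumptions made for illustration.

```python
import random


def make_synthetic_embeddings(n_samples, n_dims, informative_dim, seed=0):
    """Return (vectors, labels); a synthetic stand-in for [CLS] embeddings."""
    rng = random.Random(seed)
    vectors, labels = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        vec = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]  # pure noise
        # Assumption: one planted dimension carries the label almost exactly.
        vec[informative_dim] = (1.0 if label else -1.0) + rng.gauss(0.0, 0.1)
        vectors.append(vec)
        labels.append(label)
    return vectors, labels


def fit_threshold(values, labels):
    """Fit a one-dimensional classifier: midpoint between the class means."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2.0
    polarity = 1 if mean_pos >= mean_neg else -1  # which side is class 1
    return threshold, polarity


def accuracy_of_dimension(train_x, train_y, test_x, test_y, dim):
    """Held-out accuracy of a classifier that sees only one dimension."""
    threshold, polarity = fit_threshold([x[dim] for x in train_x], train_y)
    correct = 0
    for x, y in zip(test_x, test_y):
        pred = 1 if polarity * (x[dim] - threshold) > 0 else 0
        correct += int(pred == y)
    return correct / len(test_y)


N_DIMS = 16       # hypothetical size, far smaller than a real model's output
PLANTED_DIM = 5   # hypothetical index of the informative dimension

vectors, labels = make_synthetic_embeddings(400, N_DIMS, PLANTED_DIM)
train_x, test_x = vectors[:200], vectors[200:]
train_y, test_y = labels[:200], labels[200:]

accuracies = [
    accuracy_of_dimension(train_x, train_y, test_x, test_y, d)
    for d in range(N_DIMS)
]
best_dim = max(range(N_DIMS), key=lambda d: accuracies[d])
print("best single dimension:", best_dim, "accuracy:", accuracies[best_dim])
```

On real embeddings, the same loop would be run over vectors extracted from a model, and the per-dimension accuracies compared against a classifier trained on the full vector; here the planted dimension wins only because the data was constructed that way.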

Comments:	7 pages
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:1910.10832 [cs.CL]
	(or arXiv:1910.10832v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1910.10832

Submission history

From: Luke de Oliveira
[v1] Wed, 23 Oct 2019 23:01:10 UTC (421 KB)
