Deepening Hidden Representations from Pre-trained Language Models

Yang, Junjie; Zhao, Hai

arXiv:1911.01940 (cs)

[Submitted on 5 Nov 2019 (v1), last revised 29 Apr 2020 (this version, v2)]

Abstract: Transformer-based pre-trained language models have proven effective for learning contextualized language representations. However, current approaches use only the output of the encoder's final layer when fine-tuning on downstream tasks. We argue that taking only a single layer's output restricts the power of the pre-trained representation. We therefore deepen the representation learned by the model by fusing the hidden representations through an explicit HIdden Representation Extractor (HIRE), which automatically absorbs representations complementary to the output of the final layer. With RoBERTa as the backbone encoder, our proposed improvement over the pre-trained models is shown to be effective on multiple natural language understanding tasks and helps our model rival state-of-the-art models on the GLUE benchmark.
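
The abstract does not describe how HIRE is built, so the following is not the authors' method. It is a minimal sketch of the general idea of drawing on every encoder layer instead of only the last one, assuming a softmax-weighted mix of the per-layer vectors for a single token that is then concatenated with the final layer's vector. The names `fuse_layers` and `layer_scores` are hypothetical, and in a real model the scores would be learned parameters.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_outputs, layer_scores):
    """Combine per-layer vectors for one token (illustrative assumption only).

    layer_outputs: list of equal-length vectors, one per encoder layer,
                   ordered from the first layer to the final layer.
    layer_scores:  one score per layer; assumed learnable in practice.
    Returns the final layer's vector concatenated with the weighted mix.
    """
    weights = softmax(layer_scores)
    dim = len(layer_outputs[0])
    # Weighted sum across layers, computed one dimension at a time.
    mixed = [
        sum(w * layer[i] for w, layer in zip(weights, layer_outputs))
        for i in range(dim)
    ]
    final = layer_outputs[-1]
    return final + mixed  # list concatenation: [final ; mixed]
```

With equal scores the mix reduces to a plain average over layers, so the fused vector carries both the final layer's output and a summary of the intermediate layers.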

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1911.01940 [cs.CL]
	(or arXiv:1911.01940v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1911.01940

Submission history

From: Junjie Yang
[v1] Tue, 5 Nov 2019 16:59:50 UTC (510 KB)
[v2] Wed, 29 Apr 2020 13:04:59 UTC (2,063 KB)
