NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework

Yao, Xingcheng; Zheng, Yanan; Yang, Xiaocong; Yang, Zhilin

arXiv:2111.04130 (cs)

[Submitted on 7 Nov 2021 (v1), last revised 22 Jul 2022 (this version, v2)]

Abstract: Pretrained language models have become the standard approach for many NLP tasks due to strong performance, but they are very expensive to train. We propose a simple and efficient learning framework, TLM, that does not rely on large-scale pretraining. Given some labeled task data and a large general corpus, TLM uses task data as queries to retrieve a tiny subset of the general corpus and jointly optimizes the task objective and the language modeling objective from scratch. On eight classification datasets in four domains, TLM achieves results better than or similar to pretrained language models (e.g., RoBERTa-Large) while reducing the training FLOPs by two orders of magnitude. With high accuracy and efficiency, we hope TLM will contribute to democratizing NLP and expediting its development.
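The two steps named in the abstract, retrieval with task data as queries and joint optimization of two objectives, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which retriever is used or how the objectives are weighted, so the TF-IDF overlap scoring, the function names `retrieve_subset` and `joint_loss`, and the `lm_weight` parameter are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for a real tokenizer)."""
    return text.lower().split()


def retrieve_subset(task_texts, corpus, top_k=2):
    """Use each task example as a query and keep the union of its top_k matches.

    Scoring is a plain TF-IDF overlap, an assumption chosen only for illustration.
    Returns the sorted indices of the selected corpus documents.
    """
    doc_tokens = [tokenize(doc) for doc in corpus]
    n_docs = len(corpus)
    # Document frequency of each term across the general corpus.
    doc_freq = Counter(term for tokens in doc_tokens for term in set(tokens))
    idf = {term: math.log(1 + n_docs / df) for term, df in doc_freq.items()}

    selected = set()
    for query in task_texts:
        query_terms = set(tokenize(query))
        scores = []
        for idx, tokens in enumerate(doc_tokens):
            counts = Counter(tokens)
            score = sum(counts[t] * idf[t] for t in query_terms if t in counts)
            scores.append((score, idx))
        # Highest score first; ties are broken by the lower document index.
        scores.sort(key=lambda pair: (-pair[0], pair[1]))
        # Documents with no term overlap are never selected.
        selected.update(idx for score, idx in scores[:top_k] if score > 0)
    return sorted(selected)


def joint_loss(task_loss, lm_loss, lm_weight=1.0):
    """Weighted sum of the task loss and the language modeling loss.

    The weighting scheme (a single hypothetical `lm_weight`) is an assumption.
    """
    return task_loss + lm_weight * lm_loss
```

In a training loop under these assumptions, `retrieve_subset` would run once to pick the small training corpus, and `joint_loss` would be evaluated per batch with the language modeling loss computed on the retrieved text and the task loss computed on the labeled examples.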

Comments:	14 pages, 5 figures
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2111.04130 [cs.CL]
	(or arXiv:2111.04130v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2111.04130
Journal reference:	Proceedings of the 39th International Conference on Machine Learning, PMLR 162:25438-25451, 2022

Submission history

From: Xingcheng Yao
[v1] Sun, 7 Nov 2021 17:13:59 UTC (1,303 KB)
[v2] Fri, 22 Jul 2022 14:49:16 UTC (2,133 KB)
