Quantifying Adaptability in Pre-trained Language Models with 500 Tasks

Li, Belinda Z.; Yu, Jane; Khabsa, Madian; Zettlemoyer, Luke; Halevy, Alon; Andreas, Jacob

arXiv:2112.03204 (cs)

[Submitted on 6 Dec 2021 (v1), last revised 4 May 2022 (this version, v2)]

Abstract: When a neural language model (LM) is adapted to perform a new task, what aspects of the task predict the eventual performance of the model? In NLP, systematic features of LM generalization to individual examples are well characterized, but systematic aspects of LM adaptability to new tasks are not nearly as well understood. We present a large-scale empirical study of the features and limits of LM adaptability using a new benchmark, TaskBench500, built from 500 procedurally generated sequence modeling tasks. These tasks combine core aspects of language processing, including lexical semantics, sequence processing, memorization, logical reasoning, and world knowledge. Using TaskBench500, we evaluate three facets of adaptability, finding that: (1) adaptation procedures differ dramatically in their ability to memorize small datasets; (2) within a subset of task types, adaptation procedures exhibit compositional adaptability to complex tasks; and (3) failure to match training label distributions is explained by mismatches in the intrinsic difficulty of predicting individual labels. Our experiments show that adaptability to new tasks, like generalization to new examples, can be systematically described and understood, and we conclude with a discussion of additional aspects of adaptability that could be studied using the new benchmark.

Comments: NAACL 2022; 20 pages, 6 figures, 8 tables
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2112.03204 [cs.CL] (or arXiv:2112.03204v2 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2112.03204

Submission history

From: Belinda Z. Li
[v1] Mon, 6 Dec 2021 18:00:25 UTC (1,812 KB)
[v2] Wed, 4 May 2022 23:11:27 UTC (4,436 KB)
