Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Peng, Bo; Goldstein, Daniel; Anthony, Quentin; Albalak, Alon; Alcaide, Eric; Biderman, Stella; Cheah, Eugene; Du, Xingjian; Ferdinan, Teddy; Hou, Haowen; Kazienko, Przemysław; GV, Kranthi Kiran; Kocoń, Jan; Koptyra, Bartłomiej; Krishna, Satyapriya; McClelland Jr., Ronald; Lin, Jiaju; Muennighoff, Niklas; Obeid, Fares; Saito, Atsushi; Song, Guangyu; Tu, Haoqin; Wirawan, Cahya; Woźniak, Stanisław; Zhang, Ruichong; Zhao, Bingchen; Zhao, Qihang; Zhou, Peng; Zhu, Jian; Zhu, Rui-Jie

arXiv:2404.05892 (cs)

[Submitted on 8 Apr 2024 (v1), last revised 26 Sep 2024 (this version, v4)]

Abstract: We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture. Our architectural design advancements include multi-headed matrix-valued states and a dynamic recurrence mechanism that improve expressivity while maintaining the inference efficiency characteristics of RNNs. We introduce a new multilingual corpus with 1.12 trillion tokens and a fast tokenizer based on greedy matching for enhanced multilinguality. We trained four Eagle models, ranging from 0.46 to 7.5 billion parameters, and two Finch models with 1.6 and 3.1 billion parameters and find that they achieve competitive performance across a wide variety of benchmarks. We release all our models on HuggingFace under the Apache 2.0 license.

Models at: this https URL
Training code at: this https URL
Inference code at: this https URL
Time-parallel training code at: this https URL
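
The abstract mentions a tokenizer "based on greedy matching" without giving details. As a rough illustration of what greedy longest-match tokenization generally looks like, the sketch below walks a trie and always emits the longest vocabulary entry that matches at the current position. This is not the authors' implementation: the names `build_trie` and `greedy_tokenize`, the toy vocabulary, and the choice to operate on Python strings rather than bytes are all assumptions made for the example.

```python
END = None  # sentinel key marking "a vocabulary token ends at this trie node"


def build_trie(vocab):
    """Build a nested-dict trie from a {token_string: token_id} mapping."""
    root = {}
    for token, token_id in vocab.items():
        node = root
        for ch in token:
            node = node.setdefault(ch, {})
        node[END] = token_id
    return root


def greedy_tokenize(text, trie):
    """Tokenize by repeatedly taking the longest vocabulary match."""
    ids = []
    i = 0
    while i < len(text):
        node = trie
        best_id, best_end = None, i
        j = i
        # Walk the trie as far as the text allows, remembering the
        # longest position at which a complete token ended.
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if END in node:
                best_id, best_end = node[END], j
        if best_id is None:
            raise ValueError(f"no vocabulary entry matches at position {i}")
        ids.append(best_id)
        i = best_end
    return ids


# Hypothetical toy vocabulary, for illustration only.
toy_vocab = {"a": 0, "b": 1, "ab": 2, "abc": 3, "c": 4}
toy_trie = build_trie(toy_vocab)
tokens = greedy_tokenize("abcab", toy_trie)  # "abc" then "ab"
```

Because each step takes the longest match and never backtracks, the work per emitted token is bounded by the longest vocabulary entry, which is one plausible reason a greedy scheme can be fast; whether this matches the paper's design in detail is not stated in the abstract.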

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.05892 [cs.CL]
	(or arXiv:2404.05892v4 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2404.05892

Submission history

From: Quentin Anthony [view email]
[v1] Mon, 8 Apr 2024 22:20:59 UTC (5,572 KB)
[v2] Wed, 10 Apr 2024 19:34:38 UTC (5,572 KB)
[v3] Wed, 25 Sep 2024 00:02:03 UTC (5,492 KB)
[v4] Thu, 26 Sep 2024 22:39:08 UTC (5,502 KB)
