Routing Networks with Co-training for Continual Learning

Collier, Mark; Kokiopoulou, Efi; Gesmundo, Andrea; Berent, Jesse

arXiv:2009.04381 (cs)

[Submitted on 9 Sep 2020]

Abstract: The core challenge in continual learning is catastrophic forgetting: when neural networks are trained on a sequence of tasks, they rapidly forget previously learned tasks. It has been observed that catastrophic forgetting is most severe when tasks are dissimilar to each other. We propose the use of sparse routing networks for continual learning. For each input, these network architectures activate a different path through a network of experts. Routing networks have been shown to learn to route similar tasks to overlapping sets of experts and dissimilar tasks to disjoint sets of experts. In the continual learning context this behaviour is desirable, as it minimizes interference between dissimilar tasks while allowing positive transfer between related tasks. In practice, we find it necessary to develop a new training method for routing networks, which we call co-training, that avoids poorly initialized experts when new tasks are presented. When combined with a small episodic memory replay buffer, sparse routing networks with co-training outperform densely connected networks on the MNIST-Permutations and MNIST-Rotations benchmarks.
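
The idea of activating a different path through a set of experts can be illustrated with a minimal sketch. The abstract does not specify the router, so this assumes a linear gate with top-k selection, a common choice in sparse mixture-of-experts layers; it does not implement co-training or the replay buffer. The names top_k_route, make_expert, routed_layer, and gate_weights are hypothetical and chosen only for illustration.

```python
import math

def top_k_route(gate_logits, k):
    """Return (expert indices, normalized weights) for the k largest logits."""
    # Indices of the k experts with the highest gate logits.
    chosen = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only; other experts get no weight.
    m = max(gate_logits[i] for i in chosen)
    exps = [math.exp(gate_logits[i] - m) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]

def make_expert(scale):
    # Hypothetical stand-in for an expert network: scales each feature.
    return lambda x: [scale * v for v in x]

def routed_layer(x, gate_weights, experts, k=2):
    # Assumed linear router: one logit per expert.
    logits = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    chosen, weights = top_k_route(logits, k)
    out = [0.0] * len(x)
    for idx, w in zip(chosen, weights):
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen

experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
# Hand-picked gate weights so that opposite inputs prefer different experts.
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

out_a, path_a = routed_layer([2.0, 1.0], gate_weights, experts, k=2)
out_b, path_b = routed_layer([-2.0, -1.0], gate_weights, experts, k=2)
print(path_a, path_b)
```

With these hand-picked gate weights the two dissimilar inputs are sent to disjoint pairs of experts, which is the property the abstract describes as limiting interference. In a trained routing network the gate would be learned rather than fixed.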

Comments:	Presented at ICML Workshop on Continual Learning 2020
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2009.04381 [cs.LG]
	(or arXiv:2009.04381v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2009.04381

Submission history

From: Mark Collier
[v1] Wed, 9 Sep 2020 15:58:51 UTC (229 KB)
