mixup: Beyond Empirical Risk Minimization

Zhang, Hongyi; Cisse, Moustapha; Dauphin, Yann N.; Lopez-Paz, David

arXiv:1710.09412 (cs)

[Submitted on 25 Oct 2017 (v1), last revised 27 Apr 2018 (this version, v2)]

Abstract: Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
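
The phrase "convex combinations of pairs of examples and their labels" can be illustrated with a minimal NumPy sketch. The abstract does not specify how the mixing weight is chosen; drawing it from a Beta(alpha, alpha) distribution, the function name `mixup_batch`, and the default `alpha=0.2` are assumptions made here for illustration, not the authors' reference code.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Mix a batch with a shuffled copy of itself (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    # Mixing weight in [0, 1]; the Beta(alpha, alpha) choice is an assumption.
    lam = rng.beta(alpha, alpha)
    # Pair each example with a randomly chosen partner from the same batch.
    perm = rng.permutation(len(x))
    # Convex combination of inputs and of their one-hot labels.
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed, lam
```

Because the labels are mixed with the same weight as the inputs, each mixed label row still sums to one, so it can be used as a soft target with a standard cross-entropy loss.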

Comments:	ICLR camera-ready version. Changes vs. v1: fix repo URL; add ablation studies; add mixup + dropout, etc.
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1710.09412 [cs.LG]
	(or arXiv:1710.09412v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1710.09412

Submission history

From: Hongyi Zhang
[v1] Wed, 25 Oct 2017 18:30:49 UTC (566 KB)
[v2] Fri, 27 Apr 2018 21:39:25 UTC (574 KB)
