Revisiting Random Weight Perturbation for Efficiently Improving Generalization

Li, Tao; Tao, Qinghua; Yan, Weihao; Lei, Zehao; Wu, Yingwen; Fang, Kun; He, Mingzhen; Huang, Xiaolin

Computer Science > Machine Learning

arXiv:2404.00357 (cs)

[Submitted on 30 Mar 2024]

Abstract: Improving the generalization ability of modern deep neural networks (DNNs) is a fundamental challenge in machine learning. Two branches of methods have been proposed to seek flat minima and improve generalization: one led by sharpness-aware minimization (SAM) minimizes the worst-case neighborhood loss through adversarial weight perturbation (AWP), and the other minimizes the expected Bayes objective with random weight perturbation (RWP). While RWP offers advantages in computation and is closely linked to AWP on a mathematical basis, its empirical performance has consistently lagged behind that of AWP. In this paper, we revisit the use of RWP for improving generalization and propose improvements from two perspectives: i) the trade-off between generalization and convergence and ii) the random perturbation generation. Through extensive experimental evaluations, we demonstrate that our enhanced RWP methods achieve greater efficiency in enhancing generalization, particularly in large-scale problems, while also offering comparable or even superior performance to SAM. The code is released at this https URL.
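As a rough illustration of the two families contrasted in the abstract, the sketch below compares one SAM-style adversarial update with one plain random-perturbation update on a toy quadratic loss. It is not the authors' enhanced RWP method and is not taken from their released code: the toy loss, the names `awp_step` and `rwp_step`, and the values of `lr`, `rho`, and `sigma` are all assumptions chosen for illustration. The gradient counter is there only to make the computational difference visible, since SAM is usually described as needing two gradient evaluations per update while a random perturbation needs one.

```python
import math
import random

# Toy quadratic loss L(w) = 0.5 * sum(c_i * w_i^2); an assumed stand-in for a network loss.
CURVATURE = [1.0, 10.0]
grad_calls = {"n": 0}  # counts gradient evaluations to compare cost


def loss(w):
    return 0.5 * sum(c * x * x for c, x in zip(CURVATURE, w))


def grad(w):
    grad_calls["n"] += 1
    return [c * x for c, x in zip(CURVATURE, w)]


def awp_step(w, lr=0.05, rho=0.1):
    """SAM-style step: perturb along the normalized gradient, then descend."""
    g = grad(w)  # first gradient: finds the ascent direction
    norm = math.sqrt(sum(x * x for x in g)) or 1.0  # avoid division by zero
    w_adv = [x + rho * gi / norm for x, gi in zip(w, g)]
    g_adv = grad(w_adv)  # second gradient: taken at the perturbed weights
    return [x - lr * gi for x, gi in zip(w, g_adv)]


def rwp_step(w, rng, lr=0.05, sigma=0.01):
    """Plain RWP step: perturb with Gaussian noise, then descend."""
    w_rand = [x + rng.gauss(0.0, sigma) for x in w]
    g_rand = grad(w_rand)  # the only gradient evaluation in this step
    return [x - lr * gi for x, gi in zip(w, g_rand)]


def train(step, w0, n_steps=200):
    w = list(w0)
    for _ in range(n_steps):
        w = step(w)
    return w


rng = random.Random(0)

grad_calls["n"] = 0
w_awp = train(awp_step, [1.0, 1.0])
awp_calls = grad_calls["n"]

grad_calls["n"] = 0
w_rwp = train(lambda w: rwp_step(w, rng), [1.0, 1.0])
rwp_calls = grad_calls["n"]

print("AWP gradient evaluations:", awp_calls)
print("RWP gradient evaluations:", rwp_calls)
```

On a convex toy problem like this both updates simply drive the loss toward zero, so the sketch says nothing about generalization; it only shows the shape of each update and the per-step gradient cost.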

Comments:	Accepted to TMLR 2024
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2404.00357 [cs.LG]
	(or arXiv:2404.00357v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2404.00357

Submission history

From: Tao Li
[v1] Sat, 30 Mar 2024 13:18:27 UTC (791 KB)
