Rethinking Model Re-Basin and Linear Mode Connectivity

Qu, Xingyu; Horvath, Samuel

arXiv:2402.05966 (cs)

[Submitted on 5 Feb 2024 (v1), last revised 9 Jul 2024 (this version, v2)]

Abstract: Recent studies suggest that, with sufficiently wide models, most SGD solutions can converge into the same basin up to permutation. This phenomenon, known as the model re-basin regime, has significant implications for model averaging because it ensures linear mode connectivity. However, current re-basin strategies are ineffective in many scenarios because the underlying mechanisms are not well understood. To address this gap, this paper provides novel insights into understanding and improving the standard practice. First, we decompose re-normalization into rescaling and reshift, showing that rescaling plays a crucial role in re-normalization while re-basin performance is sensitive to shifts in model activation. This finding calls for more nuanced handling of the activation shift. Second, we identify that the merged model suffers from activation collapse and magnitude collapse. Varying the learning rate, weight decay, and initialization method can mitigate these issues and improve model performance. Lastly, we propose a new perspective that unifies re-basin and pruning, from which we derive a lightweight yet effective post-pruning technique that can significantly improve model performance after pruning. Our implementation is available at this https URL.
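As an illustration of the "up to permutation" idea in the abstract, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a toy two-layer ReLU MLP; the helper names `mlp`, `permute_hidden`, and `interpolate` are hypothetical. It shows that reordering hidden units leaves the network function unchanged, and how two parameter sets are linearly interpolated, which is the operation whose loss behavior linear mode connectivity describes.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """Toy two-layer ReLU MLP: x -> W2 @ relu(W1 @ x + b1) + b2."""
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h + b2

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1 and b1, and matching columns of W2."""
    return W1[perm], b1[perm], W2[:, perm]

def interpolate(params_a, params_b, alpha):
    """Parameter-wise linear interpolation (1 - alpha) * A + alpha * B."""
    return tuple((1.0 - alpha) * a + alpha * b
                 for a, b in zip(params_a, params_b))

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 8, 3  # arbitrary sizes chosen for the sketch
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_out, d_hidden))
b2 = rng.normal(size=d_out)
x = rng.normal(size=d_in)

# A permuted copy of the same network computes the same function.
perm = rng.permutation(d_hidden)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)
same_function = np.allclose(mlp(x, W1, b1, W2, b2),
                            mlp(x, W1p, b1p, W2p, b2))

# Midpoint of the two parameter sets in weight space.
midpoint = interpolate((W1, b1, W2, b2), (W1p, b1p, W2p, b2), 0.5)
```

In this sketch the two endpoints are permuted copies of one network, so they are functionally identical even though their weights differ. A re-basin method searches for such a permutation between two independently trained networks before interpolating; that search, and the re-normalization step discussed in the abstract, are not shown here.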

Comments:	39 pages
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.05966 [cs.LG]
	(or arXiv:2402.05966v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.05966

Submission history

From: Xingyu Qu
[v1] Mon, 5 Feb 2024 17:06:26 UTC (1,451 KB)
[v2] Tue, 9 Jul 2024 09:23:25 UTC (2,406 KB)