Unsupervised Attention Mechanism across Neural Network Layers

Lin, Baihan

arXiv:1902.10658v8 (cs)

[Submitted on 27 Feb 2019 (v1), revised 25 Nov 2019 (this version, v8), latest version 23 Dec 2021 (v13)]

Abstract: Inspired by the adaptation phenomenon of neuronal firing, we propose an unsupervised attention mechanism (UAM) which computes the statistical regularity in the implicit space of neural networks under the Minimum Description Length (MDL) principle. Treating the neural network optimization process as a partially observable model selection problem, UAM constrains the implicit space by a normalization factor, the universal code length. We compute this universal code incrementally across neural network layers and demonstrate the flexibility to include data priors such as top-down attention and other oracle information. Empirically, our approach outperforms existing normalization methods in tackling limited, imbalanced, and nonstationary input distributions in computer vision and reinforcement learning tasks. Lastly, UAM tracks dependency and critical learning stages across layers and recurrent time steps of deep networks.
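
For readers unfamiliar with the term "universal code length", the following is a minimal sketch of the standard normalized maximum likelihood (NML) code from the MDL literature, worked out for a plain Bernoulli model over binary sequences. It is not the paper's UAM implementation and says nothing about how UAM accumulates the code across layers; the function names (`bernoulli_max_likelihood`, `nml_normalizer`, `nml_code_length`) are hypothetical and chosen only for illustration.

```python
import math


def bernoulli_max_likelihood(k: int, n: int) -> float:
    """Likelihood of a binary sequence with k ones out of n, evaluated at its own MLE p = k/n."""
    if k == 0 or k == n:
        # The MLE is p = 0 or p = 1, which assigns the sequence probability 1.
        return 1.0
    p = k / n
    return p ** k * (1 - p) ** (n - k)


def nml_normalizer(n: int) -> float:
    """Sum of maximized likelihoods over all 2^n binary sequences of length n.

    Sequences with the same count k share the same maximized likelihood,
    so they are grouped with a binomial coefficient.
    """
    return sum(math.comb(n, k) * bernoulli_max_likelihood(k, n) for k in range(n + 1))


def nml_code_length(k: int, n: int) -> float:
    """NML (universal) code length, in bits, of one sequence with k ones out of n.

    Illustrative assumption: a Bernoulli model class, not the model class used in the paper.
    """
    return -math.log2(bernoulli_max_likelihood(k, n)) + math.log2(nml_normalizer(n))


if __name__ == "__main__":
    n = 10
    for k in (0, 5, 10):
        print(k, nml_code_length(k, n))
```

The logarithm of the normalizer is the model's complexity penalty: it is the same for every sequence of a given length, so the code length separates into a data-fit term and a model-cost term. Dividing by the normalizer is also what makes the maximized likelihoods sum to one over all sequences.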

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
Cite as:	arXiv:1902.10658 [cs.LG]
	(or arXiv:1902.10658v8 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1902.10658

Submission history

From: Baihan Lin
[v1] Wed, 27 Feb 2019 17:44:50 UTC (39 KB)
[v2] Thu, 28 Feb 2019 04:30:58 UTC (39 KB)
[v3] Mon, 11 Mar 2019 13:06:31 UTC (41 KB)
[v4] Sat, 30 Mar 2019 21:55:07 UTC (41 KB)
[v5] Wed, 1 May 2019 18:42:40 UTC (42 KB)
[v6] Thu, 25 Jul 2019 21:47:10 UTC (284 KB)
[v7] Wed, 31 Jul 2019 15:01:28 UTC (284 KB)
[v8] Mon, 25 Nov 2019 17:03:43 UTC (2,786 KB)
[v9] Fri, 21 Feb 2020 20:25:26 UTC (6,630 KB)
[v10] Tue, 26 May 2020 01:07:00 UTC (6,630 KB)
[v11] Fri, 5 Jun 2020 21:49:26 UTC (8,344 KB)
[v12] Thu, 10 Sep 2020 08:58:16 UTC (15,722 KB)
[v13] Thu, 23 Dec 2021 11:20:37 UTC (13,415 KB)
