StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
arXiv - CS - Computation and Language | Pub Date: 2023-11-24 | DOI: arxiv-2311.14495
Shida Wang, Qianxiao Li

In this paper, we investigate the long-term memory learning capabilities of state-space models (SSMs) from the perspective of parameterization. We prove that state-space models without any reparameterization exhibit a memory limitation similar to that of traditional RNNs: the target relationships that can be stably approximated by state-space models must have exponentially decaying memory. Our analysis identifies this "curse of memory" as a result of the recurrent weights converging to a stability boundary, suggesting that a reparameterization technique can be effective. To this end, we introduce a class of reparameterization techniques for SSMs that effectively lifts this memory limitation. Besides improving approximation capabilities, we further show that a principled choice of reparameterization scheme can also enhance optimization stability. We validate our findings using synthetic datasets and language models.
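The abstract does not spell out the reparameterization itself; the sketch below is a minimal, hedged illustration of the general idea. It runs a diagonal discrete-time linear SSM whose recurrent weights are produced from unconstrained parameters by a squashing map, so every weight stays strictly inside the stability region |a| < 1 and optimization cannot push the system onto the stability boundary. The function names and the specific sigmoid map are illustrative assumptions, not the scheme proposed in the paper.

```python
import numpy as np

def stable_recurrent_weights(w: np.ndarray) -> np.ndarray:
    """Map unconstrained parameters w to recurrent weights in (0, 1).

    Squashing keeps every weight strictly inside the stability region
    |a| < 1 for any finite w, so a gradient step cannot land the system
    on the stability boundary (the mechanism the abstract blames for
    the "curse of memory"). This particular map is an illustrative
    choice, not necessarily the paper's.
    """
    return 1.0 / (1.0 + np.exp(-w))

def diagonal_ssm(w: np.ndarray, b: np.ndarray, c: np.ndarray,
                 inputs: np.ndarray) -> np.ndarray:
    """Run h_t = a * h_{t-1} + b * u_t with scalar readout y_t = c . h_t."""
    a = stable_recurrent_weights(w)
    h = np.zeros_like(a)
    outputs = []
    for u_t in inputs:            # inputs: shape (T,)
        h = a * h + b * u_t       # elementwise (diagonal) recurrence
        outputs.append(c @ h)     # linear readout to a scalar
    return np.array(outputs)

# Toy usage: 4 hidden channels, random input sequence of length 10.
rng = np.random.default_rng(0)
w = rng.normal(size=4)            # unconstrained; stability holds for any value
b = rng.normal(size=4)
c = rng.normal(size=4)
y = diagonal_ssm(w, b, c, rng.normal(size=10))
print(y.shape)                    # (10,)
```

Note that any fixed weight a with |a| < 1 still yields an impulse response decaying like a^t; as we read the abstract, the paper's contribution is to choose the map from unconstrained parameters to recurrent weights so that training can approach the stability boundary stably, which a naive direct parameterization of a cannot do.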

Updated: 2023-11-27