Fast-weights-RNN
Using Fast Weights to Attend to the Recent Past
NIPS 2016
The authors bring fast weights into an RNN and get better results with them. In essence, an extra inner loop is inserted between time step \(t\) and time step \(t+1\) of the RNN: at each inner step, relation weights between the earlier hidden states and the current hidden state are computed and accumulated, which ultimately gives quite good results.
First, a short introduction to fast weights.
Back in 1987, the paper "Using Fast Weights to Deblur Old Memories" made the following point:
Despite the emerging biological evidence that changes in synaptic efficacy at a single synapse occur at many different time-scales (Kupferman, 1979; Hartzell, 1981), there have been relatively few attempts to investigate the computational advantages of giving each connection several different weights that change at different speeds.
In other words, if a weight update is viewed as one act of neural activity, then weight updates should also happen on different time scales.
To mimic this process, besides the usual weights we can try adding weights that live on a different time scale, namely fast weights, which are used to model short-term memory (a small sketch contrasting the two follows the definitions below).
- Slow weight: The slow weights are like the weights normally used in connectionist networks-they change slowly and they hold all the long-term knowledge of the network.
- Fast weight: The fast weights change more rapidly and they continually regress towards zero so that their magnitude is determined solely by their recent past.
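To make the two time scales concrete, here is a minimal sketch (my own illustration, not from either paper; the decay rate, learning rates and the toy activity sequence are arbitrary):

```python
# Minimal illustration (not from either paper): a slow weight keeps small,
# persistent updates, while a fast weight keeps regressing toward zero, so
# its magnitude only reflects the recent past. The decay rate, learning
# rates and the toy "activity" sequence below are arbitrary assumptions.
decay, slow_lr, fast_lr = 0.5, 0.01, 1.0

slow_w, fast_w = 0.0, 0.0
for step, activity in enumerate([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]):
    slow_w += slow_lr * activity                  # slow time scale
    fast_w = decay * fast_w + fast_lr * activity  # fast time scale
    print(step, round(slow_w, 3), round(fast_w, 3))
# once the activity stops, fast_w quickly shrinks toward 0 while slow_w keeps its value
```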
Now let's look at how exactly the authors introduce fast weights into the RNN.
First, a fast weight matrix \(A\) is defined; it decays at rate \(\lambda\) and accumulates the outer product of the current hidden state with a fast learning rate \(\eta\):

\[
A(t) = \lambda A(t-1) + \eta\, h(t)\, h(t)^{T}
\]
Then, between step \(t\) and step \(t+1\) of the RNN, several new inner-loop steps are inserted: starting from the preliminary state \(h_0(t+1) = f\bigl(W h(t) + C x(t)\bigr)\), the inner loop runs for \(S\) steps and sets \(h(t+1) = h_S(t+1)\):

\[
h_{s+1}(t+1) = f\bigl([W h(t) + C x(t)] + A(t)\, h_s(t+1)\bigr)
\]
From this we can see that \(A\) is used to rapidly refine the hidden state within a single time step.
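Here is a rough numpy sketch of this \(t \to t+1\) transition (my own toy implementation, not the authors' code: tanh stands in for the nonlinearity \(f\), layer normalization is left out for now, and \(\lambda\), \(\eta\), \(S\) and the weight scales are placeholder values):

```python
import numpy as np

def fast_weights_step(x_t, h_t, A, W, C, lam=0.95, eta=0.5, S=3):
    """One t -> t+1 transition following the two equations above.
    Layer normalization is omitted here; tanh stands in for f, and
    lam, eta, S and the weight scales are placeholders."""
    f = np.tanh

    # fast weight update: A(t) = lam * A(t-1) + eta * h(t) h(t)^T
    A = lam * A + eta * np.outer(h_t, h_t)

    # the "slow" drive W h(t) + C x(t) stays fixed during the inner loop
    drive = W @ h_t + C @ x_t

    # inner loop: h_0 = f(drive), then h_{s+1} = f(drive + A h_s)
    h_s = f(drive)
    for _ in range(S):
        h_s = f(drive + A @ h_s)

    return h_s, A  # h(t+1) and the updated fast weight matrix

# toy usage on a short random sequence
rng = np.random.default_rng(0)
d_in, d_h = 4, 8
W = rng.normal(scale=0.1, size=(d_h, d_h))
C = rng.normal(scale=0.1, size=(d_h, d_in))
h, A = np.zeros(d_h), np.zeros((d_h, d_h))
for x in rng.normal(size=(5, d_in)):
    h, A = fast_weights_step(x, h, A, W, C)
print(h.shape, A.shape)   # (8,) (8, 8)
```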
In practice, the number of time steps in a sequence is smaller than the dimension of the hidden state vector, so the computed \(A\) is actually far from full rank. For computational convenience, the authors assume that \(A\) starts at 0 for every sequence, which gives:

\[
A(t)\, h_s(t+1) = \eta \sum_{\tau=1}^{t} \lambda^{\,t-\tau}\, h(\tau) \bigl[ h(\tau)^{T} h_s(t+1) \bigr]
\]
That is, the dot product between each earlier hidden state and the most recent hidden state is simply computed and used as a weight (this is effectively attention), and the earlier hidden states are then summed up under these weights; the computation is very fast.
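A small numerical check of this equivalence (the dimensions, \(\lambda\), \(\eta\) and random vectors here are my own toy choices): when \(A\) starts at zero, maintaining \(A\) explicitly and the attention-style sum over stored hidden states give the same \(A(t)\, h_s(t+1)\):

```python
import numpy as np

rng = np.random.default_rng(1)
d_h, lam, eta = 6, 0.9, 0.5
past_h = [rng.normal(size=d_h) for _ in range(4)]   # h(1), ..., h(t)
h_s = rng.normal(size=d_h)                          # current inner-loop state

# explicit fast weight matrix, starting from zero
A = np.zeros((d_h, d_h))
for h_tau in past_h:
    A = lam * A + eta * np.outer(h_tau, h_tau)
explicit = A @ h_s

# attention form: weight each stored hidden state by its decayed dot product
t = len(past_h)
attention = sum(
    eta * lam ** (t - tau) * (h_tau @ h_s) * h_tau
    for tau, h_tau in enumerate(past_h, start=1)
)
print(np.allclose(explicit, attention))   # True
```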
The authors also apply layer normalization to keep the dot products between the two vectors from numerically vanishing or exploding.
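In the sketch above this would mean normalizing each inner-loop pre-activation before the nonlinearity; something like the following (my reading of where the normalization sits; the learned gain and bias of layer normalization are omitted):

```python
import numpy as np

def layer_norm(z, eps=1e-5):
    # zero-mean, unit-variance normalization of a vector (no learned
    # gain/bias in this sketch)
    return (z - z.mean()) / np.sqrt(z.var() + eps)

def inner_loop_with_ln(drive, A, S=3, f=np.tanh):
    # same inner loop as before, but each pre-activation is normalized
    # before applying the nonlinearity
    h_s = f(layer_norm(drive))
    for _ in range(S):
        h_s = f(layer_norm(drive + A @ h_s))
    return h_s
```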