This document discusses one-shot learning and memory-augmented neural networks. It begins by explaining that traditional deep learning models require large datasets for training, whereas one-shot learning aims to give a model the ability to make correct inferences after seeing only one or a few examples of a new class. Neural Turing Machines were an early approach in this direction, coupling a neural network controller with an external memory component. Memory-Augmented Neural Networks were later developed; they use only content-based addressing of the external memory, which enables rapid learning from small amounts of data.
One-shot learning in deep learning (meta-learning)
1. [N]-Shot Learning in Deep Learning Models
davinnovation@gmail.com
14. Neural Turing Machine
https://arxiv.org/abs/1410.5401
[Architecture diagram: an RNN controller connects external input and output to the memory matrix through read heads and write heads.]
Addressing Mechanism — how the weight vector W_t is calculated:
Content-based addressing:
- The weight for each memory slot is based on the similarity between that slot's contents and the key vector produced by the controller.
Location-based addressing:
- The weight is based on the memory address itself. For example, to compute f(x, y) = x × y, the variables x and y are stored at different addresses, retrieved by address, and then multiplied.
The NTM uses both mechanisms together.
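A minimal NumPy sketch of how the two mechanisms combine into W_t (function and argument names are mine; the steps follow the NTM paper's addressing pipeline: content weighting, interpolation, shift, sharpening):

```python
import numpy as np

def ntm_addressing(memory, key, beta, prev_w, g, shift, gamma):
    """Compute the NTM weight vector W_t over memory rows.

    memory : (N, M) matrix, N slots of width M
    key    : (M,) key vector emitted by the controller
    beta   : key strength (scalar > 0)
    prev_w : (N,) weight vector from the previous time step
    g      : interpolation gate in [0, 1]
    shift  : (N,) softmax-normalized shift distribution
    gamma  : sharpening exponent >= 1
    """
    # 1) Content-based addressing: cosine similarity between the key and each slot,
    #    scaled by beta and normalized with a softmax.
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    w_c = np.exp(beta * sim)
    w_c /= w_c.sum()

    # 2) Interpolation with the previous weights (start of location-based addressing).
    w_g = g * w_c + (1 - g) * prev_w

    # 3) Circular convolution with the shift distribution (moves attention by location).
    n = len(w_g)
    w_s = np.array([sum(w_g[(i - j) % n] * shift[j] for j in range(n)) for i in range(n)])

    # 4) Sharpening to counteract the blurring introduced by the shift.
    w = w_s ** gamma
    return w / w.sum()
```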
24. Memory-Augmented
Neural Networks https://arxiv.org/pdf/1605.06065.pdf
[Architecture diagram: the same controller / read-write heads / memory layout as the NTM.]
Addressing Mechanism — how the weight vector W_t is calculated:
=> Location-based addressing is not used
=> Only content-based addressing is used
Memory-Augmented Neural Networks (MANN)
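For contrast with the NTM sketch above, a read weight that uses only content-based addressing can be written as below (same assumed notation; the paper's write mechanism is not shown here):

```python
import numpy as np

def mann_read_weights(memory, key):
    """Content-based addressing only: cosine similarity between the controller's
    key and each memory row, normalized with a softmax. No interpolation with
    previous weights, no shifting, no sharpening."""
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    w = np.exp(sim)
    return w / w.sum()
```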
30. Experiment – Data
Omniglot dataset: more than 1,600 character classes
1,200 classes used for training, 423 classes for testing (images downscaled to 20×20)
Plus rotation augmentation
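One possible way to realize this split and augmentation (a sketch only; the directory layout, PNG file format, and the choice of 90-degree rotations are assumptions, not stated on the slide):

```python
import random
from pathlib import Path
from PIL import Image

def load_classes(class_dirs, size=(20, 20), rotations=(0, 90, 180, 270)):
    """Downscale every character image to 20x20 and add rotated copies,
    treating each rotation of a class as a new class (rotation augmentation)."""
    data = {}
    for class_dir in class_dirs:
        images = [Image.open(p).convert("L").resize(size) for p in sorted(class_dir.glob("*.png"))]
        for rot in rotations:
            data[(class_dir.name, rot)] = [img.rotate(rot) for img in images]
    return data

# Split the original character classes before augmentation:
# 1,200 classes for training, the remaining 423 for testing.
all_classes = sorted(Path("omniglot").glob("*/*"))  # hypothetical <alphabet>/<character> layout
random.shuffle(all_classes)
train_data = load_classes(all_classes[:1200])
test_data = load_classes(all_classes[1200:])
```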
31. Experiment – Classification Result
Trained with one-hot vector representations of the labels
Five labels are chosen at random per episode; training runs for 100,000 episodes, and each episode presents what are effectively 'new' classes
Instance: the number of times a given class has appeared so far within the episode (accuracy is reported as a function of this count)
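A sketch of how such an episode could be assembled (names are mine; `train_data` is assumed to map each class to its list of images as in the data sketch above), which also makes the meaning of "instance" concrete:

```python
import random
import numpy as np

def make_episode(data, n_labels=5, length=50):
    """Sample one training episode: pick five classes, assign them freshly
    shuffled one-hot labels, and present a random sequence of their images."""
    classes = random.sample(list(data.keys()), n_labels)
    # Labels are re-randomized every episode, so the network cannot store a fixed
    # class-to-label mapping in its weights; it has to bind them in external memory.
    label_of = {c: i for i, c in enumerate(classes)}

    episode, seen = [], {c: 0 for c in classes}
    for _ in range(length):
        c = random.choice(classes)
        seen[c] += 1  # the "instance" number: how many times this class has appeared so far
        one_hot = np.eye(n_labels)[label_of[c]]
        episode.append((random.choice(data[c]), one_hot, seen[c]))
    return episode
```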