How can we train with few data?
Dive in with examples, no math required!
davinnovation@gmail.com
For Research...
http://www.image-net.org/ : >10M images
http://cocodataset.org/#home : >330K images
https://research.google.com/youtube8m/ : >8M videos
In reality...
Where we are
Before doing something fancy on deep learning
1. Traditional Model
http://blog.kaggle.com/2017/01/05/your-year-on-kaggle-most-memorable-community-stats-from-2016/
Before doing something fancy on deep learning
1. Traditional Model
"Compared with MDA, Logit, and CBR, SVM showed superior predictive power .... it not only reaches a level of accuracy similar to artificial neural networks, but also mitigates the limitations attributed to neural networks, such as overfitting and local optima." (한인구, 2003)
http://www.aistudy.co.kr/pattern/support_vector_machine.htm
https://www.researchgate.net/post/Which_classifier_is_the_better_in_case_of_small_data_samples
Just try running another model (SVM) first...
Before doing something fancy on deep learning
2. Data Augmentation
For images: https://github.com/aleju/imgaug
For audio: https://github.com/bmcfee/muda
For everything else: spend some money
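A minimal sketch of what an imgaug pipeline looks like (the specific augmenters chosen here are illustrative, not from the slides):

import numpy as np
import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Fliplr(0.5),                   # horizontally flip 50% of the images
    iaa.Affine(rotate=(-15, 15)),      # random rotation between -15 and 15 degrees
    iaa.GaussianBlur(sigma=(0, 1.0)),  # random light blur
])

images = np.zeros((8, 64, 64, 3), dtype=np.uint8)  # stand-in image batch
augmented = seq(images=images)                     # augmented copies, same shape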
Wait... Why Deep Learning?
https://www.quora.com/Why-is-xgboost-given-so-much-less-attention-than-deep-learning-despite-its-ubiquity-in-winning-Kaggle-solutions
When you do have "enough" training data, and when somehow you manage to
find the matching magical deep architecture, deep learning blows away any other
method by a large margin.
Will you still do it?
How can we train with few data?
Dive in with examples, no math required!
ETRI 두스 2018 _ 2nd study session
davinnovation@gmail.com
From a deep learning perspective
Assumptions
1. The dataset is small
2. SVM (+etc.) did not work or did not satisfy, or you want something cool
There is no guarantee that the methods introduced below will beat the results described above!!!
They are just other tools, not a magic wand.
Approaches

if (data size is small) and not (satisfied SVM):
    if (sufficient label):
        Fine Tuning
    elif (few labels):
        N-shot Learning
    elif (no labels):
        Zero-shot Learning / Domain Adaptation
    elif (skewed labels):
        Anomaly Detection if special else Training Tricks!
    else:
        Hire part-timers (alba) for labeling!
else:
    if not (sufficient label):
        semi-supervised learning, unsupervised learning
    else:
        JUST DO DEEP LEARNING!
Approaches (annotated)
The same decision tree as above, with the branches tagged by method family: Transfer Learning, Uncertainty, and Learning Method.
Transfer Learning
== Knowledge transfer
Pan, Sinno Jialin, and Qiang Yang. "A Survey on Transfer Learning." IEEE Transactions on Knowledge and Data Engineering 22.10 (2010): 1345-1359.
Before we dive into models...
Transfer Learning
== Knowledge transfer
Start from a model already trained on [ImageNet, ...], then TRAIN it on YOUR DATA.
Transfer Learning Applications
Learning from simulation (not directly related to the topic, but...)
Transfer Learning Applications
Curriculum Learning / learning a different domain (not directly related to the topic, but...)
Transfer Learning == Knowledge transfer

Transfer Types:
- Instance-transfer: re-weight a (source-data trained) model with target data
  == min(model(source)) -> min(model(target))
  (learn source -> learn target)
- Feature-representation transfer: find a feature representation that is good for both source & target
  == min( model_feature(source).variation - model_feature(target).variation )
  (learn source & target at the same time)
- Parameter-transfer: discover parameters shared between source & target
  == 1/2 ( model_feature(source).weight + model_feature(target).weight )
  (a model-weight perspective...)
- Relational-knowledge transfer: build a mapping of relational knowledge between source & target
  (learn some relational info)
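As a toy illustration of the "==" formulas above, here is a hypothetical numpy sketch; every array is a stand-in, and real methods are far more careful:

import numpy as np

# Parameter-transfer, caricatured as averaging source and target weights.
w_source = np.random.randn(128)   # weights learned on source data
w_target = np.random.randn(128)   # weights learned on target data
w_shared = 0.5 * (w_source + w_target)

# Instance-transfer: re-weight source examples by how target-like they are,
# so minimizing the source loss starts to approximate the target loss.
source_loss = np.random.rand(1000)     # per-example loss on source data
target_weight = np.random.rand(1000)   # hypothetical target-similarity scores
reweighted_loss = np.mean(target_weight * source_loss)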
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
(base model pretrained on a dataset of >10M images)
Two strategies: Fixed Feature Extractor, or Fine-tuning
http://cs231n.github.io/transfer-learning/
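A minimal Keras sketch of the two cs231n strategies, assuming a VGG16 base and an arbitrary two-class head (the number of unfrozen layers and all hyperparameters are illustrative):

import tensorflow as tf

# Strategy 1: fixed feature extractor. Freeze the pretrained base and
# train only a small classifier head on your data.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   pooling="avg", input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # your dataset's classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(x_train, y_train, ...)

# Strategy 2: fine-tuning. Unfreeze the top of the base and retrain
# everything with a much lower learning rate.
for layer in base.layers[-4:]:
    layer.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy")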
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
SC : Sparse Coding ( Scratch )
TF : Transfer Learning
CTL : Complete TL
PTL : Partial TL
MTL : Multi-task TL
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7480825
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7064414
16,220 images
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
Why fine-tuning works in deep learning: each layer extracts (progressively higher-level) features.
https://arxiv.org/pdf/1311.2901v3.pdf
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
How much should we fine-tune?
https://arxiv.org/pdf/1411.1792.pdf
Multi-task learning
if (sufficient label)
if (data size is small) and not (satisfied SVM):
https://openreview.net/pdf?id=S1PWi_lC-
[figure: three tasks with 70,000 training examples each]
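A hard-parameter-sharing sketch of multi-task learning: one shared trunk, one head per task. The input size, layer widths, and the two task names are hypothetical:

import tensorflow as tf

inputs = tf.keras.Input(shape=(784,))
# Shared trunk: these weights receive gradient signal from every task.
h = tf.keras.layers.Dense(256, activation="relu")(inputs)
h = tf.keras.layers.Dense(128, activation="relu")(h)
# Task-specific heads.
digit = tf.keras.layers.Dense(10, activation="softmax", name="digit")(h)
parity = tf.keras.layers.Dense(1, activation="sigmoid", name="parity")(h)

model = tf.keras.Model(inputs, [digit, parity])
model.compile(optimizer="adam",
              loss={"digit": "sparse_categorical_crossentropy",
                    "parity": "binary_crossentropy"})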
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot ("one-shot" is the more familiar term)
[figure: a human vs. a deep learning model]
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot ("one-shot" is the more familiar term)
A human can learn from a single picture book; MNIST training takes 60,000 examples.
(Actually it is NOT that simple, but just for fun...)
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot via Meta-Learning
Meta-Learning: learning to learn
http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot via a memory model (Neural Turing Machine): an RNN paired with external memory
https://www.slideshare.net/ssuserafc864/one-shot-learning-deep-learning-meta-learn
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
One-shot learning with meta-learning
https://arxiv.org/pdf/1605.06065.pdf
Omniglot dataset: >1,600 classes
-> 1,200 classes for training, 423 for testing (downscaled to 20x20), plus rotation augmentation
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot from a metric learning perspective
Metric Learning: learn a feature extractor plus a feature manifold
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
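One standard way to train such an embedding is a contrastive loss over pairs; this formulation is a common textbook one, not taken from the slides:

import tensorflow as tf

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    # same_class is 1.0 for pairs of the same class, 0.0 otherwise.
    # Same-class pairs are pulled together; different-class pairs are
    # pushed at least `margin` apart in the embedding space.
    d = tf.norm(emb_a - emb_b, axis=1)
    return tf.reduce_mean(
        same_class * tf.square(d)
        + (1.0 - same_class) * tf.square(tf.maximum(margin - d, 0.0))
    )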
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
One-shot learning with a metric (Matching Networks)
https://arxiv.org/pdf/1606.04080.pdf
60,000 color images of size 84 × 84 with 100 classes
It does NOT learn the classes; it LEARNS the metric.
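At test time a brand-new class only needs its few labeled support examples; classification is attention over support embeddings. A simplified sketch in the spirit of the paper (embed stands for any trained embedding network):

import tensorflow as tf

def one_shot_predict(embed, support_x, support_y, query_x):
    # support_y: one-hot labels, shape [N, C]. The classes were never
    # seen in training; only the metric (the embedding) was learned.
    s = tf.math.l2_normalize(embed(support_x), axis=1)  # [N, D]
    q = tf.math.l2_normalize(embed(query_x), axis=1)    # [M, D]
    sims = tf.matmul(q, s, transpose_b=True)            # cosine similarities [M, N]
    attention = tf.nn.softmax(sims, axis=1)
    return tf.matmul(attention, support_y)              # class probabilities [M, C]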
Zero-Shot
elif (no labels)
if (data size is small) and not (satisfied SVM):
Zero-shot from a metric learning perspective
Zero-shot learning with a metric
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
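With no labeled target examples at all, the same metric idea still works if every class has a semantic description (attributes or word vectors): embed the image and pick the nearest class embedding. All names below are hypothetical stand-ins:

import numpy as np

def zero_shot_predict(image_embedding, class_embeddings):
    # class_embeddings: {class_name: semantic vector}, e.g. word vectors;
    # none of these classes appeared during training.
    names = list(class_embeddings)
    mat = np.stack([class_embeddings[n] for n in names])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    v = image_embedding / np.linalg.norm(image_embedding)
    return names[int(np.argmax(mat @ v))]  # nearest class by cosine similarity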
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
[figure: feature distributions without vs. with domain adaptation]
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
[figure: the same class (Backpack) across four domains: Amazon, DSLR, Webcam, Caltech; one domain serves as training data]
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain-Adversarial Neural Network (DANN)
Keep the label classifier accurate (classifier) while also considering the source/target feature distributions: train the features so that it becomes impossible to tell whether they came from the source or the target (GAN-style adversarial objective).
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain-Adversarial Neural Network
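The trick that makes DANN trainable end-to-end is the gradient reversal layer: identity on the forward pass, negated gradient on the backward pass. The layer is from the DANN paper; this particular TensorFlow rendering is a sketch:

import tensorflow as tf

class GradientReversal(tf.keras.layers.Layer):
    # Forward: identity. Backward: gradients multiplied by -lamb, so the
    # feature extractor learns to FOOL the domain classifier while the
    # domain classifier learns to separate source from target.
    def __init__(self, lamb=1.0, **kwargs):
        super().__init__(**kwargs)
        self.lamb = lamb

    def call(self, x):
        @tf.custom_gradient
        def _reverse(x):
            return tf.identity(x), lambda dy: -self.lamb * dy
        return _reverse(x)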
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain Separation Networks (DSN)
Loss terms (from the figure): image reconstruction; difference from the shared encoder; classification loss; and a similarity term that pulls the two shared (source and target) representations together.
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain Separation Networks: results
Source-only: trained with only source data. Target-only: trained with only target data. Both are tested on target data.
Benchmarks: MNIST, SVHN, GTSRB
Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
https://www.youtube.com/watch?v=hHHmWmJG9Rw
It looks like zero-shot learning.
Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
https://www.datascience.com/blog/python-anomaly-detection
Fit a Gaussian distribution -> check uncertainty!!!
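The density-based recipe the slide points at, as a minimal sketch with stand-in data: fit a Gaussian to normal-only samples and flag low-probability points as anomalies:

import numpy as np
from scipy.stats import multivariate_normal

X_normal = np.random.randn(500, 2)        # stand-in for "normal" training data
mu = X_normal.mean(axis=0)
cov = np.cov(X_normal, rowvar=False)
density = multivariate_normal(mean=mu, cov=cov)

def is_anomaly(x, eps=1e-4):
    # eps is a threshold you would tune on a validation set that
    # contains a few known anomalies.
    return density.pdf(x) < eps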
Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
Example: safe visual navigation via deep learning
https://www.slideshare.net/samchoi7/modeling-uncertainty-in-deep-learning
Training Skills
elif (skewed labels)
if (data size is small) and not (satisfied SVM):
Update model weights with class balance in mind
https://arxiv.org/pdf/1710.05381.pdf : the effect of class imbalance
Stratified sampling / bootstrapping... / k-fold...
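Besides resampling, most frameworks can re-weight the loss per class; a sketch using sklearn's helper (the numbers are made up):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_train = np.array([0] * 950 + [1] * 50)   # 95:5 skewed labels
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y_train), y=y_train)
class_weight = dict(enumerate(weights))    # roughly {0: 0.53, 1: 10.0}
# In Keras: model.fit(x, y, class_weight=class_weight)
# -> mistakes on the rare class now cost about 19x more.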
Training Skills
elif (skewed labels)
if (data size is small) and not (satisfied SVM):
Don't judge by accuracy; with skewed labels it is misleading.
Or...
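Why accuracy misleads, as a runnable sketch: always predicting the majority class scores 95% accuracy yet catches zero positives; per-class precision/recall and AUC expose this immediately:

from sklearn.metrics import classification_report, roc_auc_score

y_true = [0] * 95 + [1] * 5   # skewed labels: 95 negatives, 5 positives
y_pred = [0] * 100            # "always predict majority" -> 95% accuracy
print(classification_report(y_true, y_pred, zero_division=0))
print("AUC:", roc_auc_score(y_true, [0.0] * 100))  # 0.5: no better than chance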
References
• https://medium.com/nanonets/nanonets-how-to-use-deep-learning-when-you-have-limited-data-f68c0b512cab
• https://medium.com/@ShaliniAnanda1/an-open-letter-to-yann-lecun-22b244fc0a5a
• http://ruder.io/transfer-learning/
• ...
• (hard to cite everything)
Above Things...
• Transfer Learning
• http://ruder.io/transfer-learning/
• One-Shot/Zero-Shot Learning
• http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/
• Uncertainty in Deep Learning
• https://www.slideshare.net/samchoi7/modeling-uncertainty-in-deep-learning
• Why does Transfer Learning reduce the data required?
• https://medium.com/nanonets/nanonets-how-to-use-deep-learning-when-you-have-limited-data-f68c0b512cab