How can we train with few data?
Dive in with examples, no math required!
davinnovation@gmail.com
For Research...
http://www.image-net.org/ : >10M images
http://cocodataset.org/#home : >330K images
https://research.google.com/youtube8m/ : >8M videos
In reality...
Where we are
Before doing something fancy on deep learning
1. Traditional Model
http://blog.kaggle.com/2017/01/05/your-year-on-kaggle-most-memorable-community-stats-from-2016/
Before doing something fancy on deep learning
1. Traditional Model
"Compared with MDA, Logit, and CBR, SVM showed superior predictive power .... it not only reaches a level of accuracy similar to artificial neural networks, but also mitigates the limitations attributed to neural networks, such as overfitting and local optima." (한인구, 2003)
http://www.aistudy.co.kr/pattern/support_vector_machine.htm
https://www.researchgate.net/post/Which_classifier_is_the_better_in_case_of_small_data_samples
Just try running another model (SVM) first...
Before doing something fancy on deep learning
2. Data Augmentation
For images: https://github.com/aleju/imgaug
For audio: https://github.com/bmcfee/muda
For everything else: spend some money
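A minimal sketch of what an imgaug pipeline looks like (the specific augmenters chosen here are illustrative, not from the slides):

import numpy as np
import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Fliplr(0.5),                   # horizontally flip 50% of the images
    iaa.Affine(rotate=(-15, 15)),      # random rotation between -15 and 15 degrees
    iaa.GaussianBlur(sigma=(0, 1.0)),  # random light blur
])

images = np.zeros((8, 64, 64, 3), dtype=np.uint8)  # stand-in image batch
augmented = seq(images=images)                     # augmented copies, same shape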
Wait... Why Deep Learning?
https://www.quora.com/Why-is-xgboost-given-so-much-less-attention-than-deep-learning-despite-its-ubiquity-in-winning-Kaggle-solutions
When you do have "enough" training data, and when somehow you manage to
find the matching magical deep architecture, deep learning blows away any other
method by a large margin.
Will you still do it?
How can we train with few data?
Dive in with examples, no math required!
ETRI 두스 2018 _ 2nd study session
davinnovation@gmail.com
From a deep learning perspective
Assumptions
1. The dataset is small
2. SVM (+etc.) did not work or did not satisfy, or you want something cool
There is no guarantee that the methods introduced below will beat the results described above!!!
They are just other tools, not a magic wand.
Approaches

if (data size is small) and not (satisfied SVM):
    if (sufficient label):
        Fine Tuning
    elif (few labels):
        N-shot Learning
    elif (no labels):
        Zero-shot Learning / Domain Adaptation
    elif (skewed labels):
        Anomaly Detection if special else Training Tricks!
    else:
        Hire part-timers (alba) for labeling!
else:
    if not (sufficient label):
        semi-supervised learning, unsupervised learning
    else:
        JUST DO DEEP LEARNING!
Approaches (annotated)
The same decision tree as above, with the branches tagged by method family: Transfer Learning, Uncertainty, and Learning Method.
Transfer Learning
== Knowledge transfer
Pan, Sinno Jialin, and Qiang Yang. "A Survey on Transfer Learning." IEEE Transactions on Knowledge and Data Engineering 22.10 (2010): 1345-1359.
Before we dive into models...
Transfer Learning
== Knowledge transfer
Start from a model already trained on [ImageNet, ...], then TRAIN it on YOUR DATA.
Transfer Learning Applications
Learning from simulation (not directly related to the topic, but...)
Transfer Learning Applications
Curriculum Learning / learning a different domain (not directly related to the topic, but...)
Transfer Learning == Knowledge transfer

Transfer Types:
- Instance-transfer: re-weight a (source-data trained) model with target data
  == min(model(source)) -> min(model(target))
  (learn source -> learn target)
- Feature-representation transfer: find a feature representation that is good for both source & target
  == min( model_feature(source).variation - model_feature(target).variation )
  (learn source & target at the same time)
- Parameter-transfer: discover parameters shared between source & target
  == 1/2 ( model_feature(source).weight + model_feature(target).weight )
  (a model-weight perspective...)
- Relational-knowledge transfer: build a mapping of relational knowledge between source & target
  (learn some relational info)
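As a toy illustration of the "==" formulas above, here is a hypothetical numpy sketch; every array is a stand-in, and real methods are far more careful:

import numpy as np

# Parameter-transfer, caricatured as averaging source and target weights.
w_source = np.random.randn(128)   # weights learned on source data
w_target = np.random.randn(128)   # weights learned on target data
w_shared = 0.5 * (w_source + w_target)

# Instance-transfer: re-weight source examples by how target-like they are,
# so minimizing the source loss starts to approximate the target loss.
source_loss = np.random.rand(1000)     # per-example loss on source data
target_weight = np.random.rand(1000)   # hypothetical target-similarity scores
reweighted_loss = np.mean(target_weight * source_loss)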
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
(base model pretrained on a dataset of >10M images)
Two strategies: Fixed Feature Extractor, or Fine-tuning
http://cs231n.github.io/transfer-learning/
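A minimal Keras sketch of the two cs231n strategies, assuming a VGG16 base and an arbitrary two-class head (the number of unfrozen layers and all hyperparameters are illustrative):

import tensorflow as tf

# Strategy 1: fixed feature extractor. Freeze the pretrained base and
# train only a small classifier head on your data.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   pooling="avg", input_shape=(224, 224, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # your dataset's classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(x_train, y_train, ...)

# Strategy 2: fine-tuning. Unfreeze the top of the base and retrain
# everything with a much lower learning rate.
for layer in base.layers[-4:]:
    layer.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy")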
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
SC : Sparse Coding ( Scratch )
TF : Transfer Learning
CTL : Complete TL
PTL : Partial TL
MTL : Multi-task TL
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7480825
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7064414
16,220 images
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
Why fine-tuning works in deep learning: each layer extracts (progressively higher-level) features.
https://arxiv.org/pdf/1311.2901v3.pdf
FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
How much should we fine-tune?
https://arxiv.org/pdf/1411.1792.pdf
Multi-task learning
if (sufficient label)
if (data size is small) and not (satisfied SVM):
https://openreview.net/pdf?id=S1PWi_lC-
[figure: three tasks with 70,000 training examples each]
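A hard-parameter-sharing sketch of multi-task learning: one shared trunk, one head per task. The input size, layer widths, and the two task names are hypothetical:

import tensorflow as tf

inputs = tf.keras.Input(shape=(784,))
# Shared trunk: these weights receive gradient signal from every task.
h = tf.keras.layers.Dense(256, activation="relu")(inputs)
h = tf.keras.layers.Dense(128, activation="relu")(h)
# Task-specific heads.
digit = tf.keras.layers.Dense(10, activation="softmax", name="digit")(h)
parity = tf.keras.layers.Dense(1, activation="sigmoid", name="parity")(h)

model = tf.keras.Model(inputs, [digit, parity])
model.compile(optimizer="adam",
              loss={"digit": "sparse_categorical_crossentropy",
                    "parity": "binary_crossentropy"})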
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot ("one-shot" is the more familiar term)
[figure: a human vs. a deep learning model]
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot ("one-shot" is the more familiar term)
A human can learn from a single picture book; MNIST training takes 60,000 examples.
(Actually it is NOT that simple, but just for fun...)
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot via Meta-Learning
Meta-Learning: learning to learn
http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot via a memory model (Neural Turing Machine): an RNN paired with external memory
https://www.slideshare.net/ssuserafc864/one-shot-learning-deep-learning-meta-learn
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
One-shot learning with meta-learning
https://arxiv.org/pdf/1605.06065.pdf
Omniglot dataset: >1,600 classes
-> 1,200 classes for training, 423 for testing (downscaled to 20x20), plus rotation augmentation
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
N-Shot from a metric learning perspective
Metric Learning: learn a feature extractor plus a feature manifold
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
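One standard way to train such an embedding is a contrastive loss over pairs; this formulation is a common textbook one, not taken from the slides:

import tensorflow as tf

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    # same_class is 1.0 for pairs of the same class, 0.0 otherwise.
    # Same-class pairs are pulled together; different-class pairs are
    # pushed at least `margin` apart in the embedding space.
    d = tf.norm(emb_a - emb_b, axis=1)
    return tf.reduce_mean(
        same_class * tf.square(d)
        + (1.0 - same_class) * tf.square(tf.maximum(margin - d, 0.0))
    )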
N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
One-shot learning with a metric (Matching Networks)
https://arxiv.org/pdf/1606.04080.pdf
60,000 color images of size 84 × 84 with 100 classes
It does NOT learn the classes; it LEARNS the metric.
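At test time a brand-new class only needs its few labeled support examples; classification is attention over support embeddings. A simplified sketch in the spirit of the paper (embed stands for any trained embedding network):

import tensorflow as tf

def one_shot_predict(embed, support_x, support_y, query_x):
    # support_y: one-hot labels, shape [N, C]. The classes were never
    # seen in training; only the metric (the embedding) was learned.
    s = tf.math.l2_normalize(embed(support_x), axis=1)  # [N, D]
    q = tf.math.l2_normalize(embed(query_x), axis=1)    # [M, D]
    sims = tf.matmul(q, s, transpose_b=True)            # cosine similarities [M, N]
    attention = tf.nn.softmax(sims, axis=1)
    return tf.matmul(attention, support_y)              # class probabilities [M, C]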
Zero-Shot
elif (no labels)
if (data size is small) and not (satisfied SVM):
Zero-shot from a metric learning perspective
Zero-shot learning with a metric
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
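With no labeled target examples at all, the same metric idea still works if every class has a semantic description (attributes or word vectors): embed the image and pick the nearest class embedding. All names below are hypothetical stand-ins:

import numpy as np

def zero_shot_predict(image_embedding, class_embeddings):
    # class_embeddings: {class_name: semantic vector}, e.g. word vectors;
    # none of these classes appeared during training.
    names = list(class_embeddings)
    mat = np.stack([class_embeddings[n] for n in names])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    v = image_embedding / np.linalg.norm(image_embedding)
    return names[int(np.argmax(mat @ v))]  # nearest class by cosine similarity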
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
[figure: feature distributions without vs. with domain adaptation]
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
[figure: the same class (Backpack) across four domains: Amazon, DSLR, Webcam, Caltech; one domain serves as training data]
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain-Adversarial Neural Network (DANN)
Keep the label classifier accurate (classifier) while also considering the source/target feature distributions: train the features so that it becomes impossible to tell whether they came from the source or the target (GAN-style adversarial objective).
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain-Adversarial Neural Network
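The trick that makes DANN trainable end-to-end is the gradient reversal layer: identity on the forward pass, negated gradient on the backward pass. The layer is from the DANN paper; this particular TensorFlow rendering is a sketch:

import tensorflow as tf

class GradientReversal(tf.keras.layers.Layer):
    # Forward: identity. Backward: gradients multiplied by -lamb, so the
    # feature extractor learns to FOOL the domain classifier while the
    # domain classifier learns to separate source from target.
    def __init__(self, lamb=1.0, **kwargs):
        super().__init__(**kwargs)
        self.lamb = lamb

    def call(self, x):
        @tf.custom_gradient
        def _reverse(x):
            return tf.identity(x), lambda dy: -self.lamb * dy
        return _reverse(x)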
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain Separation Networks (DSN)
Loss terms (from the figure): image reconstruction; difference from the shared encoder; classification loss; and a similarity term that pulls the two shared (source and target) representations together.
Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
Domain Separation Networks: results
Source-only: trained with only source data. Target-only: trained with only target data. Both are tested on target data.
Benchmarks: MNIST, SVHN, GTSRB
Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
https://www.youtube.com/watch?v=hHHmWmJG9Rw
It looks like zero-shot learning.
Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
https://www.datascience.com/blog/python-anomaly-detection
Fit a Gaussian distribution -> check uncertainty!!!
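The density-based recipe the slide points at, as a minimal sketch with stand-in data: fit a Gaussian to normal-only samples and flag low-probability points as anomalies:

import numpy as np
from scipy.stats import multivariate_normal

X_normal = np.random.randn(500, 2)        # stand-in for "normal" training data
mu = X_normal.mean(axis=0)
cov = np.cov(X_normal, rowvar=False)
density = multivariate_normal(mean=mu, cov=cov)

def is_anomaly(x, eps=1e-4):
    # eps is a threshold you would tune on a validation set that
    # contains a few known anomalies.
    return density.pdf(x) < eps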
Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
Example: safe visual navigation via deep learning
https://www.slideshare.net/samchoi7/modeling-uncertainty-in-deep-learning
Training Skills
elif (skewed labels)
if (data size is small) and not (satisfied SVM):
Update model weights with class balance in mind
https://arxiv.org/pdf/1710.05381.pdf : the effect of class imbalance
Stratified sampling / bootstrapping... / k-fold...
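Besides resampling, most frameworks can re-weight the loss per class; a sketch using sklearn's helper (the numbers are made up):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_train = np.array([0] * 950 + [1] * 50)   # 95:5 skewed labels
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y_train), y=y_train)
class_weight = dict(enumerate(weights))    # roughly {0: 0.53, 1: 10.0}
# In Keras: model.fit(x, y, class_weight=class_weight)
# -> mistakes on the rare class now cost about 19x more.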
Training Skills
elif (skewed labels)
if (data size is small) and not (satisfied SVM):
Don't judge by accuracy; with skewed labels it is misleading.
Or...
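Why accuracy misleads, as a runnable sketch: always predicting the majority class scores 95% accuracy yet catches zero positives; per-class precision/recall and AUC expose this immediately:

from sklearn.metrics import classification_report, roc_auc_score

y_true = [0] * 95 + [1] * 5   # skewed labels: 95 negatives, 5 positives
y_pred = [0] * 100            # "always predict majority" -> 95% accuracy
print(classification_report(y_true, y_pred, zero_division=0))
print("AUC:", roc_auc_score(y_true, [0.0] * 100))  # 0.5: no better than chance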
References
• https://medium.com/nanonets/nanonets-how-to-use-deep-learning-when-you-have-limited-data-f68c0b512cab
• https://medium.com/@ShaliniAnanda1/an-open-letter-to-yann-lecun-22b244fc0a5a
• http://ruder.io/transfer-learning/
• ...
• (hard to cite everything)
Above Things...
• Transfer Learning
• http://ruder.io/transfer-learning/
• One-Shot/Zero-Shot Learning
• http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/
• Uncertainty in Deep Learning
• https://www.slideshare.net/samchoi7/modeling-uncertainty-in-deep-learning
• Why does Transfer Learning reduce the data required?
• https://medium.com/nanonets/nanonets-how-to-use-deep-learning-when-you-have-limited-data-f68c0b512cab