This document provides an overview of computer vision techniques including:
1. Using pre-trained CNN models for tasks like classification and object detection. Popular models discussed include AlexNet, VGG, ResNet, YOLO, and DenseNet.
2. Basic CNN operations like convolution, pooling, dropout, and normalization, plus feature extraction using CNNs, transfer learning, and fine-tuning of pretrained models.
3. Additional computer vision tasks covered include object detection using Haar cascades, stereo vision, pattern detection, and reconstructing images from CNN features. Frameworks like PyTorch and libraries like TensorFlow are also mentioned.
Content
1. CNNs out of box
2. Basic operations
3. Feature extraction
4. Transfer learning
5. Fine Tuning
6. Benchmarking the CNNs
7. Object detection
8. Detection of patterns and anomalies
9. Stereo Vision
10. Reconstruct features back into images
11. Style Transfer
12. GAN
13. Labeling
14. Color detection
15. PyTorch
The popular networks
Classification
o LeNet Model
o AlexNet Model
o VGG Model
o ResNet Paper
o YOLO9000 Paper
o DenseNet Paper
Segmentation
o FCN8 Paper
o SegNet Paper
o U-Net Paper
o E-Net Paper
o ResNetFCN Paper
o PSPNet Paper
o Mask RCNN Paper
Detection
o Faster RCNN Paper
o SSD Paper
o YOLOv2 Paper
o R-FCN Paper
CNNs out of box
Some datasets available for research
o MNIST: 10 classes, ~7000 examples per class
o ImageNet: 1000 classes, ~1000 examples per class
o The Street View House Numbers (SVHN): 10 classes, ~2000 examples per class
o CIFAR-10: 10 classes, 6000 examples per class
o CIFAR-100: 100 classes, 600 examples per class
o Olivetti database: 40 classes
...and many more on Kaggle
Usage of the neural networks
o real-world image text extraction
o compressing and decompressing images
o automatic speech recognition
o semantic image segmentation
o image matching and retrieval
o image-to-text
o computer vision
o discovery of latent 3D keypoints
o unsupervised learning
o localizing and identifying multiple objects in a single image
o 3D object reconstruction
o image classification
o identify the name of a street (in France) from an image
o predicting future video frames
Examples:
https://github.com/keras-team/keras/tree/master/examples
https://github.com/tensorflow/models/tree/master/research
Detection of patterns and anomalies
Input data:
o Still image of multiple similar objects sampled across a grid-like structure.
Output:
o Pattern: an image that best fits (represents) the sampled instances of the similar objects
o Markup (labeling) of the image that indicates the positions of the patterns and outliers, together with their confidence levels
Limitations:
o Samples in the input image have limited pose variance, so their view angles are similar (e.g. front view only, no side or back view)
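The input and output described above can be illustrated with a minimal sketch. It assumes the grid cells have already been cropped to equal size and flattened to lists of grayscale intensities. It takes the pixelwise median as the pattern and the mean absolute deviation from it as an outlier score. The function names and the threshold are hypothetical, and the slides do not say which method is actually used.

```python
from statistics import median

def estimate_pattern(cells):
    """Pixelwise median over equally sized cells (flat lists of intensities)."""
    return [median(pixels) for pixels in zip(*cells)]

def deviation(cell, pattern):
    """Mean absolute difference between one cell and the pattern."""
    return sum(abs(a - b) for a, b in zip(cell, pattern)) / len(pattern)

def mark_outliers(cells, threshold):
    """Return the pattern and, per cell, (index, deviation, is_outlier)."""
    pattern = estimate_pattern(cells)
    markup = []
    for i, cell in enumerate(cells):
        d = deviation(cell, pattern)
        markup.append((i, d, d > threshold))
    return pattern, markup

# Hypothetical 2x2 cells: three similar objects and one that differs.
cells = [[10, 10, 200, 200]] * 3 + [[200, 200, 10, 10]]
pattern, markup = mark_outliers(cells, threshold=50)
```

The deviation doubles as a rough inverse confidence: the lower it is, the better the cell matches the pattern.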
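Labeling
Definitions
o $T$: set of objects
o $t \in T$: object
o $K$: set of labels
o $k_t \in K$: label for object $t$
o $k: T \to K$: labeling
o $K^T$: all possible labelings
o $tt' \in \mathcal{T}$: $t'$ is a neighbor of $t$ ($\mathcal{T} \subset T \times T$ is the set of neighboring pairs)
o $g_{tt'}(k_t, k_{t'})$: weight of an edge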
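To make these definitions concrete, here is a small sketch of how a labeling problem could be represented in Python. The toy instance (three objects in a chain, two labels, weight 1 when neighboring labels differ) is an assumption chosen for illustration, not something taken from the slides.

```python
from itertools import product

# Hypothetical toy instance: three objects in a chain, two labels.
T = ["a", "b", "c"]                    # set of objects
K = [0, 1]                             # set of labels
neighbors = [("a", "b"), ("b", "c")]   # neighboring pairs tt'

def g(t, t2, k, k2):
    """Edge weight g_{tt'}(k_t, k_t'); here 1 when neighboring labels differ."""
    return 1 if k != k2 else 0

# A labeling k: T -> K is a dict; K^T is the list of all such dicts.
all_labelings = [dict(zip(T, ks)) for ks in product(K, repeat=len(T))]
```

The number of labelings is |K|^|T|, so enumerating K^T like this is only practical for very small instances.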
The (OR, AND) problem
Find the valid labeling
$g_{tt'}(k_t, k_{t'}) \in \{0, 1\}$
$k^* = \bigvee_{k \in K^T} \bigwedge_{tt' \in \mathcal{T}} g_{tt'}(k_t, k_{t'})$
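A brute-force sketch of the (OR, AND) problem: a labeling is valid when every neighboring pair has weight 1, and the search returns the first valid labeling it meets. The function name and the toy instances are hypothetical, and exhaustive enumeration is used only for clarity.

```python
from itertools import product

def find_valid_labeling(T, K, neighbors, g):
    """Return the first labeling whose edges all have weight 1, else None."""
    for ks in product(K, repeat=len(T)):
        k = dict(zip(T, ks))
        if all(g(t, t2, k[t], k[t2]) == 1 for t, t2 in neighbors):
            return k
    return None

def differ(t, t2, k, k2):
    """Hypothetical weight: neighboring objects must get different labels."""
    return 1 if k != k2 else 0

chain = find_valid_labeling("abc", [0, 1], [("a", "b"), ("b", "c")], differ)
triangle = find_valid_labeling(
    "abc", [0, 1], [("a", "b"), ("b", "c"), ("a", "c")], differ)
```

With two labels the chain has a valid labeling, while the triangle has none, so the second call returns `None`.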
The (MAX, +) problem
Find a labeling with maximal sum
$g_{tt'}(k_t, k_{t'}) \in \mathbb{R}$
$k^* = \arg\max_{k \in K^T} \sum_{tt' \in \mathcal{T}} g_{tt'}(k_t, k_{t'})$
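The (MAX, +) problem can be sketched by exhaustive search: score each labeling by the sum of its edge weights and keep the best one. The per-edge weight tables below are made-up numbers for illustration, and `best_labeling` is a hypothetical helper name.

```python
from itertools import product

def best_labeling(T, K, neighbors, g):
    """Return the labeling that maximizes the sum of edge weights."""
    def score(k):
        return sum(g(t, t2, k[t], k[t2]) for t, t2 in neighbors)
    candidates = (dict(zip(T, ks)) for ks in product(K, repeat=len(T)))
    return max(candidates, key=score)

# Hypothetical real-valued weights, one 2x2 table per neighboring pair.
weights = {
    ("a", "b"): [[0.0, 2.0], [1.0, 0.0]],
    ("b", "c"): [[0.0, 1.0], [3.0, 0.0]],
}

def g(t, t2, k, k2):
    return weights[(t, t2)][k][k2]

k_star = best_labeling("abc", [0, 1], list(weights), g)
```

Here the best labeling picks the weight 2.0 on the first edge and 3.0 on the second, for a total of 5.0.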
The (OR, AND) problem: example
$k^* = \bigvee_{k \in K^T} \bigwedge_{tt' \in \mathcal{T}} g_{tt'}(k_t, k_{t'})$
The labels chosen in all neighboring objects should be connected by an edge
The (MIN, MAX) problem: clustering
$k^* = \min_{k \in K^T} \max_{tt' \in \mathcal{T}} g_{tt'}(k_t, k_{t'})$
The distance between the most different objects of the same cluster should be minimal
$T$ = objects
$K$ = clusters
$g_{tt'}(k_t, k_{t'})$ = distance between the objects
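A minimal sketch of clustering as a (MIN, MAX) labeling problem. It assumes the objects are points on a line, that every pair of objects is a neighboring pair, and that the weight is the distance when both objects share a cluster and 0 otherwise. These choices and the function name are illustrative assumptions.

```python
from itertools import combinations, product

def cluster_min_max(points, n_clusters):
    """Brute-force labeling that minimizes the largest within-cluster distance."""
    pairs = list(combinations(range(len(points)), 2))

    def g(t, t2, k, k2):
        # Distance counts only when both objects are in the same cluster.
        return abs(points[t] - points[t2]) if k == k2 else 0

    def cost(k):
        return max(g(t, t2, k[t], k[t2]) for t, t2 in pairs)

    labelings = product(range(n_clusters), repeat=len(points))
    return min(labelings, key=cost)

labels = cluster_min_max([0, 1, 10, 11], n_clusters=2)
```

For these four points the result groups 0 with 1 and 10 with 11, so the largest within-cluster distance is 1.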
The general labeling problem
$k^* = \bigoplus_{k \in K^T} \bigotimes_{tt' \in \mathcal{T}} g_{tt'}(k_t, k_{t'})$
$(\oplus, \otimes)$: problem
o (+, ×): Count of valid labelings
o (or, and): Valid labeling
o (max, …): Travelling salesman problem
o (min, …): Travelling salesman problem
o (max, …): Optimization on Gibbs field
o (min, …): Optimization on Gibbs field
o (max, min): Hamiltonian path
o (min, max): Clustering
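The general problem can be sketched as one brute-force routine that takes the two operations as parameters: the inner operation combines the edge weights of a single labeling, and the outer operation combines the results over all labelings. The routine returns the combined value rather than a labeling. The function name `solve` and the toy chain instance are assumptions for illustration.

```python
from functools import reduce
from itertools import product
import operator

def solve(T, K, neighbors, g, oplus, otimes):
    """Combine edge weights with otimes, then all labelings with oplus."""
    def value(k):
        return reduce(otimes, (g(t, t2, k[t], k[t2]) for t, t2 in neighbors))
    labelings = (dict(zip(T, ks)) for ks in product(K, repeat=len(T)))
    return reduce(oplus, (value(k) for k in labelings))

def differ(t, t2, k, k2):
    """Hypothetical 0/1 weight: 1 when neighboring labels differ."""
    return 1 if k != k2 else 0

args = ("abc", [0, 1], [("a", "b"), ("b", "c")], differ)
exists_valid = solve(*args, operator.or_, operator.and_)  # (or, and)
count_valid = solve(*args, operator.add, operator.mul)    # (+, x)
best_sum = solve(*args, max, operator.add)                # (max, +)
min_of_max = solve(*args, min, max)                       # (min, max)
```

Swapping the pair of operations changes the problem being solved while the enumeration code stays the same.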
(OR, AND) labeling problem with supermodular weights
$g(k_1, k'_1) \wedge g(k_2, k'_2) \ge g(k_1, k'_2) \wedge g(k_2, k'_1)$
$k_1 > k_2$, $k'_1 > k'_2$
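A small checker can test this kind of condition on a given 0/1 weight function. The sketch assumes the labels are totally ordered and reads the condition as: for every k1 > k2 and k1' > k2', the AND of the "parallel" weights is at least the AND of the "crossed" weights. The function name and the two example weights are hypothetical.

```python
from itertools import combinations

def is_supermodular_or_and(K, g):
    """Check g(k1,k1') & g(k2,k2') >= g(k1,k2') & g(k2,k1') for k1>k2, k1'>k2'."""
    ordered = [(hi, lo) for lo, hi in combinations(sorted(K), 2)]
    return all(
        (g(k1, k1p) & g(k2, k2p)) >= (g(k1, k2p) & g(k2, k1p))
        for k1, k2 in ordered
        for k1p, k2p in ordered
    )

same = is_supermodular_or_and([0, 1, 2], lambda k, kp: int(k == kp))
different = is_supermodular_or_and([0, 1, 2], lambda k, kp: int(k != kp))
```

The "labels must be equal" weight passes the check, while the "labels must differ" weight fails it.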
(MAX, MIN) labeling problem with supermodular weights
$k_1 > k_2$, $k'_1 > k'_2$
$\min\{g(k_1, k'_2), g(k_2, k'_1)\} \le \min\{g(k_1, k'_1), g(k_2, k'_2)\}$
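The same kind of check can be sketched for real-valued weights with `min` in place of AND. It assumes ordered labels and reads the condition as: for every k1 > k2 and k1' > k2', the minimum of the crossed weights does not exceed the minimum of the parallel weights. The function name and example weights are hypothetical.

```python
from itertools import combinations

def is_supermodular_max_min(K, g):
    """Check min(g(k1,k2'), g(k2,k1')) <= min(g(k1,k1'), g(k2,k2'))."""
    ordered = [(hi, lo) for lo, hi in combinations(sorted(K), 2)]
    return all(
        min(g(k1, k2p), g(k2, k1p)) <= min(g(k1, k1p), g(k2, k2p))
        for k1, k2 in ordered
        for k1p, k2p in ordered
    )

# Hypothetical weights: rewarding close labels vs. rewarding distant labels.
close = is_supermodular_max_min([0, 1, 2], lambda k, kp: -abs(k - kp))
distant = is_supermodular_max_min([0, 1, 2], lambda k, kp: abs(k - kp))
```

Weights that reward similar labels on neighboring objects satisfy the condition, while weights that reward dissimilar labels do not.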
(MIN, MAX) labeling problem: clustering
$k^* = \min_{k \in K^T} \max_{tt' \in \mathcal{T}} g_{tt'}(k_t, k_{t'})$
$g_{tt'}(k_t, k_{t'})$ = difference between objects of the same cluster
$T$ = objects
$K$ = clusters
The distance between the most different objects of the same cluster should be minimal