Images, Neural Networks, CNNs
Convolutional neural networks
Computer vision
From picture to pixels
Convolutional neural networks
Convolutional neural network (CNN, ConvNet)
● Dense or fully-connected: each neuron connected to all neurons in previous layer
● CNN: only connected to a small “local” set of neurons
● Radically reduces number of network connections
[Figure: dense layer vs. convolutional layer]
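To make the reduction concrete, here is a rough count. The sizes (a 28✕28 grid of neurons and a 3✕3 local neighbourhood) are illustrative assumptions, not values from the slides.

```python
# Illustrative sizes only (assumptions, not taken from the slides).
width, height = 28, 28          # neurons in the previous layer, arranged as a grid
n_in = width * height           # 784 input neurons
n_out = n_in                    # same number of output neurons

# Dense: every output neuron is connected to every input neuron.
dense_connections = n_in * n_out

# Convolutional: every output neuron sees only a 3x3 local patch
# (border neurons are assumed to see a full zero-padded patch).
kernel = 3
conv_connections = n_out * kernel * kernel

# With shared weights, only one 3x3 kernel has to be learned.
conv_weights = kernel * kernel

print(dense_connections, conv_connections, conv_weights)   # 614656 7056 9
```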
Convolution for image data
● Image represented as 2D grid of values
● Each output neuron connected to a small 3✕3 area of the image
● Weights stay the same at every position (shared weights)
● Border effect: without padding output area is smaller
● Outputs form a “feature map”
[Figure: 3✕3 image area, 3✕3 weights (conv. kernel), output neuron, feature map]
Image source: https://mlnotebook.github.io/post/CNN1/
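The operation described on this slide can be sketched in NumPy. This is a minimal, unoptimized illustration: the 5✕5 toy image and the averaging kernel are assumptions. As in deep learning frameworks, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding; return the feature map."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * kernel)   # same weights at every position
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.ones((3, 3)) / 9.0                     # 3x3 averaging kernel
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)   # (3, 3): border effect, 5 - 3 + 1 = 3
```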
A real example
Convolution for image data
● We can repeat for different sets of weights (kernels)
● Each learns a different “feature”
● Typically: edges, corners, etc.
● Each outputs a feature map
[Figure: image 256✕256✕3; K kernels, each 5✕5(✕3); K feature maps, each 252✕252✕1]
Convolution for image data
[Figure: output tensor 252✕252✕K]
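The shapes on these slides can be checked with a short NumPy sketch. K = 4 is an arbitrary choice, and the random kernels stand in for learned weights. It assumes NumPy 1.20 or newer for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
K = 4                                          # number of kernels (illustrative choice)
image = rng.random((256, 256, 3))              # 256x256 image, 3 color channels
kernels = rng.standard_normal((K, 5, 5, 3))    # K kernels, each 5x5(x3)

# All 5x5 patches of the image: shape (252, 252, 3, 5, 5).
patches = sliding_window_view(image, (5, 5), axis=(0, 1))

# Weighted sum of each patch with each kernel -> one feature map per kernel.
output = np.einsum("yxcij,kijc->yxk", patches, kernels)
print(output.shape)   # (252, 252, 4)
```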
Convolution in layers: intuition
● We can then add another convolutional layer
● This operates on the previous layer’s output tensor (feature maps)
● Features layered from simple to more complex
[Figure: learned low-level features → learned mid-level features → learned high-level features → learned classifier → “cat”]
Image from lecture by Yann Le Cun, original from Zeiler & Fergus (2013)
Image datasets
• Color image mini-batches are 4D tensors: width ✕ height ✕ color channels ✕ samples
• Plenty of big datasets for training exist, e.g., ImageNet with 1.2 million images in 1000 classes
• Data augmentation for small datasets: generate more training data by transforming existing data
• E.g., shifting, rotation, cropping, scaling, adding noise, etc.
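A few of these transformations can be sketched in plain NumPy. The image size, shift range, and noise level are assumptions, and rotation is limited to multiples of 90 degrees to keep the example short. The batch is stored samples-first, which is the default layout in tf.keras.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))   # stand-in for one training image, values in [0, 1]

def augment(img, rng):
    """Return a randomly transformed copy of `img` with the same shape."""
    out = img
    # Shifting: move the image by up to 4 pixels along each axis (wraps around).
    dy, dx = rng.integers(-4, 5, size=2)
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))
    # Rotation: restricted here to multiples of 90 degrees.
    out = np.rot90(out, k=rng.integers(0, 4), axes=(0, 1))
    # Adding noise: small Gaussian noise, clipped back to the valid range.
    out = np.clip(out + rng.normal(0.0, 0.05, size=out.shape), 0.0, 1.0)
    return out

# Eight augmented variants of the same image, stacked into one mini-batch.
augmented = np.stack([augment(image, rng) for _ in range(8)])
print(augmented.shape)   # (8, 32, 32, 3)
```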
Convolutional layers
• Input: tensor of size N × Wi × Hi × Ci
• Hyperparameters:
  • K: number of filters
  • w, h: kernel size
  • padding: how to handle image borders
  • activation function
• Output: tensor of size N × Wo × Ho × K
• In tf.keras:
  layers.Conv2D(filters, kernel_size, padding=padding, activation=activation)
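How the hyperparameters determine the output size and the number of trainable weights can be sketched with two small helper functions. Both helpers are hypothetical and written for illustration; they are not part of tf.keras. The names "valid" and "same" follow the padding options of `Conv2D`.

```python
def conv_output_size(size_in, kernel, padding="valid", stride=1):
    """Output width or height of a convolution along one axis."""
    if padding == "valid":      # no padding: the kernel must fit entirely inside
        return (size_in - kernel) // stride + 1
    if padding == "same":       # zero-pad so that, with stride 1, size is preserved
        return -(-size_in // stride)   # ceiling division
    raise ValueError(f"unknown padding: {padding}")

def conv_param_count(kernel_w, kernel_h, channels_in, filters):
    """Trainable weights of one convolutional layer, with one bias per filter."""
    return (kernel_w * kernel_h * channels_in + 1) * filters

print(conv_output_size(256, 5, "valid"))   # 252
print(conv_output_size(256, 5, "same"))    # 256
print(conv_param_count(5, 5, 3, 8))        # 608
```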
Pooling layers
Image from http://cs231n.github.io/convolutional-networks/
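The slide's figure is not reproduced here. As a minimal sketch, assuming the common case of 2✕2 max pooling with stride 2, pooling keeps only the largest value in each block of a feature map. In tf.keras the corresponding layer is `layers.MaxPooling2D`.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a (height, width) feature map."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Group pixels into non-overlapping 2x2 blocks, then take each block's maximum.
    blocks = feature_map[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([[1, 1, 2, 4],
               [5, 6, 7, 8],
               [3, 2, 1, 0],
               [1, 2, 3, 4]])
pooled = max_pool_2x2(fm)
print(pooled)
# [[6 8]
#  [3 4]]
```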
Other layers
• Flatten
  • flattens the input into a vector (typically before dense layers)
• Dropout
  • used in the same way as with dense layers
• In tf.keras:
  layers.Flatten()
  layers.Dropout(rate)
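What these two layers do to a tensor can be sketched in NumPy. The tensor shape and the dropout rate are illustrative assumptions. The rescaling by 1 / (1 - rate) matches how dropout is applied at training time in tf.keras.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((2, 4, 4, 8))     # N x W x H x K output of a conv/pooling layer

# Flatten: keep the sample axis, merge all others into one vector per sample.
flat = features.reshape(features.shape[0], -1)
print(flat.shape)   # (2, 128)

# Dropout (training time only): zero a random fraction `rate` of the values
# and rescale the rest so the expected sum stays the same.
rate = 0.5
mask = rng.random(flat.shape) >= rate
dropped = flat * mask / (1.0 - rate)
```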
Typical architecture
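The slide's diagram is not reproduced here. One common arrangement of the layer types introduced so far is alternating convolution and pooling, followed by flatten, dropout, and a dense classifier. The sketch below builds it in tf.keras. The input size, filter counts, and number of classes are assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical task: 32x32 color images, 10 classes.
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),                 # 8 x 8 x 32 feature maps -> 2048-vector
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])
model.summary()
```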
AlexNet
VGG
Inception / GoogLeNet
ResNet
DenseNet
Large-scale CNNs with pre-trained weights
[Figure: extracted features; replace output layer; retrain]
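The idea in the figure (reuse the extracted features, replace the output layer, retrain) could look roughly as follows in tf.keras. VGG16 is picked only because VGG is named in these slides. The input size and the five-class task are hypothetical. `weights=None` is used so the sketch runs without a download; in practice one would pass `weights="imagenet"` to load the pre-trained weights.

```python
from tensorflow import keras
from tensorflow.keras import layers

# weights=None keeps this sketch offline; in practice pass weights="imagenet"
# to download and load the pre-trained weights.
base = keras.applications.VGG16(weights=None, include_top=False,
                                input_shape=(64, 64, 3))
base.trainable = False                # freeze the convolutional feature extractor

num_classes = 5                       # hypothetical new task
model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    base,                             # extracted features
    layers.Flatten(),
    layers.Dense(num_classes, activation="softmax"),   # replaced output layer
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(x_train, y_train) would now retrain only the new output layer.
```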