
Lecture 12

Introduction to
Convolutional Neural Networks
Part 1
STAT 453: Deep Learning, Spring 2020
Sebastian Raschka
http://stat.wisc.edu/~sraschka/teaching/stat453-ss2020/

https://github.com/rasbt/stat453-deep-learning-ss20/tree/master/L12-cnns

CNNs for Image Classification

[Figure: a CNN maps an input cat image to an output probability p(y = cat). Image sources: twitter.com and https://www.pinterest.com/pin/244742560974520446]

Object Detection

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition (pp. 779-788).

Object Segmentation

Figure 2. Mask R-CNN results on the COCO test set. These results are based on ResNet-101 [15], achieving a mask AP of 35.7 and
running at 5 fps. Masks are shown in color, and bounding box, category, and confidences are also shown.

He, Kaiming, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. "Mask R-CNN." In Proceedings of the IEEE International Conference on Computer Vision, pp. 2961-2969. 2017.
Face Recognition

[Figure: two face images, x^[1] and x^[2], are each passed through the network and compared via a similarity/distance score.]
Lecture Overview

Today:
• Image Classification
• Convolutional Neural Network Basics
• CNN Architectures
• What a CNN Can See
• CNNs in PyTorch

Next Lecture:
• Padding
• Dropout2d and BatchNorm2d
• CNNs on the GPU
• Common CNN Architectures in Detail
• Transfer Learning

Lecture Overview

1. Image Classification
2. Convolutional Neural Network Basics
3. CNN Architectures
4. What a CNN Can See
5. CNNs in PyTorch

Why Image Classification is Hard

Different lighting, contrast, viewpoints, etc.

Image sources: twitter.com and https://www.123rf.com/photo_76714328_side-view-of-tabby-cat-face-over-white.html

Or even simple translation. This is hard for traditional methods like multi-layer perceptrons, because the prediction is basically based on a sum of pixel intensities.

Traditional Approaches

a) Use hand-engineered features

Traditional Approaches

a) Use hand-engineered features

Sasaki, K., Hashimoto, M., & Nagata, N. (2016). Person Invariant Classification of Subtle Facial Expressions Using Coded Movement Direction of
Keypoints. In Video Analytics. Face and Facial Expression Recognition and Audience Measurement (pp. 61-72). Springer, Cham.

Traditional Approaches

b) Preprocess images (centering, cropping, etc.)

Image Source: https://www.tokkoro.com/2827328-cat-animals-nature-feline-park-green-trees-grass.html

Lecture Overview

1. Image Classification
2. Convolutional Neural Network Basics
3. CNN Architectures
4. What a CNN Can See
5. CNNs in PyTorch

Main Concepts Behind
Convolutional Neural Networks

• Sparse connectivity: A single element in the feature map is connected to only a small patch of pixels. (This is very different from connecting to the whole input image, as in multi-layer perceptrons.)

• Parameter sharing: The same weights are used for different patches of the input image.

• Many layers: Combining extracted local patterns into global patterns

Convolutional Neural Networks

Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel: Backpropagation Applied to
Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, Winter 1989.

Convolutional Neural Networks
[Figure: LeNet-5 architecture. INPUT 32x32 → (convolutions) C1: feature maps 6@28x28 → (subsampling) S2: f. maps 6@14x14 → (convolutions) C3: f. maps 16@10x10 → (subsampling) S4: f. maps 16@5x5 → (full connection) C5: layer 120 → (full connection) F6: layer 84 → (Gaussian connections) OUTPUT 10.]

Yann LeCun, Léon Bottou, Yoshua Bengio and Patrick Haffner: Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, 86(11):2278-2324, 1998.

Hidden Layers
[Figure: LeNet-5 architecture, as on the previous slide. The convolution/subsampling layers act as an "automatic feature extractor"; the fully connected layers form a "regular classifier".]

Hidden Layers
[Figure: LeNet-5 architecture, as on the previous slide.]

Each "bunch" of feature maps represents one hidden layer in the neural network. Counting the FC layers, this network has 5 layers.

Convolutional Neural Networks
[Figure: LeNet-5 architecture with annotations.]

• Notation such as "6@28x28" gives the number of feature detectors @ the size of the resulting layers.
• "Feature detectors" (weight matrices) are being reused across the image ("weight sharing"); they are also called "kernels" or "filters".
• Subsampling is nowadays called "pooling".
• C5/F6/OUTPUT form a multi-layer perceptron; the Gaussian-connection output is basically a fully-connected layer + MSE loss (nowadays it is better to use an fc layer + softmax + cross entropy).

Yann LeCun, Léon Bottou, Yoshua Bengio and Patrick Haffner: Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, 86(11):2278-2324, 1998.

Weight Sharing
A "feature detector" (filter, kernel) slides over the inputs to generate
a feature map

\[ \sum_{j=1}^{9} w_j x_j \]

The pixels the kernel currently covers are referred to as the "receptive field"; the resulting output is called the "feature map".

Rationale: A feature detector that works well in one region may also work well in another region. Plus, it is a nice reduction in the number of parameters to fit.

Multiple "feature detectors" (kernels) are used to create multiple feature maps:

\[ \sum_{j=1}^{9} w_j^{(@1)} x_j, \qquad \sum_{j=1}^{9} w_j^{(@2)} x_j, \qquad \sum_{j=1}^{9} w_j^{(@3)} x_j \]
Size Before and After Convolutions

Feature map size:

\[ O = \frac{W - K + 2P}{S} + 1 \]

where \(O\) = output width, \(W\) = input width, \(K\) = kernel width, \(P\) = padding, and \(S\) = stride.
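As a quick sanity check, the formula is easy to evaluate in code. A minimal sketch (the function name is ours):

```python
def conv_output_size(W, K, P=0, S=1):
    """Feature map width: O = (W - K + 2P) / S + 1."""
    return (W - K + 2 * P) // S + 1

# LeNet-5's first layer: 32x32 input, 5x5 kernel, no padding, stride 1
print(conv_output_size(32, 5))  # -> 28
```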

Kernel Dimensions and Trainable Parameters

For a grayscale image with a 5x5 feature detector (kernel), we have the following dimensions (number of parameters to learn).

What do you think is the output size for this 28x28 image?
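Applying the formula from the previous slide, and assuming stride 1 and no padding: O = (28 - 5 + 0)/1 + 1 = 24, i.e., a 24x24 feature map. The 5x5 kernel itself has 5 · 5 = 25 weights, plus one bias term if a bias is used, i.e., 26 trainable parameters per feature map.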

Cross-Correlation vs Convolution

Deep Learning Jargon: convolution in DL is actually cross-correlation.

Cross-correlation is our sliding dot product over the image; the result \(Z[i, j]\) is the "feature map".

Cross-Correlation vs Convolution

Cross-correlation:

\[ Z[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} K[u, v] \, A[i+u, j+v], \qquad Z = K \otimes A \]

where \(Z\) is the "feature map".

Cross-Correlation vs Convolution
Cross-correlation:

\[ Z[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} K[u, v] \, A[i+u, j+v], \qquad Z = K \otimes A \]

The looping direction over the kernel indices (indicated in red in the slide figure):

1) (-1,-1)   2) (-1,0)   3) (-1,1)
4) (0,-1)    5) (0,0)    6) (0,1)
7) (1,-1)    8) (1,0)    9) (1,1)

Cross-Correlation vs Convolution
Cross-correlation:

\[ Z[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} K[u, v] \, A[i+u, j+v], \qquad Z = K \otimes A \]

Convolution:

\[ Z[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} K[u, v] \, A[i-u, j-v], \qquad Z = K * A \]

Basically, we are flipping the kernel (or the receptive field) horizontally and vertically; the looping direction over the kernel indices is reversed:

9) (-1,-1)   8) (-1,0)   7) (-1,1)
6) (0,-1)    5) (0,0)    4) (0,1)
3) (1,-1)    2) (1,0)    1) (1,1)
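To make the distinction concrete, here is a minimal NumPy sketch (function names are ours); note that for a symmetric kernel the two operations coincide:

```python
import numpy as np

def cross_correlate2d(A, K):
    """Valid-mode sliding dot product (what DL frameworks call "convolution")."""
    kh, kw = K.shape
    Z = np.zeros((A.shape[0] - kh + 1, A.shape[1] - kw + 1))
    for i in range(Z.shape[0]):
        for j in range(Z.shape[1]):
            Z[i, j] = np.sum(K * A[i:i + kh, j:j + kw])
    return Z

def convolve2d(A, K):
    """True convolution: cross-correlation with a horizontally and vertically flipped kernel."""
    return cross_correlate2d(A, np.flip(K))

A = np.arange(16.0).reshape(4, 4)
K = np.array([[1.0, 0.0], [0.0, -1.0]])
print(cross_correlate2d(A, K))  # differs from convolve2d(A, K) for this asymmetric kernel
print(convolve2d(A, K))
```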
Cross-Correlation vs Convolution

Deep Learning Jargon: convolution in DL is actually cross-correlation.

"Real" convolution has the nice associative property:

\[ (A * B) * C = A * (B * C) \]

In DL, we usually don't care about that (as opposed to many traditional computer vision and signal processing applications). Also, cross-correlation is easier to implement.

Maybe the term "convolution" for cross-correlation became popular because "Cross-Correlational Neural Network" sounds weird ;)

Backpropagation in CNNs

Same overall concept as before: multivariable chain rule, but now with an additional weight-sharing constraint.

Remember Lecture 6? Graph with Weight Sharing

[Figure: computation graph. The input \(x_1\) is multiplied by \(w_1\) to give \(z_1 = w_1 \cdot x_1\); two activations \(a_1 = \sigma_1(z_1)\) and \(a_2 = \sigma_2(z_1)\) both feed into \(o = \sigma_3(a_1, a_2)\), and the loss is \(l = L(y, o)\).]

\[
\frac{\partial l}{\partial w_1}
= \underbrace{\frac{\partial l}{\partial o} \cdot \frac{\partial o}{\partial a_1} \cdot \frac{\partial a_1}{\partial w_1}}_{\text{upper path}}
+ \underbrace{\frac{\partial l}{\partial o} \cdot \frac{\partial o}{\partial a_2} \cdot \frac{\partial a_2}{\partial w_1}}_{\text{lower path}}
\qquad \text{(multivariable chain rule)}
\]

Backpropagation in CNNs
Same overall concept as before: Multivariable chain rule,
but now with an additional weight sharing constraint

Due to weight sharing: \( w_1 = w_2 \)

Optional averaging weight update:

\[ w_1 := w_2 := w_1 - \eta \cdot \frac{1}{2} \left( \frac{\partial L}{\partial w_1} + \frac{\partial L}{\partial w_2} \right) \]
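In modern frameworks this bookkeeping happens automatically: when the same weight tensor is used in several places, autograd sums the per-use gradients. A tiny PyTorch sketch (the values are ours):

```python
import torch

x1, x2 = torch.tensor(2.0), torch.tensor(3.0)
w = torch.tensor(0.5, requires_grad=True)  # one weight, used twice (shared)

loss = (w * x1 + w * x2) ** 2
loss.backward()

# Autograd accumulates the gradient contributions from both uses of w:
# d(loss)/dw = 2 * (w*x1 + w*x2) * (x1 + x2) = 2 * 2.5 * 5 = 25
print(w.grad)  # tensor(25.)
```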
CNNs and Translation/Rotation/Scale Invariance
Note that CNNs are not really invariant to scale, rotation, translation, etc.: the activations are still dependent on the location.

Pooling Layers Can Help With Local Invariance

Sebastian Raschka, Vahid Mirjalili. Python Machine Learning. 3rd Edition. Birmingham, UK: Packt Publishing, 2019. ISBN: 978-1789955750

Downside: Information is lost. This may not matter for classification, but it can for applications where relative position is important (like face recognition).

In practice for CNNs: some image preprocessing is still recommended.

Pooling Layers Can Help With Local Invariance

Note that typical pooling layers do not have any learnable parameters
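A small PyTorch sketch of this local invariance (the example values are ours): a one-pixel shift that stays within each 2x2 pooling window leaves the max-pooled output unchanged, while larger shifts generally do not.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1., 0., 2., 0.],
                  [0., 0., 0., 0.],
                  [3., 0., 4., 0.],
                  [0., 0., 0., 0.]]).reshape(1, 1, 4, 4)

# Shift the image one pixel to the right (zero-filled, no wrap-around)
x_shifted = F.pad(x, (1, 0))[:, :, :, :-1]

print(F.max_pool2d(x, kernel_size=2))          # [[1., 2.], [3., 4.]]
print(F.max_pool2d(x_shifted, kernel_size=2))  # identical output here
```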


Lecture Overview

1. Image Classification
2. Convolutional Neural Network Basics
3. CNN Architectures
4. What a CNN Can See
5. CNNs in PyTorch

Main Breakthrough for CNNs:
AlexNet & ImageNet

Figure 2: An illustration of the architecture of our CNN, explicitly showing the delineation of responsibilities
between the two GPUs. One GPU runs the layer-parts at the top of the figure while the other runs the layer-parts
at the bottom. The GPUs communicate only at certain layers. The network’s input is 150,528-dimensional, and
the number of neurons in the network’s remaining layers is given by 253,440–186,624–64,896–64,896–43,264–
4096–4096–1000.

"The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5 × 5 × 48. The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling or normalization layers. The third convolutional layer has 384 kernels of size 3 × 3 × 256 connected to the (normalized, pooled) outputs of the second convolutional layer. The fourth convolutional layer has 384 kernels of size 3 × 3 × 192, and the fifth convolutional layer has 256 kernels of size 3 × 3 × 192. The fully-connected layers have 4096 neurons each."

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
Main Breakthrough for CNNs:
AlexNet & ImageNet

[Paper figures: "Figure 3: 96 convolutional kernels of size 11×11×3 learned by the first convolutional layer on the 224×224×3 input images. The top 48 kernels were learned on GPU 1 while the bottom 48 kernels were learned on GPU 2. See Section 6.1 for details." and "Figure 4: (Left) Eight ILSVRC-2010 test images and the five labels considered most probable by our model. The correct label is written under each image, and the probability assigned to the correct label is also shown with a red bar (if it happens to be in the top 5). (Right) Five ILSVRC-2010 test images in the first column. The remaining columns show the six training images that produce feature vectors in the last hidden layer with the smallest Euclidean distance from the feature vector for the test image."]

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
Main Breakthrough for CNNs:
AlexNet & ImageNet
The ImageNet set that was used has ~1.2 million images and 1000 classes.

Accuracy is measured as top-5 performance: a prediction counts as correct if the true label matches one of the model's top 5 predictions.
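The top-5 criterion is straightforward to express in code. A minimal sketch (the function name is ours):

```python
import torch

def top5_correct(logits, y):
    """True where the correct label is among the five highest-scoring classes."""
    top5 = torch.topk(logits, k=5, dim=1).indices
    return (top5 == y.unsqueeze(1)).any(dim=1)

logits = torch.randn(8, 1000)                  # batch of 8, 1000 ImageNet classes
y = torch.randint(0, 1000, (8,))
print(top5_correct(logits, y).float().mean())  # top-5 accuracy for the batch
```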

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
hid
Main Breakthrough for CNNs:
AlexNet & ImageNet

Note that the actual network inputs were still 224x224 images (random crops from downsampled 256x256 images).

224x224 is still a good/reasonable size today (224 · 224 · 3 = 150,528 features).

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
hid
Common CNN Architectures

Figure 1: Top1 vs. network. Single-crop top-1 validation accuracies for top scoring single-model architectures. We introduce with this chart our choice of colour scheme, which will be used throughout this publication to distinguish effectively different architectures and their correspondent authors. Notice that networks of the same group share the same hue, for example ResNet are all variations of pink.

Figure 2: Top1 vs. operations, size / parameters. Top-1 one-crop accuracy versus amount of operations required for a single forward pass. The size of the blobs is proportional to the number of network parameters; a legend is reported in the bottom right corner, spanning from 5×10^6 to 155×10^6 params. Both these figures share the same y-axis, and the grey dots highlight the centre of the blobs.

Canziani, A., Paszke, A., & Culurciello, E. (2016). An analysis of deep neural network models for practical applications. arXiv preprint arXiv:1605.07678.
Convolutions with Color Channels

Sebastian Raschka, Vahid Mirjalili. Python Machine Learning. 3rd Edition. Birmingham, UK: Packt Publishing, 2019. ISBN: 978-1789955750

Image dimension: \( X \in \mathbb{R}^{n_1 \times n_2 \times c_{\text{in}}} \) in NHWC format; CUDA & PyTorch use NCHW.
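A minimal PyTorch sketch of the NCHW convention (the numbers are arbitrary):

```python
import torch
import torch.nn as nn

# PyTorch expects (batch, channels, height, width), i.e., NCHW
x = torch.randn(8, 3, 32, 32)  # 8 RGB images of size 32x32

conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
print(conv(x).shape)           # torch.Size([8, 6, 28, 28])
```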

Lecture Overview

1. Image Classification
2. Convolutional Neural Network Basics
3. CNN Architectures
4. What a CNN Can See
5. CNNs in PyTorch

What a CNN Can See
Simple example: vertical edge detector

(From classical computer vision research)
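A minimal sketch of such a hand-crafted detector (the toy image is ours), using SciPy's cross-correlation:

```python
import numpy as np
from scipy.signal import correlate2d

# A classic vertical edge detector from computer vision (Sobel kernel)
K = np.array([[-1., 0., 1.],
              [-2., 0., 2.],
              [-1., 0., 1.]])

# Toy image: dark left half, bright right half (a vertical edge in the middle)
A = np.hstack([np.zeros((5, 3)), np.ones((5, 3))])

# Responses are large where the kernel straddles the edge, zero elsewhere
print(correlate2d(A, K, mode="valid"))
```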

What a CNN Can See
Simple example: horizontal edge detector

A CNN can learn whatever it finds best based on optimizing the objective (e.g., minimizing a particular loss to achieve good classification accuracy).

What a CNN Can See
Which patterns from the training set activate the feature map?

[Figure: feature evolution panels for Layer 1 through Layer 5.]

Fig. 4. Evolution of a randomly chosen subset of model features through training. Each layer's features are displayed in a different block. Within each block, we show a randomly chosen subset of features at epochs [1,2,5,10,20,30,40,64]. The visualization shows the strongest activation (across all training examples) for a given feature map, projected down to pixel space using our deconvnet approach. Color contrast is artificially enhanced and the figure is best viewed in electronic form.
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833). Springer, Cham.

Method: backpropagate strong activation signals in hidden layers to the input images, then apply "unpooling" to map the values to the original pixel space for visualization.
What a CNN Can See
Which patterns from the training set activate the feature map?
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833). Springer, Cham.
R. Ferguson computer vision (pp. 818-833). Springer, Cham.

[Figure: visualizations of Layer 1 and Layer 2 features with corresponding training image patches.]

What a CNN Can See
Which patterns from the training set activate the feature map?
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional
networks. In European conference on computer vision (pp. 818-833). Springer, Cham.
[Figure: visualizations of Layer 2 and Layer 3 features with corresponding training image patches.]

What a CNN Can See
Which patterns from the training set activate the feature map?
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833). Springer, Cham.

[Figure: visualizations of Layer 3, Layer 4, and Layer 5 features. "Fig. 2. Visualization of features in a fully trained model. For layers 2-5 we show the top [...]"]
Lecture Overview

1. Image Classification
2. Convolutional Neural Network Basics
3. CNN Architectures
4. What a CNN Can See
5. CNNs in PyTorch

LeNet-5 in PyTorch
[Figure: LeNet-5 architecture, as shown earlier.]

https://github.com/rasbt/stat453-deep-learning-ss20/tree/master/L12-cnns/code
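The linked repository contains the full implementation; the following is a minimal sketch of a LeNet-5-style model in PyTorch (using an fc layer + softmax/cross-entropy output instead of the original Gaussian connections, as discussed earlier):

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # "Automatic feature extractor": convolutions + pooling (subsampling)
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32 -> 6@28x28
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2),      # -> 6@14x14
            nn.Conv2d(6, 16, kernel_size=5),  # -> 16@10x10
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2),      # -> 16@5x5
        )
        # "Regular classifier": fully connected layers
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),  # logits; softmax lives in the loss
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)
        return self.classifier(x)

model = LeNet5()
print(model(torch.randn(4, 1, 32, 32)).shape)  # torch.Size([4, 10])
```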

Cats and Dogs Classifier (VGG16)

https://github.com/rasbt/stat453-deep-learning-ss20/tree/master/L12-cnns/code

Test accuracy: 88.28%

Cats and Dogs Classifier (VGG16)
and Guided Backpropagation
Visualization of the loss gradients with respect
to the inputs (images) as a naive way to
visualize CNN predictions.
\[ \nabla_{\mathbf{x}} L = \begin{bmatrix} \frac{\partial L}{\partial x_1} \\ \frac{\partial L}{\partial x_2} \\ \vdots \end{bmatrix} \]
• In a normal forward pass, negative activation values are clamped by ReLU functions (the gradient is 0 for these).

• In guided backpropagation, we also clamp the negative gradients during backpropagation to 0.

• The focus is on those activations that have a positive influence on the class of interest (see the sketch below).

https://github.com/rasbt/stat453-deep-learning-ss20/tree/master/L12-cnns/code
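A minimal sketch of both ideas in PyTorch (the tiny model here is a hypothetical placeholder, not the VGG16 classifier from the linked notebook):

```python
import torch
import torch.nn as nn

# Hypothetical 2-class model standing in for a real cats-vs-dogs classifier
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU(),
                      nn.Flatten(), nn.LazyLinear(2))

# Naive visualization: gradient of the class score w.r.t. the input image
x = torch.randn(1, 3, 64, 64, requires_grad=True)
model(x)[0, 1].backward()      # class of interest: index 1
saliency = x.grad.abs()        # can be rendered as an image

# Guided backpropagation: additionally clamp negative gradients at every ReLU
def guided_relu_hook(module, grad_input, grad_output):
    return (torch.clamp(grad_input[0], min=0.0),)

for m in model.modules():
    if isinstance(m, nn.ReLU):
        m.register_full_backward_hook(guided_relu_hook)
# Re-running the backward pass now yields the guided gradients.
```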

Optional Reading Material

http://www.deeplearningbook.org/contents/convnets.html

https://twitter.com/boredyannlecun/status/1237460174811602946?s=20

