Lecture 14 Autoencoders
Introduction
• An autoencoder is an unsupervised artificial neural network that learns to
compress and encode data efficiently, and then learns to reconstruct the
data from the reduced encoded representation so that the result is as close
to the original input as possible.
• By design, an autoencoder reduces data dimensionality by learning to
ignore noise in the data.
• Here is an example of the input/output image from the MNIST dataset to an
autoencoder.
Autoencoder
• An autoencoder is an unsupervised machine learning algorithm that takes
an image as input and tries to reconstruct it from a smaller number of bits
taken from the bottleneck, also known as the latent space.
• The image is compressed most heavily at the bottleneck.
• Compression in autoencoders is achieved by training the network over
time: as it learns, it tries to represent the input image as well as possible
at the bottleneck.
• General image compression algorithms such as JPEG (lossy) and lossless
JPEG compress images without any kind of training and do fairly well.
• Autoencoders are similar to dimensionality reduction techniques like
Principal Component Analysis (PCA).
• Both project the data from a higher dimension to a lower dimension and
try to preserve the important features of the data while removing the
non-essential parts.
• However, the major difference between autoencoders and PCA lies in the
transformation part: as you already read, PCA uses linear transformation
whereas autoencoders use non-linear transformations.
• Now that you have a bit of understanding about autoencoders, let's now
break this term and try to get some intuition about it!
• The above figure is a two-layer vanilla autoencoder with one hidden layer.
• In deep learning terminology, you will often notice that the input layer is
never taken into account while counting the total number of layers in an
architecture.
• The total number of layers in an architecture comprises only the hidden
layers and the output layer.
• As shown in the image above, the input and output layers have the same
number of neurons.
• Let's take an example. You feed an image with just five pixel values into
the autoencoder which is compressed by the encoder into three pixel
values at the bottleneck (middle layer) or latent space.
• Using these three values, the decoder tries to reconstruct the five pixel
values or rather the input image which you fed as an input to the network.
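• The 5 → 3 → 5 example above can be sketched numerically. This is a minimal illustration with made-up random weights, not a trained network; NumPy and all the sizes are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.array([0.2, 0.8, 0.5, 0.1, 0.9])      # five "pixel" values (made up)

# Encoder: compresses 5 values down to 3 at the bottleneck (latent space)
W_enc = rng.normal(size=(3, 5))
b_enc = np.zeros(3)
h = 1 / (1 + np.exp(-(W_enc @ x + b_enc)))   # sigmoid activation, shape (3,)

# Decoder: reconstructs 5 values from the 3 bottleneck values
W_dec = rng.normal(size=(5, 3))
b_dec = np.zeros(5)
x_hat = 1 / (1 + np.exp(-(W_dec @ h + b_dec)))  # reconstruction, shape (5,)
```

With random weights the reconstruction is of course poor; training (described below) is what makes x_hat approach x.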
Autoencoder Components
• Autoencoders consist of 4 main parts:
– Encoder: In which the model learns how to reduce the input
dimensions and compress the input data into an encoded
representation.
– Bottleneck: the layer that contains the compressed
representation of the input data. This is the lowest-dimensional
representation of the input data.
– Decoder: In which the model learns how to reconstruct the data from
the encoded representation to be as close to the original input as
possible.
– Reconstruction Loss: This is the method that measures how well the
decoder is performing and how close the output is to the original
input.
• The training then involves using back propagation in order to minimize the
network’s reconstruction loss.
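• The four components above can be mapped onto a short sketch. The layer sizes, weight initialisation, and choice of mean squared error as the reconstruction loss are illustrative assumptions (NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Encoder: reduces the input dimensions into an encoded representation
W_enc, b_enc = rng.normal(size=(3, 5)), np.zeros(3)
def encode(x):
    return sigmoid(W_enc @ x + b_enc)        # Bottleneck: 3 values

# Decoder: reconstructs the data from the encoded representation
W_dec, b_dec = rng.normal(size=(5, 3)), np.zeros(5)
def decode(h):
    return sigmoid(W_dec @ h + b_dec)

# Reconstruction loss: here mean squared error between input and output
def reconstruction_loss(x, x_hat):
    return np.mean((x - x_hat) ** 2)

x = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
loss = reconstruction_loss(x, decode(encode(x)))
```

Backpropagation then adjusts W_enc, b_enc, W_dec, and b_dec to drive this loss down.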
How do Autoencoders work?
• We take the input and encode it to obtain the latent feature
representation, then decode that representation to recreate the input.
• We calculate the loss by comparing the input and output.
• To reduce the reconstruction error we back propagate and update the
weights.
• Weight is updated based on how much they are responsible for the error.
• In our example, we have taken the dataset for products bought by
customers.
• Step 1: Take the first row from the customer data for all products bought
in an array as the input. 1 represent that the customer bought the
product. 0 represents that the customer did not buy the product.
• Step 2: Encode the input into another vector h. h is a lower-dimensional
vector than the input. We can use the sigmoid activation function for h
since it ranges from 0 to 1. W is the weight matrix applied to the input
and b is the bias term.
h=f(Wx+b)
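• A concrete instance of h = f(Wx + b) with f the sigmoid. The numbers below are arbitrary, chosen only to show the shapes and the 0-to-1 range of h:

```python
import numpy as np

x = np.array([1.0, 0.0, 1.0, 1.0])        # products bought (1) / not bought (0)
W = np.array([[ 0.5, -0.2, 0.1,  0.4],
              [-0.3,  0.8, 0.2, -0.1]])   # 2 x 4: compresses 4 inputs to 2
b = np.array([0.0, 0.1])

h = 1 / (1 + np.exp(-(W @ x + b)))        # sigmoid keeps each entry in (0, 1)
print(h)                                  # ≈ [0.731, 0.475]
```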
• Step 3: Decode the vector h to recreate the input. Output will be of same
dimension as the input.
• Step 4: Calculate the reconstruction error L. The reconstruction error
measures the difference between the input and output vectors. Our goal is
to minimize the reconstruction error so that the output is similar to the
input vector.
• Reconstruction error = input vector - output vector
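• In practice the elementwise difference is usually reduced to a single scalar loss; mean squared error is one common choice (an assumption here, since the slide only says "difference"):

```python
import numpy as np

x     = np.array([1.0, 0.0, 1.0, 1.0, 0.0])  # input vector
x_hat = np.array([0.9, 0.2, 0.8, 0.7, 0.1])  # decoder output (made up)

error = x - x_hat                  # elementwise reconstruction error
mse   = np.mean(error ** 2)        # scalar loss to minimise
print(mse)                         # ≈ 0.038
```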
• Step 5: Back propagate the error from output layer to the input layer to
update the weights. Weights are updated based on how much they were
responsible for the error.
• Learning rate decides by how much we update the weights.
• Step 6: Repeat steps 1 through 5 for each of the observations in the
dataset. Weights are updated after each observation.
• Step 7: Repeat for more epochs. An epoch is complete when all the rows
in the dataset have passed through the neural network.
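• Steps 1 through 7 can be sketched end to end on a toy customer/product dataset. The data, layer sizes, learning rate, and use of mean squared error are all illustrative assumptions; the gradients are the standard ones for a one-hidden-layer sigmoid network:

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy data: rows are customers, 1 = bought the product, 0 = did not (made up)
X = np.array([[1, 0, 1, 1, 0],
              [0, 1, 0, 0, 1],
              [1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1]], dtype=float)

n_in, n_hidden = 5, 3
W1, b1 = rng.normal(scale=0.5, size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(scale=0.5, size=(n_in, n_hidden)), np.zeros(n_in)
lr = 0.5                                   # learning rate: size of each update

def epoch_loss():
    return np.mean([np.mean((sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2) - x) ** 2)
                    for x in X])

loss_before = epoch_loss()
for epoch in range(500):                   # Step 7: repeat for several epochs
    for x in X:                            # Steps 1 & 6: one row at a time
        h = sigmoid(W1 @ x + b1)           # Step 2: encode
        y = sigmoid(W2 @ h + b2)           # Step 3: decode
        # Step 4: gradient of the MSE reconstruction error at each layer
        delta2 = 2 * (y - x) / n_in * y * (1 - y)
        delta1 = (W2.T @ delta2) * h * (1 - h)
        # Step 5: backpropagate -- update each weight by its share of the error
        W2 -= lr * np.outer(delta2, h); b2 -= lr * delta2
        W1 -= lr * np.outer(delta1, x); b1 -= lr * delta1
loss_after = epoch_loss()
```

After training, the reconstruction loss over the dataset is lower than before, which is exactly what the backpropagation steps are meant to achieve.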
Where are Autoencoders used?
Example: Image denoising
• Now that the model is trained, let's test it by encoding and decoding
images from the test set.
• Plot both the noisy images and the denoised images produced by the
autoencoder.
Example: Anomaly detection
• In this example, you will train an autoencoder to detect anomalies.
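• The idea behind anomaly detection with autoencoders can be sketched as follows: train on "normal" data only, then flag any input whose reconstruction error is unusually large. Everything below (the synthetic pattern, layer sizes, learning rate, and 3-sigma threshold) is a made-up illustration, not a real dataset:

```python
import numpy as np

rng = np.random.default_rng(7)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# "Normal" data: noisy copies of a single pattern (synthetic stand-in)
pattern = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
X = np.clip(pattern + rng.normal(scale=0.05, size=(200, 5)), 0, 1)

# Tiny 5 -> 2 -> 5 autoencoder trained on normal data only
W1, b1 = rng.normal(scale=0.5, size=(2, 5)), np.zeros(2)
W2, b2 = rng.normal(scale=0.5, size=(5, 2)), np.zeros(5)
lr = 0.5
for _ in range(300):
    for x in X[:50]:
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)
        d2 = 2 * (y - x) / 5 * y * (1 - y)
        d1 = (W2.T @ d2) * h * (1 - h)
        W2 -= lr * np.outer(d2, h); b2 -= lr * d2
        W1 -= lr * np.outer(d1, x); b1 -= lr * d1

def recon_error(x):
    return np.mean((sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2) - x) ** 2)

# Threshold from reconstruction errors on normal data (mean + 3 std)
errors = np.array([recon_error(x) for x in X])
threshold = errors.mean() + 3 * errors.std()

anomaly = np.array([0.0, 1.0, 0.0, 0.0, 1.0])   # very unlike the pattern
is_anomaly = recon_error(anomaly) > threshold    # large error => anomaly
```

Because the autoencoder only learned to reconstruct inputs near the normal pattern, the unfamiliar vector reconstructs poorly and its error exceeds the threshold.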