Lecture 14 Autoencoders
Introduction
• An autoencoder is an unsupervised artificial neural network that learns to
compress and encode data efficiently, and then learns to reconstruct the
data from the reduced encoded representation so that the result is as close
to the original input as possible.
• By design, an autoencoder reduces data dimensionality by learning to
ignore noise in the data.
• Here is an example of the input/output image from the MNIST dataset to an
autoencoder.
Autoencoder
• An autoencoder is an unsupervised machine learning algorithm that takes
an image as input and tries to reconstruct it from a smaller number of bits
taken from the bottleneck, also known as the latent space.
• The image is compressed most heavily at the bottleneck.
• Compression in autoencoders is achieved by training the network over
time: as it learns, it tries to represent the input image as well as possible
at the bottleneck.
• General image compression algorithms such as JPEG (lossy) and lossless
JPEG compress images without any kind of training and do fairly well.
• Autoencoders are similar to dimensionality reduction techniques like
Principal Component Analysis (PCA).
• Both project the data from a higher dimension to a lower dimension and
try to preserve the important features of the data while removing the
non-essential parts.
• However, the major difference between autoencoders and PCA lies in the
transformation part: as you already read, PCA uses linear transformation
whereas autoencoders use non-linear transformations.
• Now that you have a bit of understanding about autoencoders, let's now
break this term and try to get some intuition about it!
• The above figure is a two-layer vanilla autoencoder with one hidden layer.
• In deep learning terminology, you will often notice that the input layer is
never taken into account while counting the total number of layers in an
architecture.
• The total number of layers in an architecture comprises only the hidden
layers and the output layer.
• As shown in the image above, the input and output layers have the same
number of neurons.
• Let's take an example. You feed an image with just five pixel values into
the autoencoder which is compressed by the encoder into three pixel
values at the bottleneck (middle layer) or latent space.
• Using these three values, the decoder tries to reconstruct the five pixel
values or rather the input image which you fed as an input to the network.
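• The 5 → 3 → 5 example above can be sketched numerically. This is a minimal illustration with made-up random weights, not a trained network; NumPy and all the sizes are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.array([0.2, 0.8, 0.5, 0.1, 0.9])      # five "pixel" values (made up)

# Encoder: compresses 5 values down to 3 at the bottleneck (latent space)
W_enc = rng.normal(size=(3, 5))
b_enc = np.zeros(3)
h = 1 / (1 + np.exp(-(W_enc @ x + b_enc)))   # sigmoid activation, shape (3,)

# Decoder: reconstructs 5 values from the 3 bottleneck values
W_dec = rng.normal(size=(5, 3))
b_dec = np.zeros(5)
x_hat = 1 / (1 + np.exp(-(W_dec @ h + b_dec)))  # reconstruction, shape (5,)
```

With random weights the reconstruction is of course poor; training (described below) is what makes x_hat approach x.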
Autoencoder Components
• Autoencoders consist of 4 main parts:
– Encoder: In which the model learns how to reduce the input
dimensions and compress the input data into an encoded
representation.
– Bottleneck: the layer that contains the compressed
representation of the input data. This is the lowest-dimensional
representation of the input data.
– Decoder: In which the model learns how to reconstruct the data from
the encoded representation to be as close to the original input as
possible.
– Reconstruction Loss: This is the method that measures how well the
decoder is performing and how close the output is to the original
input.
• The training then involves using back propagation in order to minimize the
network’s reconstruction loss.
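• The four components above can be mapped onto a short sketch. The layer sizes, weight initialisation, and choice of mean squared error as the reconstruction loss are illustrative assumptions (NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Encoder: reduces the input dimensions into an encoded representation
W_enc, b_enc = rng.normal(size=(3, 5)), np.zeros(3)
def encode(x):
    return sigmoid(W_enc @ x + b_enc)        # Bottleneck: 3 values

# Decoder: reconstructs the data from the encoded representation
W_dec, b_dec = rng.normal(size=(5, 3)), np.zeros(5)
def decode(h):
    return sigmoid(W_dec @ h + b_dec)

# Reconstruction loss: here mean squared error between input and output
def reconstruction_loss(x, x_hat):
    return np.mean((x - x_hat) ** 2)

x = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
loss = reconstruction_loss(x, decode(encode(x)))
```

Backpropagation then adjusts W_enc, b_enc, W_dec, and b_dec to drive this loss down.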
How do Autoencoders work?
• We take the input and encode it to obtain the latent feature
representation, then decode that representation to recreate the input.
• We calculate the loss by comparing the input and output.
• To reduce the reconstruction error we back propagate and update the
weights.
• Weight is updated based on how much they are responsible for the error.
• In our example, we have taken the dataset for products bought by
customers.
• Step 1: Take the first row from the customer data for all products bought
in an array as the input. 1 represent that the customer bought the
product. 0 represents that the customer did not buy the product.
• Step 2: Encode the input into another vector h. h is a lower-dimensional
vector than the input. We can use the sigmoid activation function for h
since it ranges from 0 to 1. W is the weight matrix applied to the input
and b is the bias term.
h=f(Wx+b)
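• A concrete instance of h = f(Wx + b) with f the sigmoid. The numbers below are arbitrary, chosen only to show the shapes and the 0-to-1 range of h:

```python
import numpy as np

x = np.array([1.0, 0.0, 1.0, 1.0])        # products bought (1) / not bought (0)
W = np.array([[ 0.5, -0.2, 0.1,  0.4],
              [-0.3,  0.8, 0.2, -0.1]])   # 2 x 4: compresses 4 inputs to 2
b = np.array([0.0, 0.1])

h = 1 / (1 + np.exp(-(W @ x + b)))        # sigmoid keeps each entry in (0, 1)
print(h)                                  # ≈ [0.731, 0.475]
```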
• Step 3: Decode the vector h to recreate the input. Output will be of same
dimension as the input.
• Step 4: Calculate the reconstruction error L. The reconstruction error
measures the difference between the input and output vectors. Our goal is
to minimize the reconstruction error so that the output is similar to the
input vector.
• Reconstruction error = input vector - output vector
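• In practice the elementwise difference is usually reduced to a single scalar loss; mean squared error is one common choice (an assumption here, since the slide only says "difference"):

```python
import numpy as np

x     = np.array([1.0, 0.0, 1.0, 1.0, 0.0])  # input vector
x_hat = np.array([0.9, 0.2, 0.8, 0.7, 0.1])  # decoder output (made up)

error = x - x_hat                  # elementwise reconstruction error
mse   = np.mean(error ** 2)        # scalar loss to minimise
print(mse)                         # ≈ 0.038
```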
• Step 5: Back propagate the error from output layer to the input layer to
update the weights. Weights are updated based on how much they were
responsible for the error.
• Learning rate decides by how much we update the weights.
• Step 6: Repeat steps 1 through 5 for each of the observations in the
dataset. Weights are updated after each observation.
• Step 7: Repeat for more epochs. An epoch is complete when all the rows
in the dataset have passed through the neural network.
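• Steps 1 through 7 can be sketched end to end on a toy customer/product dataset. The data, layer sizes, learning rate, and use of mean squared error are all illustrative assumptions; the gradients are the standard ones for a one-hidden-layer sigmoid network:

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy data: rows are customers, 1 = bought the product, 0 = did not (made up)
X = np.array([[1, 0, 1, 1, 0],
              [0, 1, 0, 0, 1],
              [1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1]], dtype=float)

n_in, n_hidden = 5, 3
W1, b1 = rng.normal(scale=0.5, size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(scale=0.5, size=(n_in, n_hidden)), np.zeros(n_in)
lr = 0.5                                   # learning rate: size of each update

def epoch_loss():
    return np.mean([np.mean((sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2) - x) ** 2)
                    for x in X])

loss_before = epoch_loss()
for epoch in range(500):                   # Step 7: repeat for several epochs
    for x in X:                            # Steps 1 & 6: one row at a time
        h = sigmoid(W1 @ x + b1)           # Step 2: encode
        y = sigmoid(W2 @ h + b2)           # Step 3: decode
        # Step 4: gradient of the MSE reconstruction error at each layer
        delta2 = 2 * (y - x) / n_in * y * (1 - y)
        delta1 = (W2.T @ delta2) * h * (1 - h)
        # Step 5: backpropagate -- update each weight by its share of the error
        W2 -= lr * np.outer(delta2, h); b2 -= lr * delta2
        W1 -= lr * np.outer(delta1, x); b1 -= lr * delta1
loss_after = epoch_loss()
```

After training, the reconstruction loss over the dataset is lower than before, which is exactly what the backpropagation steps are meant to achieve.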
Where are Autoencoders used?
Example: Image denoising
• Now that the model is trained, let's test it by encoding and decoding
images from the test set.
• Plot both the noisy images and the denoised images produced by the
autoencoder.
Example: Anomaly detection
• In this example, you will train an autoencoder to detect anomalies.
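• The idea behind anomaly detection with autoencoders can be sketched as follows: train on "normal" data only, then flag any input whose reconstruction error is unusually large. Everything below (the synthetic pattern, layer sizes, learning rate, and 3-sigma threshold) is a made-up illustration, not a real dataset:

```python
import numpy as np

rng = np.random.default_rng(7)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# "Normal" data: noisy copies of a single pattern (synthetic stand-in)
pattern = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
X = np.clip(pattern + rng.normal(scale=0.05, size=(200, 5)), 0, 1)

# Tiny 5 -> 2 -> 5 autoencoder trained on normal data only
W1, b1 = rng.normal(scale=0.5, size=(2, 5)), np.zeros(2)
W2, b2 = rng.normal(scale=0.5, size=(5, 2)), np.zeros(5)
lr = 0.5
for _ in range(300):
    for x in X[:50]:
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)
        d2 = 2 * (y - x) / 5 * y * (1 - y)
        d1 = (W2.T @ d2) * h * (1 - h)
        W2 -= lr * np.outer(d2, h); b2 -= lr * d2
        W1 -= lr * np.outer(d1, x); b1 -= lr * d1

def recon_error(x):
    return np.mean((sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2) - x) ** 2)

# Threshold from reconstruction errors on normal data (mean + 3 std)
errors = np.array([recon_error(x) for x in X])
threshold = errors.mean() + 3 * errors.std()

anomaly = np.array([0.0, 1.0, 0.0, 0.0, 1.0])   # very unlike the pattern
is_anomaly = recon_error(anomaly) > threshold    # large error => anomaly
```

Because the autoencoder only learned to reconstruct inputs near the normal pattern, the unfamiliar vector reconstructs poorly and its error exceeds the threshold.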