
Generative Model for Image Classification based on Hybrid Adversarial Auto Encoder

Rikin J Nayak
PhD Research Scholar, V.T. Patel Department of Electronics and Communication Engineering, Chandubhai S Patel Institute of Technology, Charotar University of Science and Technology, Changa, Gujarat, India
rikinnayak@gmail.com

Dr. Jitendra P Chaudhari
Associate Professor, V.T. Patel Department of Electronics and Communication Engineering, Chandubhai S Patel Institute of Technology, Charotar University of Science and Technology, Changa, Gujarat, India
jitendrachaudhari.ec@charusat.ac.in

Abstract: Generative models are widely used in machine learning due to their ability to learn features from the input and regenerate it. However, different models are used for different applications, and hybrid models give more precise results in some areas. One such model is designed here for the specific application of anomaly detection: an adversarial autoencoder is used as the base model, and the adversarial effect of a generative adversarial network is added to generate images from features. The results show that the proposed model achieves low reconstruction loss in image regeneration and that its compressed latent-space representation can separate each class. This research study discusses the proposed model architecture and the various methods used for hyperparameter tuning.

Keywords: Autoencoder; Generative model; Anomaly detection.

I. INTRODUCTION

Autoencoders are unsupervised neural networks in deep learning that were first introduced in [1]. They consist of an encoder and a decoder that work together to learn a compressed representation of input data, which can be used for various applications such as data compression [2], feature extraction [3], and denoising [4]. Autoencoders have the advantage of being unsupervised networks, which means they can be trained without any labeled data, making them suitable for unsupervised tasks. The key idea is to train the model in such a way that it can regenerate the original data with minimal loss.

Reconstruction is a fundamental aspect of autoencoder models, where the encoded latent representation is decoded to reproduce the original input. Achieving lossless regeneration is a challenging problem, and limited research has been conducted in this area [5].

Compared to autoencoders, Generative Adversarial Networks (GANs) are generative models consisting of two networks, a generator and a discriminator. The generator produces new samples based on the training data, while the discriminator attempts to identify the original samples from a mixture of original and fake samples. The approach is inspired by game theory, in which the two models compete with each other; in the trained model, the generator can produce data samples from a random vector given to it.

The proposed custom model in this manuscript uses Adversarial Autoencoders (AAEs) as the reference model. In AAEs, the compressed vector has a known distribution that depends on one of its inputs, which is a noise sample. The compressed vector in AAEs is generated by the encoder network using both the data sample and the noise sample, and its distribution is learned during the training process and influenced by both inputs. In our model, we use two discriminators. The second discriminator operates on images and discriminates between the original image and the regenerated image. We consider the reconstruction error as the performance parameter, and our model learns from the input to keep this error as low as possible while ensuring that images of different classes have distinct distributions.

The paper is organized as follows: in Section II we present the related work and the concept of unsupervised generative models; in Section III we describe the proposed architecture; in Section IV we describe the experimental results; and in Section V we conclude the paper.

II. RELATED WORK

There are many variations of autoencoders, including denoising autoencoders [6], variational autoencoders [7], and adversarial autoencoders [8], each with its own strengths and weaknesses. In autoencoders, the size of the latent vector is a key parameter, and it needs careful tuning for effective regeneration of the original data. Autoencoders are used in many different applications such as compression [9], recommendation systems for collaborative filtering [10], anomaly detection [11], and compressive sensing [12]. They are also used in natural language processing [13], object detection [14], and image analysis [15]. The popularity of variational autoencoders stems from their capacity to regenerate data using a latent representation that is probabilistically distributed [16]. Adversarial autoencoders use a combination of generative and discriminative models to learn a compressed representation of input data that can be used in various applications [8].
Generative adversarial networks (GANs) have also become popular in computer vision due to their ability to generate new sample data, as compared to autoencoders, which simply reconstruct existing data. For this reason they are successfully used in computer vision tasks such as image inpainting, image super-resolution, image compression, and anomaly detection [17, 18]. GANs have many advantages, but they are hard to train. They consider feature-wise error during training rather than element-wise error. Hybrid networks called VAE-GANs combine the reconstruction loss from VAEs and the adversarial loss from GANs to generate high-quality and diverse samples while preserving the disentanglement of the latent variables [19]. A few other models, such as BiGAN [20] and EBGAN [21], have similar hybrid characteristics. Model effectiveness can be measured by how precisely the model regenerates the original data and how well it separates data of different classes with proper distance between them. Table 1 summarizes different types of autoencoders, including their descriptions and applications.

TABLE 1. SUMMARY OF VARIOUS AUTOENCODERS

Auto Encoder             | Description                                                      | Applications
Variational Autoencoders | Use a probabilistic distribution for the latent vector           | Image and speech generation, dimensionality reduction
Adversarial Autoencoders | Combination of generative and discriminative models              | Image and speech generation, data compression
VAE-GANs                 | Hybrid model that combines VAEs and GANs                         | Image and speech generation, data compression
BiGAN                    | Uses generative and discriminative models                        | Image and speech generation, anomaly detection
EBGAN                    | Uses an energy-based model to learn a compressed representation  | -

Sometimes a neural-network-based model can reconstruct data that were not present during training. This is an example of poor performance and may occur due to structural similarity or closeness to other class labels. In other words, the model may generalize poorly on unseen data due to overfitting. Overfitting occurs when the model memorizes the training data instead of learning the underlying patterns, resulting in poor generalization performance. Similarly, the image reconstruction loss, also known as the reconstruction error, measures the difference between the original input image and the output image generated by the decoder. The goal of the autoencoder is to minimize this loss, which encourages the network to learn a compact representation of the input data that can accurately reconstruct the original input image.

III. ARCHITECTURE

The general architecture of the proposed model is shown in Fig. 1. It includes two networks: 1. an adversarial autoencoder and 2. a discriminator. In a variational autoencoder, the distribution of the latent vector is a normal distribution because of the KL divergence term in the loss function. In an adversarial autoencoder, regardless of the reconstruction loss, the latent distribution can be any distribution and depends on the noise vector p(z). Because the AAE uses the adversarial concept, its latent vector q(z) has a better distribution than that of a VAE. We take advantage of this when choosing the base model for our network. The second network in our model is a discriminator that distinguishes the input image from the image generated by the AAE; because of it, the network is jointly trained by the two models, which results in better reconstruction.

Fig. 1 Block diagram of the architecture

The AAE has three components. First, the encoder encodes the input image and generates the latent vector, which is a compressed representation of the input data. The decoder takes this latent vector as input and regenerates the input data by minimizing the reconstruction loss between the generated image and the original image. The third component is discriminator 1, which takes two inputs, one from a vector with known distribution P and one from the latent vector Q; this discriminator forces the latent vector Q to have a data distribution close to the known distribution P, which allows the user to impose any desired distribution on Q. Discriminator 2 is the other network, which discriminates the input image from the generated image and acts as the generative network; it jointly trains the encoder-decoder part to regenerate the input data with better reconstruction, as demonstrated by the simulation results later in the paper. In this architecture it gives the encoder the GAN-like advantage of considering feature-wise error over element-wise error.
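To make the data flow concrete, the following is a minimal sketch of the four components just described, written in PyTorch for 28x28 MNIST-style inputs with an 8-dimensional latent space (the value later listed in Table 3). The fully connected layers and their sizes are illustrative assumptions; the paper does not specify the exact network layers.

```python
import torch
import torch.nn as nn

LATENT_DIM = 8          # latent space dimension (Table 3)
IMG_PIXELS = 28 * 28    # MNIST image size

# Encoder: image -> latent vector Q
encoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(IMG_PIXELS, 256), nn.ReLU(),
    nn.Linear(256, LATENT_DIM),
)

# Decoder: latent vector -> regenerated image
decoder = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_PIXELS), nn.Sigmoid(),
    nn.Unflatten(1, (1, 28, 28)),
)

# Discriminator 1: distinguishes prior samples P from encoded latents Q
latent_disc = nn.Sequential(
    nn.Linear(LATENT_DIM, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),
)

# Discriminator 2: distinguishes original images from regenerated images
image_disc = nn.Sequential(
    nn.Flatten(),
    nn.Linear(IMG_PIXELS, 256), nn.ReLU(),
    nn.Linear(256, 1), nn.Sigmoid(),
)

# Data flow for one batch of images x in [0, 1], shape (B, 1, 28, 28)
x = torch.rand(16, 1, 28, 28)
z_q = encoder(x)                           # latent vector Q
x_hat = decoder(z_q)                       # regenerated image
z_p = torch.randn(x.size(0), LATENT_DIM)   # sample from the known prior P (Gaussian)
d1_real, d1_fake = latent_disc(z_p), latent_disc(z_q)   # inputs to discriminator 1
d2_real, d2_fake = image_disc(x), image_disc(x_hat)     # inputs to discriminator 2
```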

Encoder-Decoder Architecture:

The encoder and decoder work as a simple autoencoder that regenerates the original image by reducing the reconstruction loss and the KL divergence between the original image and the generated image. Minimizing the KL divergence forces the network to generate samples whose distribution is as close as possible to that of the input images.
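The equation that accompanied this description did not survive extraction. A standard objective consistent with it, written here only as an assumption and not as the authors' exact formulation, combines a pixel-wise reconstruction term with a weighted divergence term:

\[
\mathcal{L}_{\mathrm{ED}} \;=\; \big\lVert x - \hat{x} \big\rVert_2^{2} \;+\; \lambda \, D_{\mathrm{KL}}\!\left(P_{x} \,\big\Vert\, P_{\hat{x}}\right),
\qquad \hat{x} = \mathrm{Dec}\big(\mathrm{Enc}(x)\big),
\]

where \(x\) is the input image, \(\hat{x}\) the regenerated image, \(P_x\) and \(P_{\hat{x}}\) their distributions, and \(\lambda\) a weighting factor; this notation is introduced here only for illustration.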
Fig. 2 Original vs. generated images: A. Original, B. Proposed Model, C. AAE, D. AE

Discriminator 1:

This network works as in an AAE: the encoder generates the latent vector, and the role of discriminator 1 is to force the encoder to generate a latent vector that has the same distribution as the prior distribution, which is given as the second input to discriminator 1.
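For reference, the standard adversarial regularization of AAEs [8], which matches this description (the notation below is mine, not the paper's), is the minimax game between the encoder \(E\) and discriminator 1, \(D_1\):

\[
\min_{E}\;\max_{D_1}\;\;
\mathbb{E}_{z \sim p(z)}\big[\log D_1(z)\big]
\;+\;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log\big(1 - D_1(E(x))\big)\big].
\]

At the optimum of this game, the aggregated latent distribution produced by the encoder matches the chosen prior p(z), which is what allows any desired distribution to be imposed on the latent vector Q.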
Discriminator 2:

This network works like the discriminator of a generative adversarial network: it discriminates between the input image and the output image of the autoencoder. This forces the autoencoder, trained with the custom loss function, to regenerate the original image. The trained autoencoder is able to generate images that fool the discriminator, while the discriminator classifies original images as true and generated images as fake. The function of this network is therefore similar to that of a GAN.
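As an illustration only, the joint training described above could be organized as in the sketch below. It assumes PyTorch, binary cross-entropy for both adversarial terms, mean-squared error for reconstruction, equal loss weights, and the encoder, decoder, and two discriminators sketched earlier in this section; the paper itself refers only to a custom loss function, so these choices are assumptions.

```python
import torch
import torch.nn.functional as F

def train_step(encoder, decoder, latent_disc, image_disc,
               x, opt_ae, opt_d1, opt_d2, latent_dim=8):
    """One alternating update of the hybrid AAE (illustrative sketch)."""
    batch = x.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1) Update discriminator 1: prior P (real) vs. encoded latent Q (fake)
    z_p = torch.randn(batch, latent_dim)       # Gaussian prior (Table 3)
    z_q = encoder(x).detach()
    loss_d1 = (F.binary_cross_entropy(latent_disc(z_p), ones)
               + F.binary_cross_entropy(latent_disc(z_q), zeros))
    opt_d1.zero_grad(); loss_d1.backward(); opt_d1.step()

    # 2) Update discriminator 2: original image (real) vs. reconstruction (fake)
    x_hat = decoder(encoder(x)).detach()
    loss_d2 = (F.binary_cross_entropy(image_disc(x), ones)
               + F.binary_cross_entropy(image_disc(x_hat), zeros))
    opt_d2.zero_grad(); loss_d2.backward(); opt_d2.step()

    # 3) Update encoder/decoder: reconstruct the input and fool both discriminators
    z_q = encoder(x)
    x_hat = decoder(z_q)
    loss_ae = (F.mse_loss(x_hat, x)
               + F.binary_cross_entropy(latent_disc(z_q), ones)
               + F.binary_cross_entropy(image_disc(x_hat), ones))
    opt_ae.zero_grad(); loss_ae.backward(); opt_ae.step()

    return loss_ae.item(), loss_d1.item(), loss_d2.item()
```

Following Table 3, the optimizers would be created with an initial learning rate of 0.001 and mini-batches of 16 images; the equal weighting of the three terms is a simplification of the paper's custom loss function.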
IV. EXPERIMENTS

Here we compare our model with an autoencoder and an adversarial autoencoder. The MNIST dataset is selected for the experiments; it contains handwritten digits from 0 to 9, each of size 28x28. For the experiments we used the parameters shown in Table 3. Fig. 2 shows the regenerated versions of the original images (A) produced by the different models (B, C, D), and Table 2 compares the peak signal-to-noise ratio (PSNR) recorded for the reconstructed images of Fig. 2.

TABLE 3. PARAMETERS FOR THE GIVEN MODEL

Parameter              | Value
Initial learning rate  | 0.001
Batch size             | 16
Epochs                 | Random batch, 20000 times
Prior distribution     | Gaussian (normal)
Latent space dimension | 8

TABLE 2. PSNR COMPARISON

MNIST digit | Autoencoder | Adversarial Autoencoder | Proposed Model
2           | 67.6747     | 68.2452                 | 67.61954
1           | 77.5252     | 77.75206                | 78.90123
0           | 67.79948    | 67.77908                | 67.54023
4           | 70.16187    | 69.60713                | 70.1442
1           | 78.3558     | 78.47906                | 79.93134
4           | 66.8167     | 66.7238                 | 66.45378
9           | 66.7407     | 66.62511                | 66.19086
5           | 62.18497    | 61.84954                | 60.58648
9           | 66.83105    | 66.77806                | 67.17739
Average     | 69.34338    | 69.31545                | 69.39389
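The PSNR values in Table 2 are the authors' measurements; the paper does not state the exact pixel scaling used. For reference, the standard per-image computation (assuming images normalized to [0, 1]) is:

```python
import numpy as np

def psnr(original: np.ndarray, reconstructed: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between an image and its reconstruction."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)
```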
Fig. 3 Reconstructed image produced by the decoder from a random vector drawn from the prior distribution

In Fig. 3, the decoder uses a randomly generated vector from the prior distribution to produce a reconstructed sample, as shown. Fig. 4 presents t-SNE visualizations that illustrate the clustering of the MNIST dataset. The AAE and our model exhibit virtually identical data distributions for the reconstructed images. However, the AE shows a gap between different samples, which could indicate that the model is capable of generating diverse types of images when the input image falls within that gap.

The results demonstrate that our hybrid model achieves the highest PSNR between the original image and its reconstructed counterpart. These findings suggest that our model could serve as a valuable generative tool that is trained to regenerate desired inputs. Additionally, this model has the potential to be extended to various applications that require a precise and effective generative architecture. We have employed the model, with some modifications, in the challenging application of anomaly detection. This involved fine-tuning the hyperparameters and optimizing the loss function to ensure the model's ability to accurately detect anomalous data points within a given dataset. The modified autoencoder was able to effectively learn the underlying patterns in the data and distinguish between normal and anomalous samples, making it a promising solution for detecting anomalies in domains such as medical imaging, cyber security, and industrial quality control.

Fig. 4 t-SNE visualizations of the clustered MNIST dataset for different models: (a) original, (b) AAE, (c) AE, (d) Proposed

V. CONCLUSION

This study introduces a novel method to improve reconstruction performance by combining the benefits of Generative Adversarial Networks (GANs) and Adversarial Autoencoders (AAEs). The proposed method integrates a discriminator into the AAE architecture to jointly train the model and achieve reduced reconstruction error. The experimental results reveal that the proposed method generates a denser cluster of projected pixels under t-SNE, facilitating improved digit distinction. Furthermore, the clustered visualization allows easy classification of each label from the trained data. The proposed model holds potential for detecting anomalies in images and videos, with the possibility of extending the architecture to a 3D model.

REFERENCES

[1] Rumelhart, D. E., Hinton, G. E., and Williams, R. J. "Learning internal representations by error propagation." In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, pp. 318-362. MIT Press, Cambridge, MA, USA (1986). URL http://dl.acm.org/citation.cfm?id=104279.104293

[2] Liu, Jinyang, et al. "Exploring autoencoder-based error-bounded compression for scientific data." 2021 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2021.

[3] Gogna, Anupriya, and Angshul Majumdar. "Discriminative autoencoder for feature extraction: Application to character recognition." Neural Processing Letters 49 (2019): 1723-1735.

[4] Bajaj, Komal, Dushyant Kumar Singh, and Mohd. Aquib Ansari. "Autoencoders based deep learner for image denoising." Procedia Computer Science 171 (2020): 1535-1541. ISSN 1877-0509.

[5] Li, Honggui, and Maria Trocan. "Generative Adversarial Networks-based reconstruction of low dimensional autoencoder representations." Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science. 2019.

[6] Vincent, P., Larochelle, H., Bengio, Y., and Manzagol, P. A. "Extracting and composing robust features with denoising autoencoders." Proceedings of the 25th International Conference on Machine Learning (2008): 1096-1103.

[7] Kingma, D. P., and Welling, M. "Auto-encoding variational Bayes." arXiv preprint arXiv:1312.6114 (2013).

[8] Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., and Frey, B. "Adversarial autoencoders." arXiv preprint arXiv:1511.05644 (2016).

[9] Cheng, Zhengxue, et al. "Energy compaction-based image compression using convolutional autoencoder." IEEE Transactions on Multimedia 22.4 (2019): 860-873.

[10] Zhang, Xiaofeng, Jingbin Zhong, and Kai Liu. "Wasserstein autoencoders for collaborative filtering." Neural Computing and Applications (2020): 1-10.

[11] Borghesi, Andrea, et al. "Anomaly detection using autoencoders in high performance computing systems." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019.

[12] Zhang, Zufan, et al. "The optimally designed autoencoder network for compressed sensing." EURASIP Journal on Image and Video Processing 2019.1 (2019): 1-12.

[13] "... speech recognition using deep denoising autoencoder." Engineering Applications of Artificial Intelligence 59 (2017): 15-22.

[14] Li, Jia, Changqun Xia, and Xiaowu Chen. "A benchmark dataset and saliency-guided stacked autoencoders for video-based salient object detection." IEEE Transactions on Image Processing 27.1 (2017): 349-364.

[15] Chen, Min, et al. "Deep features learning for medical image analysis with convolutional autoencoder neural network." IEEE Transactions on Big Data (2017).

[16] Hou, Xianxu, et al. "Improving variational autoencoder with deep feature consistent and generative adversarial training." Neurocomputing 341 (2019): 183-194.

[17] Liang, Kevin J., et al. "Generative adversarial network training is a continual learning problem." arXiv preprint arXiv:1811.11083 (2018).

[18] Mescheder, Lars, Andreas Geiger, and Sebastian Nowozin. "Which training methods for GANs do actually converge?" arXiv preprint arXiv:1801.04406 (2018).

[19] Larsen, A. B. L., Sønderby, S. K., Larochelle, H., and Winther, O. "Autoencoding beyond pixels using a learned similarity metric." arXiv preprint arXiv:1512.09300 (2016).

[20] Donahue, J., Krähenbühl, P., and Darrell, T. "Adversarial feature learning." Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016, pp. 97-105.

[21] Zhao, J., Mathieu, M., and LeCun, Y. "Energy-based generative adversarial network." arXiv preprint arXiv:1609.03126 (2016).
