AI & Machine Vision Coursework Implementation of Deep Learning For Classification of Natural Images
1. Introduction
2. Research Methodology
3. MATLAB simulation process
4. Results
5. Conclusion
6. References
Abstract
The convolutional neural network (CNN) has been a leading algorithm in computer vision in recent years. Chollet (2017) reported that CNNs, given robust computational power, achieve superhuman performance on complex visual tasks. One of the best-performing architectures has been used in this study for transfer learning. It merges ideas from other well-known CNN architectures such as GoogLeNet, ResNet and SqueezeNet, and replaces the inception modules with a special layer known as depthwise separable convolution (Liu et al, 2018).

1. Introduction
A CNN is a machine learning technique from deep learning that is trained on a wide range of images. It learns rich characteristic features for identification and classification, and these features generally outperform hand-crafted features such as SURF, HOG and LBP. A simple way to leverage the power of a CNN without spending time and effort on training is to employ a pre-trained network as a feature extractor. It is also much easier to apply transfer learning to a pre-trained network than to train a network from scratch. Pre-trained neural networks are used for classification, feature extraction and transfer learning. They differ in their features and can be chosen based on the nature of the problem.

The key traits of a pre-trained network are accuracy, speed and size, and choosing a network means balancing accuracy against speed. Prediction time is usually measured with a mini-batch size of 128 and reported relative to the fastest network. The ImageNet validation set is the most common way to measure the accuracy of networks trained on ImageNet. Networks that are accurate on ImageNet are also often accurate when applied to other natural image data sets using transfer learning or feature extraction. This generalization is possible because the networks have learned to extract powerful and informative features from natural images that generalize to other similar data sets. However, high accuracy on ImageNet does not always transfer directly to other tasks, so it is a good idea to try multiple networks.

There are different methods to estimate prediction and classification accuracy on the ImageNet validation set, and different sources employ different estimation methods. Sometimes an ensemble of multiple models is used, and sometimes every image is evaluated several times using multiple crops. Sometimes the top-5 accuracy is reported instead of the standard top-1 accuracy. Because of these variations, it is usually not possible to compare accuracy values from different sources directly. The accuracies of pre-trained networks in the MATLAB Deep Learning Toolbox™ are standard (top-1) accuracies computed using a single model and a single central image crop.

The main research objective is to build an image identification and classification system for natural images: to extract features from the images and use those features to develop a machine learning algorithm that recognizes which class a given natural image belongs to (Liu et al, 2018).

This report is divided into five sections: Introduction, Research Methodology, MATLAB Simulation, Results and Conclusion. The Introduction explains the objective of the present study and describes deep learning techniques and CNNs. The Research Methodology section explains the implementation procedure for the deep learning technique for image classification. The MATLAB Simulation section explains the simulation protocol for natural image identification, classification, feature extraction and batch normalization. The Results section presents the performance assessment using a confusion matrix. The Conclusion summarizes the results of the current study and the scope for future improvement.

2. Research Methodology
2.1 Method for image classification
The research method for image identification and classification involves three steps: training a neural network, validating the neural network and testing the neural network. This is also called multilayer classification. The validation step assesses the network's efficacy in prediction, classification and identification, and a separate validation data set is employed for this purpose. The validation data is taken from a portion of the train/test split of the dataset. Upon completion of the validation, the testing dataset is classified into the
different object classes using the trained network.

2.2 Current literature
The benefit of using a pre-trained image classification network is that it has already learned to extract strong and informative features from natural images, which can serve as the starting point for learning a new task. Most pre-trained networks are trained on a subset of the ImageNet database, which is used in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC). These networks are trained on more than a million images and can classify images into 1000 object classes such as flowers, birds, animals and utensils. Transfer learning from a general group of images to a specific set of images generally enhances the prediction accuracy (Afzal et al, 2015). This method also supports the classification efficacy of deep learning toolboxes and CNN architectures even when the training data set is small.

2.3 Residual learning
Training accuracy degrades as the depth of the network increases. Hence, ResNet-50 was chosen for this study, and residual learning was implemented to handle this accuracy issue effectively (Liu et al, 2018; Bengio et al, 2017). In this method, a few stacked layers are fitted to a residual mapping of the image patterns rather than to the full mapping, which makes the mapping easier to learn. This method is used to train the CNN; it enhances the accuracy and thereby improves object identification and classification.

2.4 Batch normalization
According to Chan et al (2015), batch normalization counters internal covariate shift: the inputs to the non-linearity are normalized and then scaled and shifted. Activation and back propagation were carried out with batch normalization applied to enhance efficiency, so that the training time could be reduced.

3. MATLAB simulation process
3.1 Pre-processing of data
A common issue in image classification studies is that the images within the dataset differ in size. Images that differ in height and width cannot be stacked into an array as input for a machine learning algorithm. Resizing treats the image as a continuous function by interpolating pixel colors to produce the resized output image. In this study, bicubic interpolation is employed for image resizing. Although this method is computationally more expensive than other interpolation techniques, it is more robust and yields better results.
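
The bicubic resizing step described in Section 3.1 can be illustrated with a minimal sketch. It is written in plain Python rather than MATLAB, it is not the code used in this study, and it does not reproduce the toolbox resizing function (for example, it applies no anti-aliasing when shrinking). The function names, the grayscale list-of-rows image layout and the kernel parameter a = -0.5 are assumptions chosen for illustration.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is an assumed, common choice."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def resize_1d(row, new_len):
    """Resample one row of pixel values to new_len samples."""
    old_len = len(row)
    scale = old_len / new_len
    out = []
    for i in range(new_len):
        # Map the centre of output pixel i back into input coordinates.
        src = (i + 0.5) * scale - 0.5
        base = math.floor(src)
        total = 0.0
        for k in range(base - 1, base + 3):    # four nearest input samples
            idx = min(max(k, 0), old_len - 1)  # clamp indices at the borders
            total += row[idx] * cubic_kernel(src - k)
        out.append(total)
    return out


def resize_bicubic(image, new_h, new_w):
    """Resize a 2-D grayscale image: interpolate along rows, then columns."""
    rows = [resize_1d(r, new_w) for r in image]
    cols = [resize_1d(list(c), new_h) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# Usage: bring a 3x3 image to a common 6x6 size.
small = [[0, 50, 100], [50, 100, 150], [100, 150, 200]]
resized = resize_bicubic(small, 6, 6)
print(len(resized), len(resized[0]))  # prints "6 6"
```

Each output pixel is a weighted sum of a 4-by-4 neighbourhood of input pixels, which is why bicubic resizing costs more than nearest-neighbour or bilinear resizing. Once every image has been resized to the same height and width, the images can be stacked into a single input array.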

5. Conclusion
The current study analysed the Natural Images dataset, and the model is not restricted to classifying only the images in that dataset but can also classify a wide array of other images. The model has superior accuracy in identification and classification compared with other computer vision applications. The study also revealed some issues with the machine learning algorithm: the model has to be pre-trained and has to use residual learning to handle identification and classification with higher accuracy. The ability of the augmented image datastore function to reduce the image dimensions specifically assisted and sped up the training process. The data augmentation function helped to replace the issues with

6. References
Liu, K., Liu, H., Chan, K., Liu, T., and Pei, S., "Age Estimation via Fusion of Depthwise Separable Convolutional Neural Networks." 2018 IEEE International Workshop on Information Forensics and Security (WIFS), Hong Kong, Hong Kong, 2018, pp. 1–8.
Deng, J., et al., "Imagenet: A large-scale hierarchical image database." IEEE Conference on Computer Vision and Pattern Recognition, 2009.
Donahue, J., et al., "Decaf: A deep convolutional activation feature for generic visual recognition." arXiv preprint arXiv:1310.1531, 2013.