OBJECT CLASSIFICATION USING CNN
OBJECT CLASSIFICATION USING CNN - STEPS INVOLVED:
• STEP 1: Data Collection and Preparation
• Gather a labeled dataset containing images of the objects you want to classify.
• Split the dataset into training, validation, and test sets. Common splits are 70-80% for training, 10-15% for validation, and 10-15% for testing (a minimal split sketch follows the dataset note below).

COMMON DATASETS USED ARE:
MNIST (Modified National Institute of Standards and Technology) is a well-known dataset used in Computer Vision that was built by Yann LeCun et al. It is composed of images of handwritten digits (0-9), split into a training set of 60,000 images and a test set of 10,000, where each image is 28 x 28 pixels in width and height.
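As a rough illustration of Step 1, the sketch below loads MNIST with torchvision and carves a validation set out of the official training set. The use of random_split and the 90/10 validation proportion are illustrative choices, not prescribed by the slides.

```python
import torch
from torch.utils.data import random_split, DataLoader
from torchvision import datasets, transforms

# Download MNIST and convert each 28x28 grayscale image to a tensor.
to_tensor = transforms.ToTensor()
full_train = datasets.MNIST(root="data", train=True, download=True, transform=to_tensor)
test_set   = datasets.MNIST(root="data", train=False, download=True, transform=to_tensor)

# Hold out part of the official training set for validation (illustrative 90/10 split).
val_size = len(full_train) // 10
train_set, val_set = random_split(full_train, [len(full_train) - val_size, val_size])

# Wrap each split in a DataLoader for mini-batch training.
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
val_loader   = DataLoader(val_set,   batch_size=64)
test_loader  = DataLoader(test_set,  batch_size=64)

print(len(train_set), len(val_set), len(test_set))  # 54000 6000 10000
```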
OBJECT CLASSIFICATION USING CNN - STEPS INVOLVED:
COMMON DATASETS USED ARE:
The ImageNet dataset consists of 1000 object categories, organized according to the WordNet hierarchy.

• STEP 2 – Data Augmentation (optional)
Data augmentation techniques like rotation, scaling, flipping, and cropping can be applied to increase the diversity of the training data and improve the model's generalization. Typical augmentations include (see the sketch after this list):
Resize images
Jitter color
Warp images
Simulate noise
Simulate blur
Crop images
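A minimal augmentation pipeline along these lines can be built with torchvision.transforms; the specific transforms and parameter values below are illustrative choices, not taken from the slides.

```python
from torchvision import transforms

# Illustrative training-time augmentation pipeline; parameter values are arbitrary examples.
train_augment = transforms.Compose([
    transforms.Resize(256),                      # resize images
    transforms.RandomCrop(224),                  # crop images
    transforms.RandomHorizontalFlip(p=0.5),      # flipping
    transforms.RandomRotation(degrees=15),       # rotation
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),  # jitter color
    transforms.GaussianBlur(kernel_size=3),      # simulate blur
    transforms.ToTensor(),
])

# Validation/test data is usually only resized and center-cropped, never randomly augmented.
eval_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
```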
OBJECT CLASSIFICATION USING CNN - STEPS INVOLVED:
• Step 3 – Preprocessing
• Preprocessing helps with training stability.
• Resize the images to a consistent input size (e.g., 224x224 pixels).
• Step 4 – Build the CNN Model
• Design the architecture of your CNN model. Common architectures include VGG, ResNet, Inception, or custom designs.
• Specify the number of convolutional layers, filter sizes, pooling layers, and fully connected layers.
• Add activation functions like ReLU (Rectified Linear Unit) after the convolutional layers (a small custom-CNN sketch follows).
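As a sketch of Step 4, here is a small custom CNN in PyTorch for 224x224 RGB inputs. The layer counts, filter sizes, and the assumed number of classes (num_classes=10) are illustrative, not a recommended architecture.

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Toy CNN: conv -> ReLU -> pool blocks followed by fully connected layers."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),    # 224 -> 112
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 112 -> 56
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 56 -> 28
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 28 * 28, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SimpleCNN(num_classes=10)
print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 10])
```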
• AlexNet was the first convolutional network that used a GPU (Graphics Processing Unit) to boost performance.
• A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device.
• The AlexNet architecture consists of 5 convolutional layers, 3 max-pooling layers, 3 fully connected layers, and 1 softmax layer.
• Each convolutional layer consists of convolutional filters and a nonlinear activation function, ReLU.
• The pooling layers perform max pooling.
• The input size is fixed due to the presence of fully connected layers.
• The input size is usually quoted as 224x224x3, but because of the padding applied in the first layer it effectively works out to 227x227x3.
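The slides describe the layer counts rather than exact hyperparameters; one quick way to inspect the canonical layout is to instantiate torchvision's AlexNet (assuming a torchvision version that accepts the weights argument), which follows the 5-conv / 3-pool / 3-FC structure mentioned above.

```python
import torch
from torchvision.models import alexnet

# torchvision's AlexNet follows the 5 conv + 3 max-pool + 3 FC layout described above.
model = alexnet(weights=None)   # architecture only, no pretrained weights
print(model)

# A single 3-channel image at the commonly quoted input resolution.
x = torch.randn(1, 3, 224, 224)
print(model(x).shape)  # torch.Size([1, 1000]) -> 1000 ImageNet classes
```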
Residual Block
• H(x) is the underlying mapping to be learned.
• The residual is defined as F(x) := H(x) - x, so fitting the residual gives H(x) = F(x) + x.
• F(x) + x can be realized by feedforward neural networks with "shortcut connections", known as identity mappings.
• The shortcut allows the gradient to be backpropagated directly to earlier layers.
• It does not add any extra parameters or computation.
• Training is achieved by backpropagation.
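A minimal residual block in PyTorch, assuming the input and output have the same shape so the identity shortcut can be added without a projection; the channel count is an illustrative parameter.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Computes H(x) = F(x) + x, where F is two 3x3 convolutions."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.relu(self.conv1(x))
        residual = self.conv2(residual)      # F(x)
        return self.relu(residual + x)       # F(x) + x via the identity shortcut

block = ResidualBlock(channels=64)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```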
Inception module
• Design a good local network topology (a network within a network) and then stack these modules on top of each other.
• Naïve Inception module: parallel convolution and pooling branches whose outputs are concatenated.
• GoogLeNet uses 9 Inception modules in the whole architecture.
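A sketch of the naïve Inception module in PyTorch: parallel 1x1, 3x3, and 5x5 convolutions plus a 3x3 max pool, concatenated along the channel dimension. The branch channel counts here are illustrative.

```python
import torch
import torch.nn as nn

class NaiveInception(nn.Module):
    """Naive Inception module: parallel branches concatenated along channels."""
    def __init__(self, in_channels: int, out_1x1: int, out_3x3: int, out_5x5: int):
        super().__init__()
        self.branch1 = nn.Conv2d(in_channels, out_1x1, kernel_size=1)
        self.branch3 = nn.Conv2d(in_channels, out_3x3, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_channels, out_5x5, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        # Each branch preserves the spatial size, so outputs can be concatenated.
        return torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x), self.pool(x)], dim=1
        )

module = NaiveInception(in_channels=64, out_1x1=32, out_3x3=64, out_5x5=16)
print(module(torch.randn(1, 64, 28, 28)).shape)  # torch.Size([1, 176, 28, 28])
```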
Activation functions – ReLU
Mathematically, ReLU(x) = max(0, x).
DISADVANTAGES:
1. When the input is negative, the output is always zero, which can lead to the "dead neuron" problem, where the neuron stops learning and does not contribute to the model's performance.
2. ReLU is not a smooth function, which can cause some optimization algorithms to fail.

Activation functions – Exponential ReLU (ELU)
It is a smooth and continuous function that allows negative values. Mathematically, ELU(x) = x for x > 0 and ELU(x) = alpha * (exp(x) - 1) for x <= 0.
ADVANTAGES:
1. It can help to reduce the bias shift and avoid overfitting in neural networks.
2. It has been shown to outperform other activation functions like ReLU and its variants in cases such as regression (where the output should take negative values) and with imbalanced data (ELU can help prevent the vanishing gradient problem when some inputs have very large positive or negative values).
3. It is a smooth and continuous function, which helps the convergence of gradient-based optimization algorithms.
4. ELU can help to avoid the dead neuron problem that can occur with the ReLU activation function.
DISADVANTAGES:
1. The exponential used in the function can be computationally expensive.
2. The value of alpha needs to be carefully chosen to balance the advantages of the function.
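To make the piecewise definitions concrete, here is a small comparison of ReLU and ELU on a few inputs using PyTorch's built-in activations; alpha=1.0 is the default value and an illustrative choice.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# ReLU: max(0, x) -> negative inputs are clamped to zero (the "dead neuron" regime).
print(F.relu(x))            # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000])

# ELU: x for x > 0, alpha * (exp(x) - 1) for x <= 0 -> smooth, allows negative outputs.
print(F.elu(x, alpha=1.0))  # tensor([-0.8647, -0.3935, 0.0000, 0.5000, 2.0000])
```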
Loss Function
1. Regression loss
2. Classification Loss – Binary and Multi-class Classification
Regression Loss:
2. Mean Squared Logarithmic Error (MSLE)
It measures the ratio between the actual and predicted values using their logarithms. It is a good choice for predicting continuous data.
3. Mean Absolute Error (MAE)
The absolute error for each training example is the distance between the predicted and the actual value, irrespective of sign. Absolute error is also known as the L1 loss.

Classification Loss:
• 1. Binary Cross-Entropy
Cross-entropy is the default loss function to use for binary classification problems. It is intended for use with binary classification where the target values are in the set {0, 1}.
• 2. Hinge Loss
An alternative to cross-entropy for binary classification problems is the hinge loss function, primarily developed for use with Support Vector Machine (SVM) models. It is intended for use with binary classification where the target values are in the set {-1, 1}.
• 3. Squared Hinge Loss
It is an extension of Hinge Loss. It is mainly used for categorical prediction or yes/no kinds of decision problems.
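A quick illustration of these losses in PyTorch; the hinge and MSLE losses are written out manually since they are simple expressions, and the tensors below are made-up example values.

```python
import torch
import torch.nn.functional as F

# Regression example: predictions vs. targets.
pred   = torch.tensor([2.5, 0.5, 4.0])
target = torch.tensor([3.0, 0.0, 5.0])

mae  = F.l1_loss(pred, target)                             # Mean Absolute Error (L1 loss)
msle = F.mse_loss(torch.log1p(pred), torch.log1p(target))  # Mean Squared Logarithmic Error

# Binary classification with {0, 1} targets: binary cross-entropy on raw logits.
logits = torch.tensor([1.2, -0.8, 2.0])
labels = torch.tensor([1.0, 0.0, 1.0])
bce = F.binary_cross_entropy_with_logits(logits, labels)

# Hinge loss expects targets in {-1, +1}; squared hinge squares each per-example term.
y = labels * 2 - 1                                         # map {0, 1} -> {-1, +1}
hinge         = torch.clamp(1 - y * logits, min=0).mean()
squared_hinge = torch.clamp(1 - y * logits, min=0).pow(2).mean()

print(mae.item(), msle.item(), bce.item(), hinge.item(), squared_hinge.item())
```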