
Object Detection With Deep Learning



Object Detection with Deep Learning
Submitted to: Mr. B. Suresh
Submitted by: Himanshu Maurya (9917102004), Sushant Shrivastava (9917102023), Bhuvnesh Kumar Bhardwaj (9917102028)

1. Introduction to Object Detection
● Object detection is the task of scanning an image or a video to search for and locate objects.
Fig. 1 Object detection

Literature Review
● Object detection is a common term for computer vision techniques that classify and locate objects in an image. Modern object detection is largely based on convolutional neural networks. Some of the most relevant system types today are Faster R-CNN, R-FCN, Multibox Single Shot Detector (SSD) and YOLO (You Only Look Once) [1]. The original R-CNN method worked by running a neural network classifier on samples cropped from images using externally computed box proposals, with feature extraction performed on every cropped sample. This approach was computationally expensive because of the large number of crops.
● Single Shot Multibox Detector (SSD) differs from the R-CNN based approaches by not requiring a second-stage, per-proposal classification operation. This makes it fast enough for real-time detection applications, at the price of reduced precision. "SSD with MobileNet" refers to a model whose meta-architecture is SSD and whose feature extractor is MobileNet.

2. Generic object detection
● Generic object detection aims at locating and classifying the objects present in an image and labelling each of them with a rectangular bounding box (BB) and a confidence score for its existence.
Fig. 2 Generic object detection
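
To make the idea of labelled bounding boxes with confidences concrete, here is a minimal sketch in Python. The `Detection` class, the `keep_confident` helper, the 0.5 threshold and the sample values are all hypothetical and chosen only for illustration; they are not taken from any particular detector.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # predicted class name
    score: float  # confidence in [0, 1] that the object exists
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels


def keep_confident(detections, threshold=0.5):
    """Return detections with score >= threshold, highest score first."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Hypothetical detector output for one image.
detections = [
    Detection("person", 0.92, (34, 20, 180, 310)),
    Detection("dog", 0.41, (200, 150, 320, 300)),
    Detection("bicycle", 0.77, (150, 90, 400, 330)),
]

for d in keep_confident(detections):
    print(f"{d.label}: {d.score:.2f} at {d.box}")
```

Filtering by a confidence threshold is a common post-processing step; the right threshold depends on the application.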

3. Basic architecture of CNN
● A Convolutional Neural Network (CNN) is a deep learning algorithm that takes an input image, assigns importance to various aspects or objects in the image, and learns to differentiate one from another. [2]
Fig. 3 Basic architecture of CNN

4. Building the CNN
● Convolution
● Pooling
● Flattening

4.1 Convolution
● Convolution preserves the spatial relationship between pixels by learning image features using small squares of input data.
Fig. 4.1 Convolution
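
The sliding of a small square over the input can be sketched in plain Python as follows. This is an illustrative, unoptimized version that assumes no padding and a stride of 1; the function name `convolve2d` and the sample image and kernel are made up for the example. As in most CNN libraries, the kernel is not flipped (strictly speaking this is cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Multiply the kernel with the image patch whose top-left corner is (i, j).
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

conv_output = convolve2d(image, kernel)
print(conv_output)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

A 5x5 input and a 3x3 kernel give a 3x3 feature map, and each output value depends only on a local 3x3 neighbourhood, which is how spatial relationships are preserved.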

4.2 Pooling
● Pooling reduces the dimensionality of each feature map but retains the most important information.
Fig. 4.2 Pooling
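
As an illustration, the sketch below implements max pooling, one common pooling variant, assuming a 2x2 window with stride 2. The slide does not say which pooling type is used, so this choice, the function name `max_pool2d` and the sample values are assumptions.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each size x size window of the feature map."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, height - size + 1, stride):
        row = []
        for j in range(0, width - size + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


sample_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]

pooled_map = max_pool2d(sample_map)
print(pooled_map)  # [[6, 4], [8, 9]]
```

The 4x4 map shrinks to 2x2 while the strongest activation in each region is kept.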

4.3 Flattening
● Here the matrix is converted into a linear array so that it can be fed into the nodes of our neural network.
Fig. 4.3 Flattening
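
Flattening can be sketched in a few lines of Python. The helper name `flatten` and the two sample pooled maps are hypothetical; the ordering shown (map by map, row by row) is one reasonable convention, not the only one.

```python
def flatten(feature_maps):
    """Turn a list of 2D feature maps into one 1D list, map by map, row by row."""
    return [value for fmap in feature_maps for row in fmap for value in row]


pooled_maps = [
    [[6, 4], [8, 9]],
    [[1, 0], [2, 5]],
]

flat_vector = flatten(pooled_maps)
print(flat_vector)  # [6, 4, 8, 9, 1, 0, 2, 5]
```

Each element of the resulting vector feeds one input node of the fully connected layers.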

5. Dataset & Preprocessing
● COCO (Common Objects in Context) is a large-scale object detection, segmentation, and captioning dataset containing around 330K images, more than 200K of which are labelled. [3]
5.1 Features of dataset
· Object segmentation
· Recognition in context
· 330K images (>200K labeled)
· 1.5 million object instances
· 80 object categories
· 91 stuff categories
5.2 Data preprocessing
● Since the model is pre-trained, there is no need for data preprocessing.
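
For readers who do want to inspect the labelled object instances, the sketch below shows how a COCO-style bounding box, stored as `[x, y, width, height]`, can be converted to corner coordinates. The annotation values are made up for illustration and the helper name `coco_to_corners` is hypothetical.

```python
def coco_to_corners(bbox):
    """Convert a COCO-style [x, y, width, height] box to (x_min, y_min, x_max, y_max)."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


# Hypothetical annotation in the style of a COCO instance record.
annotation = {
    "image_id": 1,
    "category_id": 18,
    "bbox": [50.0, 30.0, 120.0, 80.0],
}

corners = coco_to_corners(annotation["bbox"])
print(corners)  # (50.0, 30.0, 170.0, 110.0)
```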

6. What is SSD?
● SSD (Single Shot Detector) is designed for object detection in real time.
Fig. 5 Single Shot Detector

7. Object detection using SSD algorithm
● It is a three-step process:
1. Region proposal
2. Feature generation
3. Classification
Fig. 6 Object detection using SSD

8. SSD framework
● Multi-scale feature maps for detection.
● Convolutional predictors for detection.
● Default boxes and aspect ratios.
Fig. 7 SSD framework
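
The interplay of multi-scale feature maps, default boxes and aspect ratios can be sketched as below. This is a simplified illustration that assumes square feature maps and normalized coordinates; the function name `default_boxes`, the feature map sizes, the scales and the aspect ratios are chosen for the example and are not the values of any specific SSD configuration.

```python
import math


def default_boxes(feature_map_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return default boxes (cx, cy, w, h) in [0, 1] coordinates for one feature map."""
    boxes = []
    for i in range(feature_map_size):
        for j in range(feature_map_size):
            # Center each box on the middle of its feature map cell.
            cx = (j + 0.5) / feature_map_size
            cy = (i + 0.5) / feature_map_size
            for ratio in aspect_ratios:
                # Same area for every ratio; only the shape changes.
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes


# A coarse map holds a few large boxes; a finer map holds many small ones.
coarse_boxes = default_boxes(2, scale=0.6)
fine_boxes = default_boxes(4, scale=0.3)
print(len(coarse_boxes), len(fine_boxes))  # 12 48
```

A convolutional predictor then outputs class scores and box offsets for every default box at every cell, so objects of different sizes are handled by different feature maps.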

9. Feature extraction
● In this stage, each region proposal is warped or cropped to a fixed resolution, and the SSD module is used to extract features.
Fig. 8 Feature extraction

10. Classification and Localization
● Each region is classified for each category with the MobileNet V1 architecture by passing it the feature vector created during feature extraction; the scored regions are then adjusted with bounding box regression.
● This architecture uses depthwise separable convolutions, which significantly reduce the number of parameters compared with a network that uses standard convolutions.
Fig. 9 Depthwise separable convolution
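
The bounding box regression step can be illustrated with the center-offset and log-scale parameterization commonly used by SSD-style detectors. The slides do not state which encoding is used, so this parameterization is an assumption, and the function name `decode_box` and the sample numbers are hypothetical.

```python
import math


def decode_box(default_box, offsets):
    """Apply predicted offsets (t_cx, t_cy, t_w, t_h) to a default box (cx, cy, w, h)."""
    d_cx, d_cy, d_w, d_h = default_box
    t_cx, t_cy, t_w, t_h = offsets
    cx = d_cx + t_cx * d_w   # shift the center by a fraction of the box width
    cy = d_cy + t_cy * d_h   # shift the center by a fraction of the box height
    w = d_w * math.exp(t_w)  # rescale the width
    h = d_h * math.exp(t_h)  # rescale the height
    return (cx, cy, w, h)


anchor = (0.5, 0.5, 0.2, 0.4)
predicted_offsets = (0.5, -0.25, 0.0, math.log(0.5))

adjusted = decode_box(anchor, predicted_offsets)
print(adjusted)  # approximately (0.6, 0.4, 0.2, 0.2)
```

Real implementations often also multiply the offsets by fixed variance terms, which are omitted here for clarity.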

11. MobileNet V1 Architecture
● It uses depthwise separable convolutions to reduce the model size and complexity.
● Smaller model size: fewer parameters.
● Lower complexity: fewer multiplications and additions (Mult-Adds).
Fig. 10 MobileNet V1 architecture
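
The savings can be checked with a short calculation. The sketch below counts weights and Mult-Adds for one layer, ignoring biases and assuming stride 1 with a square output; the function names and the layer dimensions (3x3 kernel, 64 input channels, 128 output channels, 56x56 output) are chosen only for illustration.

```python
def standard_conv_cost(kernel, c_in, c_out, out_size):
    """Weights and Mult-Adds of a standard convolution layer (biases ignored)."""
    params = kernel * kernel * c_in * c_out
    mult_adds = params * out_size * out_size
    return params, mult_adds


def separable_conv_cost(kernel, c_in, c_out, out_size):
    """Weights and Mult-Adds of a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes the channels
    params = depthwise + pointwise
    mult_adds = params * out_size * out_size
    return params, mult_adds


std_params, std_ops = standard_conv_cost(3, 64, 128, 56)
sep_params, sep_ops = separable_conv_cost(3, 64, 128, 56)

print(std_params, sep_params)            # 73728 8768
print(round(sep_params / std_params, 3)) # 0.119
```

For this layer the separable version needs roughly 12% of the weights and Mult-Adds, which matches the ratio 1/c_out + 1/kernel² for these dimensions.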

12. Advantages of MobileNet V1 Architecture
● Its main advantage is its accuracy on image recognition problems.
● It takes less time.
● It improves the quality of candidate bounding boxes.

13. Tools and Libraries
● Anaconda: a free and open-source distribution of the Python and R programming languages for data science and machine learning applications.
● Spyder: an open-source, cross-platform IDE for scientific programming in Python.
● TensorFlow: an open-source software library for dataflow programming across a range of tasks.
● NumPy: a Python package whose name stands for 'Numerical Python'. It is the core library for scientific computing in Python, providing a powerful n-dimensional array object and tools for integrating C, C++ and other code.
● Matplotlib: a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
● urllib: a Python module for opening URLs. It defines functions and classes that help with URL actions, so Python programs can access and retrieve data from the internet such as XML, HTML and JSON.

References
1. Zhong-Qiu Zhao, Peng Zheng, Shou-Tao Xu, and Xindong Wu (2016).
2. https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148
3. http://cocodataset.org/#home

Links to figures
1.
2. https://towardsdatascience.com/going-deep-into-object-detection-bed442d92b34
3. https://medium.com/datadriveninvestor/convolutional-neural-network-cnn-simplified-ecafd4ee52c5
4. https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
5. https://www.researchgate.net/figure/The-architecture-of-Single-Shot-Multibox-Detector-SSD-It-considers-only-two-stage-by_fig9_327491507
6. Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg (2016).
7. Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: OverFeat: Integrated recognition, localization and detection using convolutional networks. In: ICLR (2014).
8. https://towardsdatascience.com/cnn-application-on-structured-data-automated-feature-extraction-8f2cd28d9a7e
9. https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
10. https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148
https://machinethink.net/blog/object-detection/

THANK YOU
