Project Review - 2
MACHINE LEARNING
Step 2: Resize each region to a fixed size (224x224) and run it independently through the CNN
to predict class scores and a bounding-box transform
Introduction
Faster RCNN is an object detection architecture presented by Ross Girshick, Shaoqing Ren,
Kaiming He and Jian Sun in 2015. It is one of the best-known object detection architectures
built on convolutional neural networks, alongside YOLO (You Only Look Once) and SSD
(Single Shot Detector).
Let’s look at how this architecture works.
Faster RCNN is composed of 3 parts:
• Part 1: Convolution layers
• In these layers we train filters to extract the appropriate features from the image. For example,
if we train the filters to extract the features of a human face, they will learn through
training the shapes and colors that are characteristic of a human face.
• We can compare convolution layers to coffee filters: a coffee filter does not let the coffee
powder pass into the cup, and in the same way the convolution layers learn the object's features
and let nothing else pass, only the desired object.
• Coffee powder + Coffee liquid = Input image
• Coffee filter = CNN filters
• Coffee liquid = Last feature map of the CNN
• Let’s look more closely at convolutional neural networks.
• Convolutional networks are generally composed of convolution layers, pooling layers and a
final component, which is either a set of fully connected layers or another head suited to the
task at hand, such as classification or detection.
We compute a convolution by sliding a filter across the input image; the result is a
two-dimensional matrix called a feature map.
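
The sliding-filter computation can be sketched in a few lines of plain Python. This is only an illustration, not the implementation used inside Faster RCNN: the function name `convolve2d`, the toy 4x4 image and the 2x2 `vertical_edge` filter are all assumptions made for the example. It uses stride 1 and no padding, and, as is usual in deep learning, it does not flip the filter.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    # The filter must fit entirely inside the image, so the output shrinks.
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    feature_map = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the filter with the image patch under it and sum the products.
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            feature_map[i][j] = total
    return feature_map


# Hypothetical toy input: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is 3x3 and is non-zero only in the middle column, where the filter sits on the edge between the dark and bright halves. In a trained CNN the filter values are learned rather than written by hand.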