
VISVESVARAYA TECHNOLOGICAL UNIVERSITY

BELGAUM- 590018

An Internship Report on

“OBJECT DETECTION USING AI”


Submitted in partial fulfillment of the requirement for the award of the degree of

BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by

NACHITHA Y K (4GH18EC025)

Under the Guidance of


Mrs. PALLAVI H V, B.E., M.Tech.
Associate Professor
DEPARTMENT OF ELECTRONICS AND COMMUNICATION
ENGINEERING
GOVERNMENT ENGINEERING COLLEGE,
DAIRY CIRCLE, HASSAN-573201
2021-22
GOVERNMENT ENGINEERING COLLEGE, HASSAN-573201
Department of Electronics and Communication Engineering

CERTIFICATE
This is to certify that the internship work entitled “OBJECT DETECTION USING AI” is a bonafide work carried out by NACHITHA Y K (4GH18EC025) in partial fulfillment for the award of the degree of Bachelor of Engineering in Electronics and Communication Engineering of Visvesvaraya Technological University, Belgaum-590018, at Government Engineering College, Hassan, during the year 2021-22. It is certified that all the corrections/suggestions indicated during the internal assessment have been incorporated, and the report satisfies the academic requirements in respect of the internship work prescribed for the said degree.
……………………….. ………………………….. .........……………….
Guide HOD Principal
Mrs. Pallavi H V Mr. Neelappa Dr. Prashanth S

Associate Professor Head of the Department Principal


Dept. of E&CE Dept. of E&CE GEC, Hassan
GEC, Hassan GEC, Hassan

External Viva-Voce

Name Of the Examiner Signature

1)

2)
ABSTRACT

Data is the new oil of the current technological society. Efficient use of data has changed the benchmarks of performance in terms of speed and accuracy. This enhancement is possible because the processing of data is performed by two of the industry's leading technologies: Computer Vision (CV) and Artificial Intelligence (AI). These two technologies empower major tasks such as object detection and tracking for traffic vigilance systems. As the number of features in an image increases, the demand for efficient algorithms to excavate hidden features also increases. A Convolutional Neural Network (CNN) model is designed on an urban vehicle dataset for single object detection, and YOLOv3 is used for multiple object detection on the KITTI and COCO datasets. Model performance is analyzed, evaluated and tabulated using performance metrics such as True Positives (TP), True Negatives (TN), False Positives (FP), False Negatives (FN), accuracy, precision, the confusion matrix and mean Average Precision (mAP). Objects are tracked across frames using YOLOv3 and Simple Online Real Time Tracking (SORT) on a traffic surveillance video. This report highlights the strengths of state-of-the-art networks like DarkNet. Efficient detection and tracking on the urban vehicle dataset is demonstrated. The algorithms give real-time, accurate and precise identifications suitable for real-time traffic applications.

ACKNOWLEDGEMENT
I present with immense pleasure this work titled “OBJECT DETECTION USING AI”. I express my heartfelt thanks to our beloved Principal, Dr. Prashanth S, GEC Hassan, for his encouragement throughout my studies.

At the outset I express my most sincere thanks to Mr. Neelappa, Head of the Department, Department of E&CE, for his continuous support and advice, not only during the course of my internship work but also during the period of my stay in GECH.

I express my gratitude towards my internship guide and internship head, Mrs. Pallavi H V, Associate Professor, Department of E&CE, for her encouragement and support throughout my work.

Finally, I express my thanks to all the teaching staff of the Department of E&CE, my fellow classmates and my parents for their timely support and suggestions.

I am conscious of the fact that I received co-operation in many ways from the teaching and non-teaching staff of the Department of Electronics and Communication Engineering, and I am grateful for their co-operation and guidance in completing this task well in time. I thank one and all who have helped me in one way or another in completing my internship on time.

NACHITHA Y K (4GH18EC025)

TABLE OF CONTENTS

ABSTRACT ................................................................................................................................ i

ACKNOWLEDGEMENT ............................................................................................................. ii

TABLE OF CONTENTS .............................................................................................................. iii

Chapter 1 : Company Profile ......................................................................................................... 1

1.1 Vision ................................................................................................................................ 1

1.2 Mission ............................................................................................................................... 1

1.3 Services .............................................................................................................................. 1

Chapter 2 : About the Department ............................................................................................ 2

Chapter 3 .................................................................................................................................. 3

Preamble ...................................................................................................................................... 3

3.1 Introduction ........................................................................................................................ 3

3.2 Methodology ...................................................................................................................... 4

Chapter 4 ........................................................................................................................................ 5

Overview of the project ................................................................................................ 9

4.1 Block Diagram .................................................................................................................... 9

4.2 Types Of Object Detection ...............................................................................................10

Chapter 5 ...................................................................................................................................... 15

Simulation results and analysis ..................................................................................................... 15

5.1 Single object detection ..........................................................................................................16

5.2 Results .................................................................................................................................. 19

Chapter 6 .................................................................................................................................... 20

6.1 Applications ........................................................................................................................ 20

6.2 Advantages and Disadvantages........................................................................................... 20

Conclusion ................................................................................................................................ 21

References ................................................................................................................................. 22


CHAPTER 1

COMPANY PROFILE
Loginware Softtec Pvt. Ltd. is an emerging startup established in the year 2016 and based in Hassan, a tier-II city of Karnataka state. Loginware is a knowledge-driven company that values cutting-edge technology practices and provides comprehensive solutions to help its customers achieve their goals. Loginware is changing the world by changing the way knowledge can be shared, and it has dedicated young minds striving to connect individuals with each other and with technology. Loginware Softtec Pvt. Ltd. is a proactive player covering the full spectrum of software services, from design, development, implementation and validation to support and corporate training.

Loginware is supported by a strong, committed team delivering quality work. As a diverse end-to-end solutions provider, it offers a range of expertise aimed at helping customers re-engineer and re-invent their businesses to compete successfully in an ever-changing marketplace, with the final objective of giving clients a competitive edge.

1.1 Vision

To be a leading technology company, transforming creative ideas into reality.

1.2 Mission
Bringing out the best in everyone we touch, motivate, inspire and empower each other
to do things they never thought were possible.

1.3 Services
Loginware is a one-stop partner for all the technology needs of tier-II and tier-III cities. An in-depth knowledge of various technology areas enables it to provide end-to-end solutions and services. With its 'Web of Participation', it maximizes the benefits of its depth, diversity and delivery capability.


CHAPTER 2

ABOUT THE DEPARTMENT


The training program designed and delivered by Loginware Softtec Pvt. Ltd. simulates a near-real work environment across various sectors, both IT and non-IT. Students in the final leg of their engineering studies, or qualified candidates looking for placement in reputed organizations, can make use of this program to get trained to deliver their best in the selection processes of organizations. The participants are trained thoroughly in the following areas:

1. Skilling, Up skilling and Reskilling program

2. People Proficiency

3. Seminars & Workshops

4. Internship Program

5. Project Guidance

6. Certification Oriented Training Programs

7. Placement Support

8. Recruitment drives and industry tie-ups


CHAPTER 3

PREAMBLE

3.1 INTRODUCTION

Over the past years, domains like image analysis and video analysis have gained a wide scope of applications. CV and AI are the two main technologies dominating the technical society. These technologies try to mimic the biology of human perception. Human vision is the sense through which a perception of the outer 3D world is formed, and human intelligence is trained over years to distinguish and process the scene captured by the eyes. These intuitions act as a crux for budding new technologies. Rich resources are now accelerating researchers to excavate more details from images, and these developments are due to state-of-the-art methods like CNN. Applications from Google, Facebook, Microsoft and Snapchat are all results of tremendous improvements in computer vision and deep learning. Over time, vision-based technology has transformed from just a sensing modality into intelligent computing systems which can understand the real world. Computer vision applications like vehicle navigation, surveillance and autonomous robot navigation find object detection and tracking to be important challenges. For tracking vehicles and other real-world objects, video surveillance provides a dynamic environment. In this work, an efficient algorithm is designed for object detection and tracking for video surveillance in a complex environment.

Object detection and tracking go hand in hand in computer vision applications. Object detection is identifying an object, or locating the instance of interest, in a group of suspected frames. Object tracking is identifying the trajectory or path the object takes across consecutive frames. The input obtained from the dataset is a collection of frames. The basic block diagram of object detection and tracking is shown in Fig. 4.1. The dataset is divided into two parts: 80% of the images are used for training and 20% for testing. An image is analyzed to find the objects in it using the CNN and YOLOv3 algorithms. A bounding box is formed around an object when the Intersection over Union (IoU) is greater than 0.5. The detected bounding boxes are sent as references to the neural networks, aiding them to perform tracking. A bounding box is tracked across consecutive frames using Multi Object Tracking (MOT). This work can be used to estimate traffic density at traffic junctions, in autonomous vehicles to detect various kinds of objects under varying illumination, and in smart city development and intelligent transport systems [18].

The rest of the report is organized as follows: Chapter 4 gives an overview of the project, covering the block diagram and the types of object detection. Chapter 5 presents the simulation results and analysis. Chapter 6 lists applications, advantages and disadvantages, followed by the conclusion and references.

3.2 METHODOLOGY

A. Convolutional Neural Networks (CNN)

CNN is a widely used neural network architecture for computer vision related tasks. An advantage of CNN is that it automatically performs feature extraction on images, i.e., the important features are detected by the network itself.

Fig. 3.1. Object Detection Tasks

Fig.3.2:Overview of CNN Architecture

A CNN is made up of three important components, called the convolutional layer, the pooling layer and the fully connected layer, as shown in Fig. 3.2. A grayscale image of size 32×32 would require 1024 input nodes in a multi-layer (fully connected) approach. This process of flattening pixels loses the spatial positions within the image. The spatial relationship between picture elements is retained by learning internal feature representations using small squares of the input data.

• Convolutional layer: The convolutional layer encompasses filters and feature maps. Filters are the processors of a particular layer and are distinct from one another. They take pixel values as input and give out a feature map, which is the output of one filter. The filter is traversed along the image, moving one pixel at a time; the activation of a few neurons takes place, resulting in a feature map.
• Pooling layer: The pooling layer is employed to reduce dimensionality. Pooling layers are included after one or two convolutional layers to generalize the features learnt from previous feature maps. This helps in reducing the chance of over-fitting during training.
• Fully connected layer: The fully connected layer is used at the end to map the extracted features to class probabilities, after features have been extracted and consolidated by the convolutional and pooling layers respectively. These layers use linear or softmax activation functions. A minimal sketch of such a network follows below.
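As an illustration only, the following is a minimal Keras sketch of such a network for a 32×32 grayscale input with three output classes, mirroring the convolution, pooling and fully connected structure described above; the filter counts and layer sizes are assumptions for the example, not the report's actual configuration.

```python
# Minimal CNN sketch: convolutional, pooling and fully connected layers.
# Filter counts and layer sizes are illustrative assumptions only.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(32, 32, 1), num_classes=3):
    model = models.Sequential([
        # Convolutional layer: filters slide over the image and produce feature maps
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        # Pooling layer: reduces spatial dimensionality and helps limit over-fitting
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        # Flatten the feature maps into one long feature vector
        layers.Flatten(),
        # Fully connected layers map the consolidated features to class probabilities
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    return model

model = build_cnn()
model.summary()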

B. You Only Look Once (YOLOv3)


YOLO versions 1 and 2 apply a softmax function to convert scores into probabilities. This approach is feasible only when object classes are mutually exclusive. YOLOv3 instead employs multi-label classification: an independent logistic classifier is used to calculate the likelihood of the input belonging to a specific label, and the loss is calculated using the binary cross-entropy of each label. Since the softmax function is omitted, complexity is reduced; a small sketch of this per-class scoring is given below.
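A minimal sketch of this idea, assuming a raw score (logit) vector from the network: each class gets an independent sigmoid probability and the loss is the sum of per-class binary cross-entropies rather than a single softmax. The logits and targets below are made-up example values.

```python
# Independent logistic (sigmoid) class scores with binary cross-entropy,
# as used by YOLOv3 in place of a softmax over classes. Values are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multilabel_class_loss(logits, targets):
    """Sum of per-class binary cross-entropies for one predicted box."""
    p = sigmoid(logits)
    eps = 1e-7
    return -np.sum(targets * np.log(p + eps) + (1 - targets) * np.log(1 - p + eps))

# Example with 3 of the classes shown: labels need not be mutually exclusive.
logits = np.array([2.0, -1.0, 0.5])     # raw class scores from the network
targets = np.array([1.0, 0.0, 1.0])     # multi-label ground truth
print(sigmoid(logits))                  # independent class probabilities
print(multilabel_class_loss(logits, targets))
```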

1) Optimization of Bounding Boxes: Using logistic regression, YOLOv3 predicts a score for the presence of an object. A ground-truth box is defined for every object; if an anchor box overlaps the most with a ground-truth box, its objectness score is set to 1. Anchor boxes whose overlap is greater than the preselected threshold, but which are not the best match, incur no cost. Every ground-truth box is mapped to only one anchor box. If an anchor box is not selected and assigned to a bounding box, no classification or localization loss is considered; only the confidence loss is calculated.


The anchor box is regressed to the ground truth box by gradual optimization, as shown in Fig. 3.3. The coordinate parameters are defined as
b_x = σ(t_x) + c_x        (1)

b_y = σ(t_y) + c_y        (2)

b_w = p_w · e^(t_w)       (3)

b_h = p_h · e^(t_h)       (4)

where t_x, t_y, t_w and t_h are the predictions made by YOLO, (c_x, c_y) is the top-left corner of the grid cell of the anchor, p_w and p_h are the width and height of the anchor, (b_x, b_y, b_w, b_h) is the predicted boundary box, and σ(t_o) is the box confidence score.
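The short sketch below applies equations (1)-(4) to decode one predicted box; the grid cell offset and anchor dimensions used here are made-up example values.

```python
# Decode a YOLOv3 box prediction (t_x, t_y, t_w, t_h, t_o) into an absolute box
# using equations (1)-(4). Anchor size and grid cell are example values.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_box(t, cell_xy, anchor_wh):
    t_x, t_y, t_w, t_h, t_o = t
    c_x, c_y = cell_xy          # top-left corner of the grid cell
    p_w, p_h = anchor_wh        # anchor (prior) width and height
    b_x = sigmoid(t_x) + c_x    # eq. (1)
    b_y = sigmoid(t_y) + c_y    # eq. (2)
    b_w = p_w * np.exp(t_w)     # eq. (3)
    b_h = p_h * np.exp(t_h)     # eq. (4)
    confidence = sigmoid(t_o)   # box confidence score
    return (b_x, b_y, b_w, b_h), confidence

print(decode_box(t=(0.2, -0.1, 0.4, 0.1, 1.5), cell_xy=(6, 4), anchor_wh=(3.6, 2.8)))
```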

2) Feature Pyramid Network (FPN): YOLOv3 makes three predictions at every grid cell of the image. Each prediction includes a bounding box, an objectness score and 80 class scores, hence there are S×S×[3×(4+1+80)] predictions per scale; for example, a 13×13 grid yields 13 × 13 × 3 × 85 = 43,095 predicted values. This approach is similar to feature pyramid networks.

Fig.3.3:Anchor Box Regression

Fig.3.4: Feature Pyramid Network


Predictions are made at three different scales, as in Fig. 3.4. The initial prediction is made at the last feature map layer. The feature map is then upsampled by a factor of 2, and YOLOv3 merges it with an earlier feature map using element-wise addition. A convolutional layer is applied to the merged map to obtain the second prediction. Repeating this step for a third prediction yields feature maps with high semantic information.
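As a rough illustration of the merge step described above (assuming element-wise addition, as stated here), a coarse feature map is upsampled by a factor of 2 and added to a finer map after a 1×1 convolution matches the channel count; the shapes and channel counts are arbitrary example values, not the report's network dimensions.

```python
# FPN-style merge sketch: upsample the coarse map by 2 and add it element-wise
# to the finer map. Shapes and channel counts are arbitrary examples.
import tensorflow as tf

coarse = tf.random.normal((1, 13, 13, 256))   # deeper, semantically rich map
fine = tf.random.normal((1, 26, 26, 512))     # earlier, higher-resolution map

up = tf.keras.layers.UpSampling2D(size=2)(coarse)                 # 13x13 -> 26x26
fine_reduced = tf.keras.layers.Conv2D(256, 1)(fine)               # match channels
merged = tf.keras.layers.Add()([up, fine_reduced])                # element-wise addition
prediction_head = tf.keras.layers.Conv2D(255, 3, padding="same")(merged)  # 3*(4+1+80)

print(merged.shape, prediction_head.shape)  # (1, 26, 26, 256) (1, 26, 26, 255)
```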

Two-stage algorithms from the Region Proposal Network family use two different networks for proposing regions and extracting features. The frame rate of RCNN is about 7 FPS, which is too low for real-time applications. One-stage algorithms overcome this drawback by employing single-shot detection. Single-shot detectors face a trade-off between accuracy and real-time processing, and they have issues in identifying small objects or objects that are too close together. Though SSD networks are as popular as YOLO and may outperform YOLO in terms of speed, their spatial resolution drops significantly, and hence they miss small objects; one solution to this challenge is increasing the image resolution. The YOLO family keeps upgrading its accuracy and latency. YOLOv3 has DarkNet-53 as its backbone, a network with fewer BFLOPs (billion floating-point operations) than ResNet-152. The inclusion of the Feature Pyramid Network (FPN) helps in detecting small objects. FPN uses both a bottom-up and a top-down pathway. The bottom-up pathway is used for feature extraction; as we propagate through it, the spatial resolution decreases while the semantic value of each layer increases.

C. Object Tracking
The Internet is the main network connecting millions of people in the world, and video is a major source of entertainment and knowledge. A video is a collection of frames; the negligible time gap between frames makes the stream of images look like a continuous flow of scenes. When designing algorithms for video processing, videos are classified into two classes. A video stream is an ongoing feed for video analysis, in which the processor is not aware of future frames. A video sequence is a video of fixed length, in which all the consecutive frames are available before the current frame is processed. Motion is the distinct factor that differentiates video from a single frame. Motion is a powerful visual cue: object properties and actions can be realized by observing only sparse points in the image.


D. Simple Online Real Time Tracking (SORT)


SORT is a pragmatic approach to achieving Multi Object Tracking (MOT). The performance of SORT is enhanced by cues such as appearance; associating appearance information with SORT increases its performance in scenarios such as longer periods of occlusion. SORT is a framework that has Kalman filtering at its crux. Frame-by-frame data association is achieved by the Hungarian method over an association metric, such as bounding box overlap or appearance similarity.

1) Track Handling and State Estimation: The assignment problem maps the predictions of the Kalman filter to the newly arrived measurements (detections). The task of associating the two sets is performed by the Hungarian algorithm. Adding additional information, such as motion and appearance parameters, to the association yields better mappings.

The motion metric is the squared Mahalanobis distance between a predicted Kalman state and a new measurement:

d^(1)(i, j) = (d_j − y_i)^T S_i^{-1} (d_j − y_i)        (5)

where y_i and S_i are the mean and covariance of the i-th track's predicted state and d_j is the j-th detection. Unlikely associations can be removed by thresholding this distance at a 95% confidence interval. The decision is given with an indicator

b^(1)(i, j) = 1[ d^(1)(i, j) ≤ t^(1) ]        (6)

When the motion uncertainty is large, the Mahalanobis distance alone is not suitable, hence another metric aids the association. This metric computes an appearance descriptor r_j for each bounding box detection d_j and takes the smallest cosine distance to the appearance descriptors stored for track i:

d^(2)(i, j) = min { 1 − r_j^T r_k^(i) : r_k^(i) ∈ R_i }        (7)

The combination of both metrics is a weighted sum:

c(i, j) = λ·d^(1)(i, j) + (1 − λ)·d^(2)(i, j)        (8)
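A minimal sketch of the assignment step follows: a cost matrix (here simply 1 − IoU between Kalman-predicted track boxes and new detections; a combined motion/appearance cost like equation (8) could be used instead) is solved with the Hungarian algorithm via scipy. The boxes below are made-up example values.

```python
# Hungarian assignment of detections to predicted tracks, as in SORT.
# Cost here is 1 - IoU; a combined cost like eq. (8) could be substituted.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

# Example: Kalman-predicted track boxes and newly detected boxes (made-up values).
tracks = [(10, 10, 50, 50), (100, 100, 160, 160)]
detections = [(105, 98, 158, 162), (12, 11, 49, 52), (300, 300, 340, 340)]

cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
row_idx, col_idx = linear_sum_assignment(cost)

for t_i, d_j in zip(row_idx, col_idx):
    if cost[t_i, d_j] < 0.7:          # gate out unlikely associations
        print(f"track {t_i} matched to detection {d_j} (IoU={1 - cost[t_i, d_j]:.2f})")
```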


CHAPTER 4
OVERVIEW OF THE PROJECT
4.1 BLOCK DIAGRAM

Fig 4.1: Block Diagram of Object and Tracking

There is a wide range of computer vision tasks benefiting society, such as object classification, detection, tracking, counting, semantic segmentation, image captioning, etc. The process of identifying objects in an image and finding their positions is known as object detection.

The various object detection tasks are shown in Fig. 3.1. With advancements in the field of computer vision assisted by AI, these tasks have become realizable on practical time scales. Semantic segmentation is the task of clustering pixels based on similarities. Classification + localization and object detection are methods of identifying the class of an object and drawing a bounding box around it to make it distinct. Instance segmentation is semantic segmentation applied to multiple objects. The general intuition for performing these tasks is to apply a CNN over the image. The CNN works on image patches; many such salient regions can be obtained by region-proposal networks like the Region Convolutional Neural Network (RCNN), Fast-RCNN and Faster-RCNN. To perform selective search for object recognition, a hierarchical grouping algorithm is used. A few bottlenecks of these approaches are mitigated by state-of-the-art algorithms like You Only Look Once (YOLO) and the Single Shot Detector (SSD). An efficient object detection algorithm is one which assigns a bounding box to all objects of varying size, with good computational capability and fast processing. YOLO and SSD render promising results, but have a trade-off between speed and accuracy; hence, the selection of the algorithm is application-specific.

4.2 TYPES OF OBJECT DETECTION

• Single Object Detection


Fig. 4.2 shows the flow chart of single object detection. The necessary libraries are imported first and the training data is given as input via Google Drive. Google Colab, an online simulation tool for Python and TensorFlow algorithms, was used. The algorithm then compiles the data and learns from it in a supervised manner.

Fig 4.2: Flow Chart Of Single Object detection


This algorithm can be described as a supervised classification algorithm. Data flows through the CNN layers and various operations are performed on it. The learning rate and callbacks are defined, along with the number of epochs and the batch size. The epochs are then executed, through which the algorithm learns from the training data. The training accuracy and training loss are constantly monitored; if the training accuracy falls below a threshold, the callback function is invoked and the epochs are stopped. A confusion matrix is then plotted using the training and testing data, and various performance parameters can be defined and observed from it. A minimal training sketch along these lines is given below.

Single object detection

Dataset used – On-Road Vehicle Dataset

Images used for single object detection:

Number of classifiers        : Three (Autos, Heavy, Light)
Total number of input images : 12480
Training images              : 9984
Testing images               : 2496
Day images                   : 7590
Evening images               : 4518
Night images                 : 372

Fig 4.3: Sample of Evening Images


Fig 4.4: Sample Of Day Images

Fig. 4.5: Sample Of Night Images


• Multiple Object Detection


This section describes the working of the YOLOv3 multiple object detection algorithm. An image is given as input to the algorithm and transformations are applied using the CNN so that the input image is compatible with the specifications of the algorithm. Following this, a flattening operation is performed. Flattening converts the data into a one-dimensional array for input to the next layer; the output of the convolutional layers is flattened to create a single long feature vector, which is connected to the final classification model, called the fully connected layer. By changing the score threshold, one can adjust how the model assigns these labels.

The object detection pipeline has one component for generating proposals for classification. Proposals are simply candidate regions for the object of interest. Most approaches employ a sliding window over the feature map and assign foreground/background scores depending on the features computed in that window. Neighboring windows have similar scores to some extent and are considered candidate regions, which leads to hundreds of proposals. As the proposal generation stage should have high recall, loose constraints are kept at this stage. However, processing so many proposals through the classification network is cumbersome. This leads to a technique that filters proposals based on a criterion, called Non-Maximum Suppression (NMS); the IoU calculation is used to measure the overlap between two proposals. A small sketch of IoU and NMS is given below.
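The sketch below applies IoU-based Non-Maximum Suppression to a list of scored boxes; the boxes, scores and threshold are made-up example values, not outputs of the report's detector.

```python
# Non-Maximum Suppression sketch: keep the highest-scoring box and drop any
# remaining proposal whose IoU with it exceeds a threshold. Example values only.
import numpy as np

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def nms(boxes, scores, iou_threshold=0.5):
    order = np.argsort(scores)[::-1]          # process highest score first
    keep = []
    while len(order) > 0:
        best = order[0]
        keep.append(best)
        # Discard proposals overlapping the kept box by more than the threshold.
        order = np.array([i for i in order[1:]
                          if iou(boxes[best], boxes[i]) <= iou_threshold])
    return keep

boxes = [(10, 10, 60, 60), (12, 12, 58, 62), (100, 100, 150, 150)]
scores = [0.9, 0.75, 0.8]
print(nms(boxes, scores))   # -> [0, 2]: the overlapping second proposal is suppressed
```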

Number of classifiers        : 80
Classifiers used             : 5 (Car, Bus, Truck, Motor Cycle and Train)
Total number of input images : 11682
Training images              : 9736
Testing images               : 1946


Fig 4.6 : Flow Chart Of Multiple Object Detection

Fig 4.7:Sample Images of KITTI Object Detection Dataset


CHAPTER 5
SIMULATION RESULTS AND ANALYSIS
This chapter describes the simulation results; the performance parameters observed are accuracy, precision and recall. It also presents the confusion matrices for the different datasets and the convolution layers of the algorithms.
5.1 SINGLE OBJECT DETECTION
A CNN is designed for single object detection. The layers, and the information about each layer, are shown in Fig. 5.1.

The figure encompasses the parameters included in each step, the layer progression and the output image size of every layer. Each layer operates on the image matrix and performs an operation on the image. The output image size differs from layer to layer due to the manipulations performed: initially the output image size is 28×28, which reduces to 14×14 after the first max pooling layer, which chooses the maximum-valued pixel from the surrounding pixels. It then reduces to 7×7 after the second max pooling layer. This output is then flattened into a 7×7×64 = 3136-element vector, which is reduced to a smaller vector by the succeeding layers, and the final parameter counts are displayed.

The designed neural network was trained and tested. The obtained training accuracy and loss are shown in Fig. 5.2 and Fig. 5.3; a training accuracy of 82% was obtained for this model. The loss and accuracy move in opposite directions: as the number of epochs increases, the model learns further and the loss decreases. Each time an epoch is run, the model trains itself and the weights of the convolutional network are updated to more accurate values.

The CNN is able to correctly classify the given objects as truck and car with confidence scores of 75.68% and 84.409% respectively, as shown in Fig. 5.4. Upon simulation it also classifies the vehicles correctly, identifying a car with 79.853% confidence and an auto with about 78.122% confidence, as shown in Fig. 5.5, and a car with about 79.036% confidence and an auto with about 80.064% confidence, as shown in Fig. 5.6.


Fig 5.1: Convolution Layers used in CNN.

Accuracy and loss

Fig 5.2 : Training Accuracy


Fig 5.3 : Training Loss.


Fig 5.4 : Sample Simulation Results of Day Images.

Fig 5.5 : Sample Simulation Results of Evening Images.

Fig 5.6 : Sample Simulation Results of Night Images.


The confusion matrix for the day images is tabulated in Table 5.1. The performance parameters extracted from the confusion matrix are tabulated in Table 5.2, and the accuracy, precision and recall for autos, cars and heavy-type vehicles are shown in Fig. 5.7; a small computation sketch follows after the tables and figure below. The accuracy for autos and cars is almost identical, while that for heavy vehicles is slightly better. Since the number of training images is largest for the day images, the results obtained are better than those for the evening and night datasets. High precision indicates that the algorithm returned substantially more relevant results than irrelevant ones, while high recall means that the algorithm returned most of the relevant results.

Actual \ Predicted    Autos    Cars    Heavy    All
Autos                  1456      28       22    1506
Cars                     16    1480       10    1506
Heavy                    42      34     1430    1506
Table 5.1 : Confusion Matrix For Day Images

Class     TP      TN     FP     FN    Precision    Accuracy    Recall
Auto     2369    4918    161    161      0.936       0.957      0.936
Cars     2413    4853    206    136      0.921       0.955      0.946
Heavy    2346    4965    114    184      0.953       0.960      0.927

Table 5.2 : Performance Metrics Of Day Images

Fig 5.7 : Performance Analysis of Day Images.
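As an illustration of how such per-class metrics derive from a confusion matrix, the sketch below computes one-vs-rest TP, TN, FP, FN, precision, accuracy and recall for each class. It is a generic computation using the day-image confusion matrix structure as example input, not a reproduction of the exact values tabulated above.

```python
# Derive per-class TP/TN/FP/FN, precision, accuracy and recall from a
# confusion matrix (rows: actual class, columns: predicted class).
import numpy as np

def per_class_metrics(cm, class_names):
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    for k, name in enumerate(class_names):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp          # actual k predicted as something else
        fp = cm[:, k].sum() - tp          # other classes predicted as k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        accuracy = (tp + tn) / total
        print(f"{name}: precision={precision:.3f} accuracy={accuracy:.3f} recall={recall:.3f}")

# Example using the day-image confusion matrix from Table 5.1 (Autos, Cars, Heavy).
cm_day = [[1456, 28, 22],
          [16, 1480, 10],
          [42, 34, 1430]]
per_class_metrics(cm_day, ["Autos", "Cars", "Heavy"])
```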


5.2 RESULTS

Fig 5.8 : Result


CHAPTER 6

6.1 APPLICATIONS

1. Vehicle detection with AI in Transportation.

2. Object detection in Retail.

3. Autonomous Driving.

4. People detection in Security.

5. Animal detection in Agriculture.

6.2 ADVANTAGES AND DISADVANTAGES

1. Image processing techniques generally don't require historical data for training and are unsupervised in nature.

Pros: These techniques do not require annotated images in which humans have labeled the data manually (as is needed for supervised training).

Cons: These techniques are restricted by multiple factors, such as complex scenes (without a uniform background), occlusion (partially hidden objects), illumination and shadows, and clutter.

2. Deep learning methods generally depend on supervised training. Their performance is limited by the computational power of GPUs, which is rapidly increasing year by year.

Pros: Deep learning object detection is significantly more robust to occlusion, complex scenes and challenging illumination.

Cons: A huge amount of training data is required, and the process of image annotation is labor-intensive and expensive. For example, a set of 500,000 labeled images for training a custom DL object detection algorithm is still considered small. However, many benchmark datasets (MS COCO, Caltech, KITTI, PASCAL VOC, V5) provide labeled data.


CONCLUSION
The inclusion of Artificial Intelligence to solve computer vision tasks has outperformed pure image processing approaches to the same tasks. The CNN model trained on the on-road vehicle dataset for single object detection achieved a validation accuracy of 95.7% for autos, 95.5% for cars and 96% for heavy vehicles on day images. The high validation accuracy is due to the large amount of data from each class on which the model is trained. Performance metrics are tabulated for day, evening and night images. Multiple object detection is implemented using YOLOv3 on the KITTI and COCO datasets, and performance metrics are tabulated for YOLOv3 on the considered classes of images. The higher the precision value of a class, the greater its mAP value will be; the mAP value depends on the images chosen for calculation. An IoU of 0.5 is used as the threshold for detection and tracking, and mAP values can be enhanced by increasing the number of true positives. The results of the performance metrics depend entirely on the image dataset used. Further, objects are detected in video based on a region of interest (ROI); the measures obtained include the speed, color and type of each vehicle, the direction of vehicle movement and the number of vehicles in the ROI. Multiple object tracking is implemented for a traffic surveillance video using YOLOv3 and OpenCV, and multiple objects are detected and tracked across different frames of the video. Future work includes training the models on more powerful GPUs with a larger number of images, evaluating the models on other datasets, and modifying the design if required to make the models more robust and suitable for real-time applications.


REFERENCES
• V. D. Nguyen et al., “Learning Framework for Robust Obstacle Detection, Recognition, and Tracking”, IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 6, pp. 1633-1646, June 2017.
• Zahraa Kain et al., “Detecting Abnormal Events in University Areas”, 2018 International Conference on Computer and Applications (ICCA), pp. 260-264, 2018.
• P. Wang et al., “Detection of Unwanted Traffic Congestion Based on Existing Surveillance System Using in Freeway via a CNN-Architecture TrafficNet”, IEEE Conference on Industrial Electronics and Applications (ICIEA), Wuhan, 2018, pp. 1134-1139.
• Q. Mu, Y. Wei, Y. Liu and Z. Li, “The Research of Target Tracking Algorithm Based on an Improved PCANet”, 10th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), Hangzhou, 2018, pp. 195-199.
• H. C. Baykara et al., “Real-Time Detection, Tracking and Classification of Multiple Moving Objects in UAV Videos”, 29th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), Boston, MA, 2017, pp. 945-950.
• W. Wang, M. Shi and W. Li, “Object Tracking with Shallow Convolution Feature”, 9th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), Hangzhou, 2017, pp. 97-100.
• K. Muhammad et al., “Convolutional Neural Networks Based Fire Detection in Surveillance Videos”, IEEE Access, vol. 6, pp. 18174-18183, 2018.
• D. E. Hernandez et al., “Cell Tracking with Deep Learning and the Viterbi Algorithm”, International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS), Nagoya, 2018, pp. 1-6.
