Real Time Object Detection Using Deep Learning
https://doi.org/10.22214/ijraset.2022.45355
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue VII July 2022- Available at www.ijraset.com
Abstract: Visually impaired people have difficulty moving safely and independently, which interferes with normal indoor and outdoor work and social activities. They likewise have a hard time identifying basic elements of their environment. This paper presents a model for detecting the brightness and key colors of real-time images using the RGB method with an external camera, and for identifying basic objects and recognising faces from human datasets [2]. Object detection is a branch of computer vision that looks for instances of semantic objects in images and videos. The system uses the ESP32-CAM's camera to continuously capture frames, which are subsequently converted to audio segments. In this project we use the You Only Look Once v3 (YOLOv3) algorithm, which runs on a deep Convolutional Neural Network architecture with OpenCV. Google Text to Speech is then used to convert the image annotations to text and the text to speech, so that the visually impaired person receives the locations of the objects in the camera's view as audio. Distance calculation is aided by an ultrasonic sensor. The collected results show that the proposed prototype succeeds in providing visually impaired users with the ability to perceive unfamiliar settings through a user-friendly system that integrates this object detection model [1].
I. INTRODUCTION
A large number of people in this world live with a limited ability to perceive their surroundings owing to visual impairment. Even though they can develop alternative ways to cope with and manage daily routines, they experience particular navigation problems as well as social awkwardness. For example, it is difficult for them to find a particular room in a new environment. Furthermore, blind and visually impaired people find it hard to tell whether a person is talking to them or to someone else.
Object recognition has been a noteworthy direction and focus of computer vision research, with applications in autonomous vehicles, robotics, video surveillance and pedestrian detection. The emergence of deep learning technology has changed the traditional approach to object identification and recognition. Deep neural networks have a powerful capacity for learning feature representations of images and are commonly used as the feature extraction module of object recognition systems. Deep learning models need no handcrafted features and can be designed to act as both classifier and regressor; deep learning technology is therefore central to object recognition. The object detection problem is to determine where an object is located in a given frame (object localization) and to recognise what it is. The pipeline of a traditional object recognition model is thus mainly divided into three stages: candidate region selection, feature extraction, and classification.
IV. METHODOLOGY
The first step in using the ESP32-CAM together with TensorFlow.js is to identify the elements that make up the web page where the detection takes place. To use the TensorFlow JavaScript library we follow these steps: first import the TensorFlow.js libraries, then load the model (in this project the pre-trained COCO-SSD model is used), and finally create labels for the processed objects, which are displayed on the input video by drawing rectangles around the objects recognised by the COCO-SSD model, as sketched below.
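A minimal sketch of this flow with the @tensorflow-models/coco-ssd package is given below; the element ids (esp32-stream, overlay) and the overall page structure are assumptions for illustration, not part of the original implementation.

```typescript
import '@tensorflow/tfjs-backend-webgl';
import * as cocoSsd from '@tensorflow-models/coco-ssd';

// Assumed page elements: an <img> showing the ESP32-CAM stream and a
// <canvas> laid over it for drawing boxes (both ids are hypothetical).
const stream = document.getElementById('esp32-stream') as HTMLImageElement;
const overlay = document.getElementById('overlay') as HTMLCanvasElement;
const ctx = overlay.getContext('2d')!;

async function run(): Promise<void> {
  const model = await cocoSsd.load(); // load the pre-trained COCO-SSD model

  const detectLoop = async () => {
    // Each prediction has the shape { bbox: [x, y, w, h], class, score }.
    const predictions = await model.detect(stream);
    ctx.clearRect(0, 0, overlay.width, overlay.height);
    for (const p of predictions) {
      const [x, y, w, h] = p.bbox;
      ctx.strokeStyle = 'lime';
      ctx.strokeRect(x, y, w, h); // rectangle around the detected object
      ctx.fillStyle = 'lime';
      ctx.fillText(`${p.class} ${(p.score * 100).toFixed(0)}%`, x, y - 4); // class label
    }
    requestAnimationFrame(detectLoop); // keep detecting, frame by frame
  };

  detectLoop();
}

run();
```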
A. Components
1) ESP32-CAM
Fig. 2. ESP32-CAM
The board is powered by an ESP32-S SoC from Espressif, a powerful, programmable MCU with out-of-the-box WiFi and Bluetooth. It is the cheapest (around $7) ESP32 development board that offers an onboard camera module, MicroSD card support and 4 MB of PSRAM at the same time. Adding an external WiFi antenna for signal boosting requires extra soldering.
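As an illustration of how the board's video feed can reach the detection page: the stock CameraWebServer firmware exposes an MJPEG stream on port 81, and the IP address below is only a placeholder for whatever address the board is assigned on the local network.

```typescript
// Point the page's image element at the ESP32-CAM MJPEG stream.
// The URL assumes the stock CameraWebServer firmware; the IP address
// is a placeholder and depends on the local network.
const ESP32_STREAM_URL = 'http://192.168.1.50:81/stream';

const streamImg = document.getElementById('esp32-stream') as HTMLImageElement;
streamImg.crossOrigin = 'anonymous'; // lets TF.js read the pixels, provided the board sends CORS headers
streamImg.src = ESP32_STREAM_URL;
```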
2) Ultrasonic Sensor
Fig. 3. Ultrasonic Sensor
An HC-SR04 ultrasonic sensor is used in this project. The distance to the object is calculated from the time delay between transmitting an ultrasonic pulse and receiving its echo.
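The underlying calculation is the usual time-of-flight formula: the echo travels to the object and back, so distance = (echo duration × speed of sound) / 2. A small sketch of the conversion (the function name and the choice of units are illustrative):

```typescript
// Convert an HC-SR04 echo pulse width (microseconds) to distance in centimetres.
// Sound travels at roughly 0.0343 cm/us in air; the pulse covers the distance
// twice (out and back), hence the division by 2.
function echoToDistanceCm(echoDurationUs: number): number {
  const SPEED_OF_SOUND_CM_PER_US = 0.0343;
  return (echoDurationUs * SPEED_OF_SOUND_CM_PER_US) / 2;
}

// Example: an echo of about 1160 us corresponds to roughly 20 cm.
console.log(echoToDistanceCm(1160).toFixed(1)); // ~19.9
```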
3) FTDI232 Module
Fig. 4. FTDI232
FTDI USB-to-TTL serial converter modules are used for general serial applications. They are popularly used for communication to and from microcontroller development boards such as the ESP-01s and Arduino micros, which do not have USB interfaces. In this project the FTDI232 module is used to program the ESP32-CAM, which likewise lacks an onboard USB interface.
V. PROPOSED SYSTEM
The system converts the image to text and then the text to speech using the COCO-SSD model, which runs through a Convolutional Neural Network architecture, together with TensorFlow.js and Google Text to Speech. The annotated text is converted into audio responses that give the location of the objects in the camera's view, as sketched below. The system continuously captures multiple frames using the camera on the ESP32-CAM, and the frames are then converted to audio segments. The ultrasonic sensor measures the distance of the object from the device. For better communication a small external antenna is added, which provides better WiFi range and stability.
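As a sketch of the speech step: the paper uses Google Text to Speech, while the snippet below substitutes the browser's built-in Web Speech API as a stand-in, and the spoken sentence format is purely illustrative.

```typescript
// Turn a detection result into a spoken sentence.
// Uses the browser's SpeechSynthesis API as a stand-in for Google Text to Speech.
function announceDetection(label: string, distanceCm?: number): void {
  const sentence =
    distanceCm !== undefined
      ? `${label} detected, about ${Math.round(distanceCm)} centimetres ahead`
      : `${label} detected`;
  const utterance = new SpeechSynthesisUtterance(sentence);
  utterance.lang = 'en-US';
  window.speechSynthesis.speak(utterance);
}

// Example: announce a chair reported by the detector at roughly 80 cm.
announceDetection('chair', 80);
```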
VII. RESULT
The performance of the object detection model is evaluated in terms of the precision and recall of the predicted bounding boxes against the known objects in the images.
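A common way of deciding whether a predicted bounding box counts as a correct detection, and hence contributes to precision and recall, is its intersection over union (IoU) with the ground-truth box. The sketch below uses the conventional 0.5 threshold, which is an assumption rather than a value stated in this paper.

```typescript
// Boxes in [x, y, width, height] form, matching the COCO-SSD bbox format.
type Box = [number, number, number, number];

// Intersection over union between a predicted and a ground-truth box.
function iou(a: Box, b: Box): number {
  const x1 = Math.max(a[0], b[0]);
  const y1 = Math.max(a[1], b[1]);
  const x2 = Math.min(a[0] + a[2], b[0] + b[2]);
  const y2 = Math.min(a[1] + a[3], b[1] + b[3]);
  const inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
  const union = a[2] * a[3] + b[2] * b[3] - inter;
  return union > 0 ? inter / union : 0;
}

// A prediction is usually counted as a true positive when IoU >= 0.5.
console.log(iou([10, 10, 100, 100], [20, 20, 100, 100]) >= 0.5); // true (IoU ~ 0.68)
```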
VIII. CONCLUSION
The proposed model will prove to be highly beneficial for visually impaired (VI) people. Further refinements of the project will bring even more accurate results while preserving its main goal of being cheap and user-friendly.
REFERENCES
[1] Ferdousi Rahman, Israt Jahan Ritun, Nafisa Farhin, and Jia Uddin, "An Assistive Model for Visually Impaired People using YOLO and MTCNN," Proceedings of the 3rd International Conference on Cryptography, Security and Privacy (ICCSP '19), 2019.
[2] Zhong-Qiu Zhao, Peng Zheng, Shou-Tao Xu, and Xindong Wu, "Object Detection With Deep Learning: A Review," IEEE Transactions on Neural Networks and Learning Systems, PP(99), pp. 1-21, January 2019.
[3] Ce Li, Yachao Zhang, and Yanyun Qu, "Object Detection Based on Deep Learning of Small Samples," International Conference, pp. 1-6, March 2018.
[4] Cong Tang, Yunsong Feng, Xing Yang, Chao Zheng, and Yuanpu Zhou, "The Object Detection Based on Deep Learning," International Conference, pp. 1-6, 2017.
[5] Christian Szegedy, Alexander Toshev, and Dumitru Erhan, "Deep Neural Networks for Object Detection," Advances in Neural Information Processing Systems (NIPS), pp. 1-9, 2013.
[6] Xiaogang Wang, "Deep Learning in Object Recognition, Detection, and Segmentation," IEEE, pp. 1-40, Apr. 2014.
[7] Shuai Zhang, Chong Wang, and Shing-Chow Chan, "New Object Detection, Tracking, and Recognition Approaches for Video Surveillance Over Camera Network," IEEE Sensors Journal, vol. 15, no. 69, pp. 1-13, May 2015.
[8] Malay Shah and Rupal Kapdi, "Object Detection Using Deep Neural Networks," International Conference, IEEE, pp. 1-4, 2017.
[9] Xiao Ma, Ke Zhou, and Jiangfeng Zheng, "Photo Realistic Face Age Progression/Regression Using a Single Generative Adversarial Network," Neurocomputing, Elsevier B.V., pp. 1-16, July 2019.
[10] Zhong-Qiu Zhao, Peng Zheng, Shou-Tao Xu, and Xindong Wu, "Object Detection with Deep Learning: A Review," IEEE, pp. 1-21, 2019.
[11] Sandeep Kumar, Aman Balyan, and Manvi Chawla, "Object Detection and Recognition in Images," IJEDR, pp. 1-6, 2017.
[12] Adami Fatima Zohra, Salmi Kamilia, Abbas Faycal, and Saadi Souad, "Detection And Classification Of Vehicles Using Deep Learning," International Journal of Computer Science Trends and Technology (IJCST), vol. 6, pp. 1-7, 2018.