Image Detection and Segmentation Using YOLO v5 For
DOI: 10.54254/2755-2721/8/20230109
mohanapriyas.cse@kongu.edu
Abstract. Segmentation is an advancement of object detection: object detection places bounding
boxes around objects, whereas segmentation classifies every pixel in the given image. In deep
learning, the YOLOv5 algorithm can be used to perform segmentation on the given data. Using the
YOLOv5 algorithm, objects are detected and classified by surrounding them with bounding boxes.
Compared to the existing algorithms for segmentation, the YOLOv5 algorithm has improved time
complexity and accuracy. In this paper, the YOLOv5 algorithm is compared with the existing CNN
algorithm.
1. Introduction
© 2023 The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0
(https://creativecommons.org/licenses/by/4.0/).
Proceedings of the 2023 International Conference on Software Engineering and Machine Learning
information may be preserved. The final findings suggest that the upgraded YOLO V5 approach
enhances performance.[3]
1.3. Segmentation
Segmentation, an advancement of object detection, is used to detect and classify objects in an image.
Instance segmentation is applied in many real-time applications such as self-driving cars, agriculture,
and medical systems. The CNN, one of the important object detection frameworks, is used for detecting
objects in an image. Object detection and segmentation frameworks were developed based on the CNN
algorithm. One such detection algorithm is the YOLOv5 algorithm, which has been reported to be a
state-of-the-art algorithm for segmenting objects in an image [4].
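The difference between a detection output (a bounding box) and a segmentation output (a per-pixel label) can be illustrated with a minimal sketch. The 6x6 mask and the helper names `mask_to_box` and `box_area` are hypothetical and chosen only for illustration; they are not part of YOLOv5.

```python
# Toy 6x6 label map (hypothetical data): 0 = background, 1 = object pixel.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def mask_to_box(mask):
    """Return the tightest (x_min, y_min, x_max, y_max) box around object pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), min(ys), max(xs), max(ys)


def box_area(box):
    """Number of pixels covered by an inclusive (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1) * (y_max - y_min + 1)


box = mask_to_box(mask)                           # detection-style output
object_pixels = sum(sum(row) for row in mask)     # segmentation-style output

print("bounding box:", box)
print("pixels inside box:", box_area(box))
print("pixels labelled as object:", object_pixels)
```

The box covers more pixels than the mask labels as object, which is the extra precision that pixel-level segmentation provides over a bounding box.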
2. Literature review
Image segmentation has received considerable attention recently as a successful application of
categorising objects and applying masks to the items present in an image. One survey covers over a
hundred deep learning-based segmentation algorithms proposed through 2019 and includes the most
recent research on instance segmentation [5]. It provides a thorough examination and analysis of
several elements of these approaches, including training data, network architecture selection, loss
functions, training strategies, and major contributions. It also compares the performance of the
approaches under consideration and discusses open challenges and possible future directions for deep
learning-based instance segmentation models. YOLO established a single unified architecture for
dividing an image into bounding boxes and calculating class probabilities for each box, in
contrast to earlier object detection approaches such as R-CNN. As a result,
YOLO was able to run significantly faster and with greater precision. It can also make accurate
predictions on artwork [6].
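The idea of predicting boxes and class probabilities in a single pass can be sketched as follows. This follows the original YOLO formulation, in which the image is divided into an S x S grid, the cell containing a box centre is responsible for that box, and each class score is the box confidence multiplied by the conditional class probability. It is not the YOLOv5 implementation, and the function names, grid size, and numbers are illustrative assumptions.

```python
def responsible_cell(cx, cy, img_w, img_h, s=7):
    """Return the (row, col) of the S x S grid cell that contains the box centre."""
    col = min(int(cx / img_w * s), s - 1)
    row = min(int(cy / img_h * s), s - 1)
    return row, col


def class_scores(box_confidence, class_probs):
    """Class-specific score = box confidence * conditional class probability."""
    return {name: box_confidence * p for name, p in class_probs.items()}


# Hypothetical prediction for one box in a 640x480 image.
cell = responsible_cell(cx=320, cy=240, img_w=640, img_h=480)
scores = class_scores(0.5, {"dog": 0.75, "cat": 0.25})
best_class = max(scores, key=scores.get)

print("responsible cell:", cell)
print("class scores:", scores)
print("predicted class:", best_class)
```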
In work on object detection that aims to construct a general object recognition network, complex
degradation methods including noise, blurring, rotation, and cropping were applied to images. The
study found that the model's generalisation and robustness on degraded images were weak when it
was trained on standard sets. Employing degraded training sets during training improved the model's
generalisation and robustness, and average accuracy increased after training with degraded images.
The general degradation model outperformed the standard model in terms of average accuracy on
degraded images.
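The four degradations named above (noise, blurring, rotation, cropping) can be sketched on a small grayscale image stored as nested lists. This is only an illustration under simple assumptions (Gaussian noise, a 3x3 box blur, a 90-degree rotation); the cited study's actual degradation parameters are not given here, and all function names are hypothetical.

```python
import random


def add_noise(img, sigma, rng):
    """Add Gaussian noise to each pixel and clamp the result to 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]


def box_blur(img):
    """Replace each pixel by the mean of its 3x3 neighbourhood (clipped at edges)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(round(sum(vals) / len(vals)))
        out.append(row)
    return out


def rotate90(img):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*img[::-1])]


def crop(img, top, left, height, width):
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


# Hypothetical 4x4 grayscale image.
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

rng = random.Random(0)  # fixed seed so the noisy sample is reproducible
degraded_set = [
    add_noise(image, sigma=10, rng=rng),
    box_blur(image),
    rotate90(image),
    crop(image, top=1, left=1, height=2, width=2),
]
for sample in degraded_set:
    print(sample)
```

Degraded copies such as these would be added to the training set alongside the original images.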
In work on the YOLO network model, an improved network structure known as YOLO-R was proposed
to boost the network's capacity to extract information from shallow pedestrian features by adding
passthrough layers to the original YOLO network. The test set of the INRIA dataset was used to
assess the YOLO v2 and YOLO-R network models. Compared with the YOLO v2 network model, the
YOLO-R network model performs better. The real-time performance criterion was met, with the
detection frame rate reaching 25 frames per second.
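A frame rate such as the 25 frames per second mentioned above can be measured with a simple timing loop. The sketch below assumes that 25 frames per second is the real-time threshold and uses a hypothetical stand-in function `fake_detect` in place of an actual detector.

```python
import time

REAL_TIME_FPS = 25.0  # threshold taken from the text; treated here as an assumption


def frames_per_second(n_frames, elapsed_seconds):
    """Frame rate = number of processed frames / elapsed wall-clock time."""
    return n_frames / elapsed_seconds


def is_real_time(fps, threshold=REAL_TIME_FPS):
    """True when the measured frame rate reaches the real-time threshold."""
    return fps >= threshold


def measure_fps(detect, frames):
    """Run `detect` on every frame and return the achieved frame rate."""
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    return frames_per_second(len(frames), elapsed)


def fake_detect(frame):
    """Hypothetical stand-in for a detector: it only sums the pixel values."""
    return sum(frame)


frames = [[i % 256] * 1000 for i in range(100)]  # 100 dummy frames
fps = measure_fps(fake_detect, frames)
print(f"measured {fps:.1f} frames per second, real time: {is_real_time(fps)}")
```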
Work on solder joint recognition and detection in automotive door panels proposes a solder joint
recognition method based on the YOLO algorithm that gives the type and location of solder joints in
real time for automobile door panels. To identify small solder joints more easily, the study applies
the YOLO approach with multi-scale prediction, predicting on feature maps of several sizes and
merging the prediction results to form the final output. The proposed YOLO approach
successfully locates solder joints in real time. This increases the productivity of the production
line and is crucial for flexible, real-time welding of vehicle door panels.
Though many works have been proposed to address the problem of object detection and
segmentation, a research gap remains in improving accuracy in this area. This paper focuses on using
the YOLO v5 algorithm for object detection and segmentation to address that gap.
3. Proposed work
[Figure: flow diagram of the proposed work with the stages Input Images, Image Extraction, YOLOv5 Algorithm]
4.1. Dataset
The proposed work is assessed using a COCO dataset of size nearly 10-20 lakh (1-2 million), on which
the model has already been trained using predefined functions. The dataset images are trained frame
by frame. From the COCO dataset, we took 5000 images for testing. The MS COCO dataset is a sizeable
dataset for object recognition and instance segmentation and has been used to test several deep
learning techniques. Figure 2 shows an example input image and output from the dataset.
4.2. Accuracy
Accuracy is used to measure how the model performs for different classes of objects. It is the ratio
of the number of correct predictions to the total number of predictions made. The accuracy of the
YOLOv5 and CNN algorithms is shown in Table 1.
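The accuracy ratio defined above can be computed as in the following minimal sketch. The labels are hypothetical and are not taken from the paper's experiments.

```python
def accuracy(predicted, actual):
    """Accuracy = number of correct predictions / total number of predictions."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one prediction is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical class labels for four test images.
predicted = ["car", "dog", "person", "dog"]
actual = ["car", "cat", "person", "dog"]

score = accuracy(predicted, actual)
print(f"accuracy = {score:.2f} ({score * 100:.0f}%)")
```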
Figure 3 shows the accuracy level comparison of Yolo V5 and CNN. In the figure, we can see that
Yolo V5 performs better than CNN.
1000 93 37
2000 54 43
3000 84 18
4000 95 25
The comparison of time complexity between Yolo V5 and CNN is shown in Figure 4.
References
[1] Sathishkumar, V. E., Cho, J., Subramanian, M., & Naren, O. S. (2023). Forest fire and smoke
detection using deep learning-based learning without forgetting. Fire Ecology, 19(1), 1-17.
[2] Subramanian, M., Cho, J., Sathishkumar, V. E., & Naren, O. R. (2023). Multiple types of
Cancer classification using CT/MRI images based on Learning without Forgetting powered
Deep Learning Models. IEEE Access.
[3] Kogilavani, S. V., Sathishkumar, V. E., & Subramanian, M. (2022, May). AI Powered COVID-
19 Detection System using Non-Contact Sensing Technology and Deep Learning
Techniques. In 2022 18th International Conference on Distributed Computing in Sensor
Systems (DCOSS) (pp. 400-403). IEEE.
[4] Shanmugavadivel, K., Sathishkumar, V. E., Kumar, M. S., Maheshwari, V., Prabhu, J., &
Allayear, S. M. (2022). Investigation of Applying Machine Learning and Hyperparameter
Tuned Deep Learning Approaches for Arrhythmia Detection in ECG Images. Computational
& Mathematical Methods in Medicine.
[5] Krishnamoorthy, N., Prasad, L. N., Kumar, C. P., Subedi, B., Abraha, H. B., & Sathishkumar,
V. E. (2021). Rice leaf diseases prediction using deep neural networks with transfer learning.
Environmental Research, 198, 111275.
[6] Easwaramoorthy, S., Sophia, F., & Prathik, A. (2016, February). Biometric Authentication
using finger nails. In 2016 international conference on emerging trends in engineering,
technology and science (ICETETS) (pp. 1-6). IEEE.