Fire Detection and Segmentation Using YOLOv5 and U-Net
Wided Souidene Mseddi 1,2
1 L2TI, Institut Galilée, Université Sorbonne Paris Nord, Villetaneuse, France
2 SERCOM Laboratory, University of Carthage, Tunis, Tunisia
wided.mseddi@univ-paris13.fr
Rafik Ghali
SERCOM Laboratory, Ecole Polytechnique de Tunisie, University of Carthage, Tunis, Tunisia
rafik.ghali@ept.rnu.tn
Marwa Jmal
Telnet Innovation Labs, Telnet Holding, Ariana, Tunisia
marwa.jmal@groupe-telnet.net
Rabah Attia
SERCOM Laboratory, Ecole Polytechnique de Tunisie, University of Carthage, Tunis, Tunisia
rabah.attia@ept.u-carthage.tn
Abstract—The environmental crisis the world faces today is a real challenge for human beings. One notable hazard for humans and nature is the increasing number of forest fires. Thanks to the fast development of sensors and technologies as well as computer vision algorithms, new approaches for fire detection have been proposed. However, these approaches face several limitations that need to be resolved, namely the presence of fire-like objects, a high false alarm rate, the detection of small fire objects, and high inference time. An important step for vision-based fire analysis is the segmentation of fire pixels. Hence, we propose in this paper a novel architecture, combining YOLOv5 and U-Net, for fire detection and segmentation. Using a dataset of wildland fires mixed with fire-like object images, the experimental results showed that the novel architecture is reliable for forest fire detection without false alarms.
Keywords—Forest fires, Fire detection, Fire segmentation, Deep learning, YOLOv5, U-Net
I. INTRODUCTION
Forest Fires (FF) are among the most dangerous and challenging natural disasters today, threatening both humanity and the environment. Uncontrolled FF can cause huge damage, with disastrous effects on human property and areas of vegetation. Fires affect more than 350 million hectares every year worldwide [21].
To avoid this dangerous disaster, systems for detecting and monitoring forest fires at the early stages are crucial.
The earliest fire detection systems relied on numerous sensors such as gas detectors, smoke detectors, flame detectors, and temperature detectors, but these techniques are not efficient for forest or wildland fire detection. Indeed, they have small coverage areas and do not respond in real time. To overcome these limitations, vision sensors (embedded or fixed) are the most useful for detecting fires with high accuracy, a large coverage area, and fewer errors.
Over the years, researchers have proposed many techniques to detect and segment fires with high accuracy using image processing and computer vision methods. First, fire color features have been widely used to distinguish fire. These techniques transform the image into another color space, such as YCbCr [1] or YUV [2], and then classify its pixels as fire or non-fire by comparing pixel values to some thresholds. Nonetheless, these methods are limited by the difficulty of identifying fire characteristics in the image.
Machine learning techniques are also employed to increase the reliability of fire detection systems. Numerous models are used, such as SVM [8] and neural networks [9].
Recently, thanks to their great performance in detecting and identifying objects, Deep Learning (DL) approaches have been investigated to detect and localize forest fires.
Deep Learning techniques have greatly helped researchers extract relevant features that best represent the fire to be described. Indeed, these models have been successfully used in several fields such as image classification, self-driving cars, speech recognition, pedestrian detection, face recognition, and cancer detection [10-12]. For all these applications, DL proved its efficiency in detecting and segmenting different classes of objects [10-12].
For the task of detection, the newly introduced YOLOv5 algorithm [13] offers an excellent tradeoff between accuracy and inference time. For the task of segmentation, U-Net [14] has given excellent results on medical images.
In this paper, we present a novel fire detection system based on DL, using pre-trained YOLOv5 and U-Net models concatenated sequentially. For this purpose, we first feed the original images of fires with their annotations to the YOLOv5 model; then, we crop the fire class using the bounding boxes obtained via the detection. Finally, we feed these cropped images to a U-Net model trained on the original images and their annotations, and we obtain segmented images with their bounding lines in the same image.
More specifically, this paper makes two main contributions. Firstly, we propose a novel architecture capable of detecting and segmenting fires in an operationally and time-efficient manner, aiming to overcome the limitations of state-of-the-art techniques. Secondly, our model has demonstrated high performance on both large and small fire objects and an ability to distinguish between fire and fire-like objects.
The remainder of the paper is organized as follows: Section 2 briefly describes related work on DL techniques for fire detection. Section 3 describes the proposed deep learning architectures. In Section 4, the implementation and the experimental results are presented. Finally, Section 5 concludes the paper.
II. RELATED WORKS
Several fire detection methods have been proposed and presented in numerous reviews [5, 15]. In this related works section, we choose to highlight recent advances in Deep Learning techniques.
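As a rough illustration of the color-threshold methods recalled in the introduction, the following Python sketch converts an RGB pixel to YCbCr and applies a simple rule. The conversion uses the standard full-range BT.601 (JPEG) coefficients. The rule itself (`Y > Cb` and `Cr > Cb`) and the function names are illustrative assumptions, not the exact criteria of [1] or [2].

```python
from typing import Tuple


def rgb_to_ycbcr(r: int, g: int, b: int) -> Tuple[float, float, float]:
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 / JPEG coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def looks_like_fire(r: int, g: int, b: int) -> bool:
    """Illustrative threshold rule (an assumption): luma and red chroma both exceed blue chroma."""
    y, cb, cr = rgb_to_ycbcr(r, g, b)
    return y > cb and cr > cb


flame_pixel = (255, 120, 0)    # orange flame
sky_pixel = (80, 140, 230)     # blue sky
sunset_pixel = (255, 200, 50)  # warm sunset, a fire-like object

print(looks_like_fire(*flame_pixel))   # True
print(looks_like_fire(*sky_pixel))     # False
print(looks_like_fire(*sunset_pixel))  # True
```

The sunset pixel passes the same test as the flame pixel, which illustrates why fixed color thresholds tend to raise false alarms on fire-like objects.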
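The sequential pipeline outlined in the introduction (detect, crop, segment) can be sketched as follows. This is a minimal sketch, not the authors' implementation: `detector` and `segmenter` are hypothetical callables standing in for the trained YOLOv5 and U-Net models, and the toy stand-ins below only threshold pixel brightness so that the example runs without any model weights.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Mask = List[List[int]]           # binary mask: 1 = fire, 0 = non-fire
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by a bounding box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def detect_then_segment(
    image: Image,
    detector: Callable[[Image], List[Box]],
    segmenter: Callable[[Image], Mask],
) -> Tuple[List[Box], Mask]:
    """Run detection, segment each cropped box, and paste the masks back in place."""
    height, width = len(image), len(image[0])
    full_mask = [[0] * width for _ in range(height)]
    boxes = detector(image)
    for x0, y0, x1, y1 in boxes:
        patch_mask = segmenter(crop(image, (x0, y0, x1, y1)))
        for dy, row in enumerate(patch_mask):
            for dx, value in enumerate(row):
                if value:
                    full_mask[y0 + dy][x0 + dx] = 1
    return boxes, full_mask


def toy_detector(image: Image) -> List[Box]:
    """Hypothetical stand-in for YOLOv5: one box around all pixels >= 100."""
    coords = [(x, y) for y, row in enumerate(image)
              for x, v in enumerate(row) if v >= 100]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_segmenter(patch: Image) -> Mask:
    """Hypothetical stand-in for U-Net: mark pixels >= 200 as fire."""
    return [[1 if v >= 200 else 0 for v in row] for row in patch]


image = [
    [0, 0, 0, 0, 0],
    [0, 250, 120, 0, 0],
    [0, 240, 255, 0, 0],
    [0, 0, 0, 0, 0],
]
boxes, mask = detect_then_segment(image, toy_detector, toy_segmenter)
print(boxes)
for row in mask:
    print(row)
```

Because segmentation only runs inside detected boxes, an image with no detection yields an all-zero mask, which is one way a two-stage design can suppress false alarms on fire-like regions.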
down sampling phase. Finally, a 1×1 layer constructs a binary mask.
This model uses a set of input fire images and their corresponding binary masks.
During training, with the binary mask as the desired output, the model learns to classify each pixel of the images into the different object labels. For our task, we define two classes: fire and non-fire.
III. IMPLEMENTATION AND RESULTS
In this section, we detail the implementation settings adopted to train and test our proposed techniques, namely the data preparation step (collecting the dataset and performing data augmentation), the deployment of the overall architecture, the Test Time Augmentation (TTA) module, and finally the results collection and the performance analysis.
A. Data Preparation
For fire detection problems, there is no benchmark dataset, which makes a comparative study between DL approaches in the field difficult. We created our own dataset, which contains the Corsican Fire dataset [3] and fire-like object images. The Corsican Fire database is a dataset of forest fire images collected from different research teams around the world. It includes wildfire image sequences acquired in various areas and under numerous conditions, such as climatic conditions, burning vegetation type, distance to the fire, and fire brightness.
To diversify our dataset and improve the model's capability of distinguishing between fire and fire-like objects, we added to the Corsican Fire database numerous images that include fire-like colored objects, such as lights, sunrise, sunset, and firefighters' clothing, in various resolutions. The newly created dataset contains about 1300 images. They include fire, non-fire, and fire-like images with different resolutions and sizes.
Fig. 2 depicts a sample of the Corsican Fire dataset and of the fire-like object images, which contain objects with some fire characteristics, such as sunset, sunrise, and lights.
It is important to use data augmentation techniques to improve the performance of our model and avoid overfitting. Data augmentation mainly consists of applying transformations to the image, such as geometric transformations (rotation, scaling, padding, cropping, image translation, and flipping), photometric transformations (brightness, contrast, and shear), image occlusion techniques (Mixup), and Mosaic data augmentation, which combines numerous transformations for a single image [7].
For our problem, we chose the data augmentation techniques based on the characteristics of flames. For instance, we did not consider techniques that adjust the color space, in order to preserve the fire color information. We also excluded rotation because, for example, a fire rotated by 90 degrees does not occur in real life. In addition, it is reasonable to flip a fire image horizontally but not vertically, since in the real world we would rarely see images of fire flipped upside down. In conclusion, we used image translation, image scaling, Mosaic, Mixup, and horizontal flip as augmentation techniques.
C. Training
We trained the two models using the PyTorch framework on a machine with an NVIDIA Tesla P100 GPU (16 GB). Moreover, we divided our dataset into two subsets, as presented in Table I.
TABLE I. DATASET SUBSETS
                 Number of positive images   Number of negative images   Number of annotated bounding boxes
Training set     883                         107                         1367
Validation set   100                         15                          237
1. Detection Training
To train YOLOv5, the input data are PNG images and TXT files containing the details of the annotated objects. Training was conducted using the Binary Cross-Entropy with Logits loss function from PyTorch for the loss calculation of class probability and object score, with a learning rate of 0.01, a batch size of 8, 300 epochs, and an image size of 416×416 or 1024×1024. Note that the training time depends on the model, since we trained both the small and the large YOLO variants.
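The Binary Cross-Entropy with Logits loss named above combines a sigmoid with binary cross-entropy in a numerically stable form. The plain-Python sketch below restates that formula for illustration only. It is not YOLOv5's training code, the logits and labels are made-up values, and the `TRAIN_CONFIG` dictionary with its key names is a hypothetical container for the hyperparameters reported in this section.

```python
import math
from typing import Sequence

# Hyperparameters reported in the text; the dictionary and its key names are hypothetical.
TRAIN_CONFIG = {
    "learning_rate": 0.01,
    "batch_size": 8,
    "epochs": 300,
    "image_sizes": (416, 1024),
}


def bce_with_logits(logit: float, target: float) -> float:
    """Binary cross-entropy on a raw logit x with target t in [0, 1].

    Equal to -[t*log(sigmoid(x)) + (1 - t)*log(1 - sigmoid(x))], rewritten as
    max(x, 0) - x*t + log(1 + exp(-|x|)) so that large |x| does not overflow.
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mean_bce_with_logits(logits: Sequence[float], targets: Sequence[float]) -> float:
    """Average the per-element loss over a batch."""
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    return sum(bce_with_logits(x, t) for x, t in zip(logits, targets)) / len(logits)


# Objectness logits for three candidate boxes and their 0/1 labels (made-up values).
logits = [2.0, -1.0, 0.0]
targets = [1.0, 0.0, 1.0]
print(round(mean_bce_with_logits(logits, targets), 4))  # 0.3778
```

Working on logits rather than probabilities avoids computing `log(0)` when the sigmoid saturates, which is the usual reason to prefer this form.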
2. Segmentation Training
Test Time Augmentation is yet another data augmentation method. While data augmentation is applied before or during the training of the model, TTA is applied at inference time. It is a simple but effective way to avoid overfitting and improve results. The idea is to show different versions of the same image to the same model, extract the detected bounding boxes from the different outputs, and then combine the results. This is a very fast way to improve the model performance (confidence of the output) without spending precious time on data augmentation.
• Recall: $\mathrm{Recall} = \frac{TP}{TP + FN}$, where TP: true positives and FN: false negatives.
• MAP (Mean Average Precision) is the mean of the AP (Average Precision) values of the different classes. The AP of each class is calculated from r and p, the recall and the precision at the m-th threshold.
2. Fire segmentation metrics
• Dice Coefficient (DC): The Dice coefficient is a statistical indicator that measures the similarity of two images (the predicted and ground truth images).
$$DC = \frac{2\,TP}{2\,TP + FP + FN} \qquad (4)$$
where TP: true positives, FP: false positives, and FN: false negatives.
• Accuracy: the fraction of correct predictions over the total number of predictions made by the network.
$$Accuracy = \frac{TP + TN}{TP + TN + FN + FP} \qquad (5)$$
where TP: true positives, TN: true negatives, FP: false positives, and FN: false negatives.
F. Experimental Results
In this section, we discuss the experimental results to demonstrate the performance of the proposed method. First, we discuss the detection results. Then, we present the segmentation results. Our test set consists of 185 images containing 221 bounding boxes.
1. Detection Results
In Table II, we present the performance of both YOLOv5s (small version) and YOLOv5x (extra-large version) with and without TTA.
TABLE II. FIRE DETECTION RESULTS
Models    TTA   Recall   MAP
YOLOv5s   ON    0.842    0.732
YOLOv5s   OFF   0.805    0.686
In Fig. 3, we can see that YOLOv5 has accurately detected and localized forest fires, in particular small fires. Moreover, the model overcomes the confusion with fire-like objects such as sunrise and sunset.
Fig. 3: YOLOv5 results
2. Segmentation Results
Table III presents the best scores that we obtained for our model.
TABLE III. EXPERIMENTAL RESULTS OF U-NET
Model   Dice coefficient   Accuracy
U-Net   92%                99.6%
We can see that the U-Net model achieved high performance (a Dice coefficient of 92% and an accuracy of 99.6%) in segmenting forest fires. YOLOv3 applied to our database provides an accuracy of 96.8%.
The strength of U-Net is its ability not only to confirm the presence of a forest fire but also to detect the precise shape of the flame. In Fig. 4, we can see that the network accurately and precisely detects the fire and its shape. By combining the two architectures, we achieved a
robust and precise forest fire detector to solve forest fire detection and recognition problems.
Fig. 4: U-Net results: (a) cropped images (b) predicted images
As an example, we can see in Fig. 5 that our proposed model performs very well in detecting fire pixels and segmenting fire surfaces, especially small areas (middle figure), and that it successfully overcomes the confusion with fire-like objects (right figure). These results outperform the state-of-the-art fire detection techniques.
Fig. 5: Results: (a) input images (b) model outputs
IV. CONCLUSION
In this paper, we introduced a novel method for forest fire detection and segmentation based on YOLOv5 and U-Net. We evaluated our method using the Corsican Fire dataset and various fire-like object images. Experimental results showed that this method is able to detect forest fires, in particular small fires (with small flames), under different acquisition conditions. For future work, we aim to introduce a smoke/fire detection method based on CNNs in order to identify and localize both fire and smoke without false alarms.
ACKNOWLEDGEMENT
This project is carried out under the MOBIDOC scheme, funded by the EU through the EMORI program and managed by the ANPR.
REFERENCES
[1] T. Celik and H. Demirel, “Fire detection in video sequences using a generic color model,” Fire Safety Journal, vol. 44, no. 2, pp. 147–158, 2009.
[2] G. Marbach, M. Loepfe, and T. Brupbacher, “An image processing technique for fire detection in video images,” Fire Safety Journal, vol. 41, no. 4, pp. 285–289, 2006.
[3] T. Toulouse, L. Rossi, A. Campana, T. Celik, and M. Akhloufi, “Computer vision for wildfire research: An evolving image dataset for processing and analysis,” Fire Safety Journal, vol. 92, pp. 188–194, 2017.
[4] S. Wu and L. Zhang, “Using popular object detection methods for real time forest fire detection,” in 11th International Symposium on Computational Intelligence and Design, vol. 1, pp. 280–284, IEEE, 2018.
[5] R. Ghali, M. Jmal, W. Souidene Mseddi, and R. Attia, “Recent advances in fire detection and monitoring systems: A review,” in the International Conference on the Sciences of Electronics, Technologies of Information and Telecommunications. Springer, 2018.
[6] Z. Jiao, Y. Zhang, J. Xin, et al., “A deep learning based forest fire detection approach using UAV and YOLOv3,” in 1st International Conference on Industrial Artificial Intelligence (IAI), pp. 1–5, IEEE, 2019.
[7] C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of Big Data, vol. 6, no. 1, pp. 1–48, 2019.
[8] Y. Chunyu, F. Jun, W. Jinjun, et al., “Video fire smoke detection using motion and color features,” Fire Technology, vol. 46, no. 3, pp. 651–663, 2010.
[9] M. A. I. Mahmoud and H. Ren, “Forest fire detection and identification using image processing and SVM,” Journal of Information Processing Systems, vol. 15, no. 1, pp. 159–168, 2019.
[10] Z.-Q. Zhao, P. Zheng, S.-T. Xu, et al., “Object detection with deep learning: A review,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp. 3212–3232, 2019.
[11] L. Liu, W. Ouyang, X. Wang, et al., “Deep learning for generic object detection: A survey,” International Journal of Computer Vision, vol. 128, no. 2, pp. 261–318, 2020.
[12] S. Minaee, Y. Boykov, F. Porikli, A. Plaza, N. Kehtarnavaz, and D. Terzopoulos, “Image segmentation using deep learning: A survey,” arXiv preprint arXiv:2001.05566, 2020.
[13] G. Jocher, A. Stoken, J. Borovec, N. Code, C. STAN, L. C. Laughing, T. YxNONG, A. Hogan, L. mammana, A. Wang, A. Chaurasia, L. D. Marc, W. Haoyang, D. Durgesh, F. Ingham, F. Guil-hen, A. Colmagro, H. Ye, J. Solawetz, J. Poznanski, J. Fang, J. Kim, K. Doan, and L. Yu, “ultralytics/yolov5: v4.0, PyTorch Hub integration,” Jan. 2021.
[14] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” Springer, 2015.
[15] A. Gaur, A. Singh, A. Kumar, A. Kumar, and K. Kapoor, “Video flame and smoke based fire detection algorithms: A literature review,” Fire Technology, 2020.
[16] W. Lee, S. Kim, Y.-T. Lee, et al., “Deep neural networks for wildfire detection with unmanned aerial vehicles,” in IEEE International Conference on Consumer Electronics (ICCE), pp. 252–253, IEEE, 2017.
[17] Y. Zhao, M. Jiale, L. Xiaohui, and Z. Jie, “Saliency detection and deep learning-based wildfire identification in UAV imagery,” Sensors, vol. 18, no. 3, p. 712, 2018.
[18] Z. Qingjie, X. Jiaolong, X. Liang, and G. Haifeng, “Deep convolutional neural networks for forest fire detection,” in International Forum on Management, Education and Information Technology Application. Atlantis Press, 2016.
[19] Q.-x. Zhang, G.-h. Lin, Y.-m. Zhang, G. Xu, and J.-j. Wang, “Wildland forest fire smoke detection based on faster R-CNN using synthetic smoke images,” Procedia Engineering, vol. 211, pp. 441–446, 2018.
[20] D. Shen, X. Chen, M. Nguyen, and W. Q. Yan, “Flame detection using deep learning,” in 4th International Conference on Control, Automation and Robotics (ICCAR). IEEE, 2018.
[21] D. Stav, “Fighting fire with science,” Nature, vol. 576, no. 7786, pp. 328–329, 2019.