MASK R-CNN FOR FIRE DETECTION


International Research Journal of Computer Science (IRJCS), Issue 07, Volume 08 (July 2021), ISSN: 2393-9842, https://www.irjcs.com/archives

Sk Razeena Begum, Assistant Professor, CSE Dept, Andhra Loyola Institute of Engineering and Technology, Jawaharlal Nehru Technological University - Kakinada, razeenabegum@gmail.com
S Yogananda Datta, M S V Manoj, Department of Computer Science and Engineering, Andhra Loyola Institute of Engineering and Technology, Jawaharlal Nehru Technological University - Kakinada, datta.seethepalli557@gmail.com, saimanoj9542@gmail.com

Publication History
Manuscript Reference No: IRJCS/RS/Vol.08/Issue07/JLCS10082
Received: 13 July 2021; Accepted: 20 July 2021; Published: 23 July 2021
DOI: https://doi.org/10.26562/irjcs.2021.v0807.003
Citation: Begum, S. R., Datta, S. Y. & Manoj, M. S. V. (2021). Mask R-CNN for Fire Detection. International Research Journal of Computer Science, VIII, 145-151. doi: https://doi.org/10.26562/irjcs.2021.v0807.003
Peer-review: Double-blind peer-reviewed
Editor: Dr. A. Arul Lawrence Selvakumar, Chief Editor, IRJCS, AM Publications, India
Copyright: © 2021. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract: Object detection has received increasing attention in recent years due to its wide range of applications and recent technological breakthroughs, and deep learning is the state-of-the-art approach to performing it. The task is under extensive investigation in both academia and real-world applications such as security monitoring, autonomous driving, transportation surveillance, drone scene analysis, and robotic vision. Object detection is a computer technology, related to computer vision and image processing, that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images or videos. It not only provides the classes of the objects in an image but also localizes them within that image, with the locations given as bounding boxes or centroids. Instance segmentation may be defined as the technique that produces a fine-grained inference for each object separately by predicting a label for every pixel of that object in the input image; each pixel is labeled according to the object class that encloses it. We use the Mask Region-Based Convolutional Neural Network (Mask R-CNN) to implement instance segmentation and detection of fire in a video or an image, which can be applied in real-world systems such as automatic fire extinguishers and alert systems. Training was done using Mask R-CNN with a ResNet-101 backbone, a learning rate of 0.001, and 2 images per GPU. With this, the proposed framework can detect fire using Mask R-CNN and can send an immediate alert to the user if fire is detected.

Keywords: detection, CNN, mask, fire, based

I. INTRODUCTION

Like many other computer vision problems, there still is no obvious or even "best" way to approach object detection, which means there is still much room for improvement. Deep learning has been a real game-changer in machine learning, especially in computer vision.
In a similar way, deep learning models have outperformed classical models on the task of image classification, and they are now the state of the art in object detection as well. A few deep learning models have previously been used for object detection; most notable are the R-CNN, or Region-Based Convolutional Neural Network, and the more recent Mask R-CNN, which is capable of achieving state-of-the-art results on a range of object detection tasks. Instance segmentation is the task of identifying the pixels that belong to each object in an image.

With rapid economic development, the increasing scale and complexity of constructions have introduced great challenges in fire control. Early fire detection and alarms with high sensitivity and accuracy are therefore essential to reduce fire losses. However, traditional fire detection technologies, such as smoke and heat detectors, are not suitable for large spaces, complex buildings, or spaces with many disturbances. Due to these limitations, missed detections, false alarms, detection delays, and other problems often occur, making early fire warning even more difficult to achieve.

Recently, image fire detection has become a hot topic of research. The technique has many advantages, such as early fire detection, high accuracy, flexible system installation, and the capability to effectively detect fires in large spaces and complex building structures. It processes image data from a camera with algorithms that determine the presence of a fire or a fire risk in the images. The detection algorithm is therefore the core of this technology and directly determines the performance of the image fire detector.

Labeling the specific pixels that belong to each distinguished object, instead of using coarse bounding boxes during object recognition and localization, is an extension of object detection. This harder type of problem is commonly referred to as object segmentation. The R-CNN, or Region-Based Convolutional Neural Network, created by Ross Girshick et al., is a family of convolutional neural network models intended for object detection. According to Girshick, Mask R-CNN extends Faster R-CNN by adding a branch that predicts a segmentation mask for each Region of Interest (RoI), in parallel with the existing branch for bounding box regression and classification, as shown in Fig. 1. One work that uses Mask R-CNN is fruit detection for a strawberry harvesting robot; according to Yang Yu, that paper demonstrates enhanced robustness and universality for hidden and overlapping fruits and for fruits under changeable illumination.

Fig. 1. Mask R-CNN fire detection and segmentation
Fig. 2. Fire accident
Fig. 3. Mask R-CNN detection and instance segmentation

II. METHODOLOGY

Mask R-CNN (Fig. 3) adopts the same two-stage process as Faster R-CNN: the first stage is a Region Proposal Network (RPN), and in the second stage, in parallel with the prediction of the class and box offset, each region of interest (RoI) is given a binary mask by Mask R-CNN.
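As a concrete point of reference for the subsections that follow, here is a minimal training sketch under the assumption that the Matterport Mask R-CNN implementation (the mrcnn package, also used for the experiments in Section III) is the framework. The build_fire_datasets helper is hypothetical and stands for code that builds mrcnn.utils.Dataset objects from the annotated images of Sections II-A and II-B; the weights file name, STEPS_PER_EPOCH, the layers argument, and the horizontal-flip augmentation are illustrative assumptions rather than settings reported in this paper.

    import imgaug.augmenters as iaa
    from mrcnn.config import Config
    from mrcnn import model as modellib

    class FireConfig(Config):
        NAME = "fire"
        BACKBONE = "resnet101"       # ResNet-101 backbone
        NUM_CLASSES = 1 + 1          # background + fire
        IMAGES_PER_GPU = 2           # 2 images per GPU
        LEARNING_RATE = 0.001        # learning rate used for training
        STEPS_PER_EPOCH = 100        # illustrative value, not reported in the paper

    config = FireConfig()
    model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs")

    # Start from COCO-pretrained weights (Section II-C), re-initialising the head
    # layers whose shapes depend on the number of classes.
    model.load_weights("mask_rcnn_coco.h5", by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])

    # Hypothetical helper: loads the VGG Image Annotator masks into Dataset objects
    # split 80%/20% for training and validation (Sections II-A and II-B).
    dataset_train, dataset_val = build_fire_datasets("fire_dataset/")

    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE,
                epochs=10,                      # 10 epochs (Section II-C)
                layers="heads",                 # assumption: fine-tune only the head layers
                augmentation=iaa.Fliplr(0.5))   # placeholder for the augmentation of Section II-C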
A. Data gathering
Fire images were gathered from several images and videos uploaded to the internet, and fire accident images were also captured from CCTV footage. We gathered over 200 fire images taken in the morning and afternoon periods. The fire images were converted to JPEG format.

B. Annotation and Construction of the Data Set
The 200 gathered fire images were selected for training and validation, with 80% of the images used for the training set and 20% for the validation set. To circumvent individualities within the fire images, it was ensured that the data set comprised fire images under various natural settings. Another 30 fire testing images were gathered for model evaluation, to verify the stability and reliability of the trained model. Masking images of the fire were generated using the VGG Image Annotator tool. The masking images were used in calculating the loss for backpropagation during training and model optimization. The instance segmentation performance of the trained model was assessed by comparing the estimated mask output with the annotated masking images. Fig. 3 shows the labeled fire images.

C. Training the Model
We used Mask R-CNN for fire detection with a ResNet-101 backbone pre-trained on the COCO dataset. During training, the images undergo augmentation so that the limited size of the dataset does not become a problem. The training was done with a 0.001 learning rate and 2 images per GPU, over 10 epochs; the configuration sketch given at the start of this section reflects these settings. Fig. 4 shows that after every epoch the training loss, mask loss, and RPN loss decrease.

Fig. 4. Model Training Loss and Validation Loss

D. Mask R-CNN detection model
Fig. 3 shows that the framework of Mask R-CNN is divided into three stages. First, the backbone network extracts feature maps from the input fire images. Second, the region proposal network (RPN) produces regions of interest (RoIs) from the feature maps formed by the backbone. Third, the fully convolutional network (FCN) takes the target features corresponding to the RoIs coming from the region proposal network and performs target classification and segmentation. The outputs of this stage are classification scores, segmentation masks, and bounding boxes.

1. Mask R-CNN Prediction
Running the prediction of the model is shown in Fig. 5. The image is picked randomly from the training images.

Fig. 5. Mask Region-Based CNN Model Prediction

2. Region Proposal Network
The RPN runs a lightweight binary classifier over a large number of anchors (boxes) spread across the image and assigns them object scores. Anchors with high, or positive, object scores are passed on to stage two to be classified, as illustrated in the sketch below.
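The following is a small conceptual sketch, not taken from the paper, of this filtering step: every anchor receives a foreground (objectness) score from the RPN's binary classifier, and only the high-scoring anchors survive as proposals for stage two. The score threshold, the proposal limit, and the toy anchor values are illustrative assumptions.

    import numpy as np

    def filter_proposals(anchors, objectness, score_threshold=0.7, max_proposals=1000):
        # anchors: (N, 4) array of boxes as [y1, x1, y2, x2]
        # objectness: (N,) foreground scores from the RPN's binary classifier
        order = np.argsort(-objectness)                       # highest-scoring anchors first
        keep = order[objectness[order] >= score_threshold]    # drop low-scoring anchors
        keep = keep[:max_proposals]                           # cap the number of proposals
        return anchors[keep], objectness[keep]

    # Toy example: three anchors, of which only the confident ones survive.
    anchors = np.array([[10, 10, 50, 50],
                        [20, 30, 80, 90],
                        [0, 0, 200, 200]], dtype=float)
    scores = np.array([0.95, 0.30, 0.80])
    proposals, kept_scores = filter_proposals(anchors, scores)
    print(proposals)      # the anchor with score 0.30 has been discarded
    print(kept_scores)    # remaining scores: 0.95 and 0.80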
3. RPN Targets
These are the training targets for the RPN. To generate them, the RPN starts with a grid of anchors that cover the full image at different scales, as shown in Fig. 6, and then calculates the Intersection-over-Union (IoU) of the anchors with the ground-truth objects.

Fig. 6. RPN Prediction

4. Proposal Classification
The classifier heads run on the proposals to generate class probabilities and bounding box regressions. Fig. 7 shows that we obtain 1000 valid proposals out of 1000, with 23 positive RoIs and the class distribution [('BG', 977), ('Fire', 23)].

Fig. 7. Region Proposal Network Prediction

Fig. 8 shows the result of applying bounding box refinement. After filtering out low-confidence detections and applying Non-Maximum Suppression, Fig. 5 shows the final detection output.

Fig. 8. Bounding Box

5. Generating Masks
Detection takes place in this stage: it takes the refined bounding boxes and class IDs coming from the previous layer and creates a segmentation mask for every instance, as shown in Fig. 9.

Fig. 9. Mask

III. RESULT AND DISCUSSION

The study was performed on Google Colab with a single 12 GB NVIDIA Tesla K80 GPU, using the Matterport deep learning development framework. For training, over 200 fire images were gathered and selected for training and validation, with 80% of the images used for the training set and 20% for the validation set. Another 30 fire testing images were gathered for model evaluation, to verify the stability and reliability of the trained model. In all testing images the fire was detected, marked with a target category score, separated using a segmentation mask, and enclosed in a bounding line. Fig. 10 shows that fire could be detected using Mask R-CNN. Table I summarizes, for the 30 testing images, whether Mask R-CNN detected the fire: the first column is the trial number, the second column indicates Detected or Not Detected, and the third column is the accuracy of the detection.

TABLE I. Testing and Result

Trial   Detected   Accuracy
1       Yes        99.6%
2       Yes        99.6%
3       Yes        95.6%
4       Yes        99.8%
5       Yes        99.8%
6       Yes        99.7%
7       Yes        99.5%
8       Yes        99.5%
9       Yes        99.8%
10      Yes        99.5%
11      Yes        97.5%
12      Yes        97.6%
13      Yes        98.6%
14      Yes        99.9%
15      Yes        99.3%
16      Yes        99.6%
17      Yes        99.8%
18      Yes        98.0%
19      Yes        99.7%
20      Yes        99.8%
21      Yes        98.3%
22      Yes        99.6%
23      Yes        98.0%
24      Yes        99.8%
25      Yes        99.7%
26      Yes        99.9%
27      Yes        96.1%
28      Yes        98.5%
29      Yes        99.7%
30      Yes        97.4%

Fig. 10. Testing Images
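As an indication of how such a summary can be produced, the following is a minimal sketch, not the authors' script, that runs the trained weights over the test images with the Matterport framework and records, per trial, whether fire was detected together with the top confidence score. It assumes that the reported per-trial accuracy corresponds to that confidence score; the weights file name, the test-image directory, and the inference settings are placeholders.

    import glob
    import skimage.io
    from mrcnn.config import Config
    from mrcnn import model as modellib

    class FireInferenceConfig(Config):
        NAME = "fire"
        NUM_CLASSES = 1 + 1      # background + fire
        GPU_COUNT = 1
        IMAGES_PER_GPU = 1       # detect one image at a time

    model = modellib.MaskRCNN(mode="inference", config=FireInferenceConfig(), model_dir="logs")
    model.load_weights("mask_rcnn_fire.h5", by_name=True)    # placeholder: weights trained in Section II-C

    # test_images/ is a placeholder directory holding the 30 JPEG testing images.
    for trial, path in enumerate(sorted(glob.glob("test_images/*.jpg")), start=1):
        image = skimage.io.imread(path)
        r = model.detect([image], verbose=0)[0]              # dict with 'rois', 'class_ids', 'scores', 'masks'
        detected = r['scores'].size > 0
        confidence = float(r['scores'].max()) if detected else 0.0
        print(trial, "Yes" if detected else "No", "{:.1%}".format(confidence))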
CONCLUSION

Object detection with Mask R-CNN and instance segmentation can be applied to fire. Using Mask R-CNN with a ResNet-101 backbone, training was done with a 0.001 learning rate and 2 images per GPU, over 10 epochs. In this study, 200 fire images were gathered and selected for training and validation, with 80% of the images used for the training set and 20% for the validation set. Another 30 fire testing images were gathered for model evaluation, to verify the stability and reliability of the trained model. Fire was detected in all 30 testing images, and the accuracy of the detections was greater than 95% in every trial (Table I). If there are two or more fire occurrences in an image, the mask of each detection is drawn in a different color, as shown in Fig. 9. The summary of the test results verified that fire could be detected in all gathered test data with an accuracy higher than 95%. With this, the proposed framework can detect fire using Mask R-CNN.

ACKNOWLEDGMENT

This research project was partially supported by the Department of Computer Science and Engineering, Andhra Loyola Institute of Engineering and Technology, Jawaharlal Nehru Technological University. We are grateful to Associate Professor Mrs. SK. Razeena Begum for guiding us in developing and contributing this paper.

REFERENCES

1. Beery, Sara, et al. "Context R-CNN: Long term temporal context for per-camera object detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
2. Felzenszwalb, Pedro F., et al. "Object detection with discriminatively trained part-based models." IEEE Transactions on Pattern Analysis and Machine Intelligence 32.9 (2009): 1627-1645.
3. Malbog, Mon Arjay. "MASK R-CNN for Pedestrian Crosswalk Detection and Instance Segmentation." 2019 IEEE 6th International Conference on Engineering Technologies and Applied Sciences (ICETAS). IEEE, 2019.
4. Yu, Yang, et al. "Fruit detection for strawberry harvesting robot in non-structural environment based on Mask-RCNN." Computers and Electronics in Agriculture 163 (2019): 104846.
5. Girshick, Ross, et al. "Region-based convolutional networks for accurate object detection and segmentation." IEEE Transactions on Pattern Analysis and Machine Intelligence 38.1 (2015): 142-158.
6. Cholakkal, Hisham, et al. "Object counting and instance segmentation with image-level supervision." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
7. Benjdira, Bilel, et al. "Car detection using unmanned aerial vehicles: Comparison between Faster R-CNN and YOLOv3." 2019 1st International Conference on Unmanned Vehicle Systems-Oman (UVS). IEEE, 2019.
8. Croitoru, Ioana, Simion-Vlad Bogolin, and Marius Leordeanu. "Unsupervised learning from video to detect foreground objects in single images." Proceedings of the IEEE International Conference on Computer Vision. 2017.
9. Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." IEEE Transactions on Pattern Analysis and Machine Intelligence 39.6 (2016): 1137-1149.
10. Yan, Junjie, et al. "Object detection by labeling superpixels." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
11. Wang, Guangting, et al. "Cascade mask generation framework for fast small object detection." 2018 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2018.
12. Xu, Lele, et al. "Leaf instance segmentation and counting based on deep object detection and segmentation networks." 2018 Joint 10th International Conference on Soft Computing and Intelligent Systems (SCIS) and 19th International Symposium on Advanced Intelligent Systems (ISIS). IEEE, 2018.
13. Dhyakesh, S., et al. "Mask R-CNN for Instance Segmentation of Water Bodies from Satellite Image." 2nd EAI International Conference on Big Data Innovation for Sustainable Cognitive Computing. Springer, Cham, 2021.