Abstract
Wearing surgical face masks has become the norm in daily life during the COVID-19 pandemic. In many public places, it is necessary to check or monitor whether a face mask is worn properly. Manual inspection of mask wearing not only wastes manpower but also cannot provide round-the-clock, real-time monitoring, which creates an urgent need for automatic mask wearing detection technology. Earlier automatic methods use a successive approach in which the face is detected first and the mask is then located and judged. More recent methods adopt an end-to-end paradigm built on well-known CNN models from the field of object detection. However, these methods fail to consider the diversity of face mask wearing, such as different kinds of irregularity and spoofing. In this study, we therefore introduce a comprehensive mask wearing detection dataset (named Diverse Masked Faces) that distinguishes a total of five classes of mask wearing. We then adapt the YOLOX model to this task and further improve it with a new composite loss that merges the CIoU and alpha-IoU losses and inherits the advantages of both. The improved model is referred to as YoloMask. The proposed method was tested on the new dataset and was shown to significantly outperform other state-of-the-art (SOTA) methods in the literature, both successive and end-to-end.
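The abstract does not state the formula of the composite loss, so the following is only an illustrative sketch of one way a CIoU loss and an alpha-IoU power term can be combined: the alpha-CIoU form, in which each CIoU term is raised to a power alpha. It should not be read as the exact YoloMask loss. The function name, the `(x1, y1, x2, y2)` box layout, the default `alpha=3.0` and the `eps` stabiliser are all assumptions made for this example.

```python
import math


def alpha_ciou_loss(box_pred, box_gt, alpha=3.0, eps=1e-9):
    """Illustrative alpha-CIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration only; not the authors' implementation.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Widths and heights of the predicted and ground-truth boxes
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Squared centre distance (rho^2) and squared diagonal of the
    # smallest enclosing box (c^2)
    rho2 = (((px1 + px2) - (gx1 + gx2)) ** 2
            + ((py1 + py2) - (gy1 + gy2)) ** 2) / 4.0
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight, as in CIoU
    v = (4.0 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    beta = v / ((1.0 - iou) + v + eps)

    # Raise each CIoU term to the power alpha; alpha = 1 gives plain CIoU
    return 1.0 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha
```

With `alpha=1.0` the expression reduces to the standard CIoU loss, while larger values of `alpha` put relatively more weight on boxes that already overlap the ground truth well.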
Acknowledgments
We gratefully acknowledge the financial support of the Natural Science Foundation of China (NSFC No. 61906149), the Natural Science Basic Research Program of Shaanxi (Program No. 2021JM-136), the Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX1068), the Xi’an Science and Technology Program (No. 21RGSF0011) and the Fundamental Research Funds for the Central Universities (No. QTZX22072).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cao, Z., Li, W., Zhao, H., Pang, L. (2022). YoloMask: An Enhanced YOLO Model for Detection of Face Mask Wearing Normality, Irregularity and Spoofing. In: Deng, W., et al. Biometric Recognition. CCBR 2022. Lecture Notes in Computer Science, vol 13628. Springer, Cham. https://doi.org/10.1007/978-3-031-20233-9_21
DOI: https://doi.org/10.1007/978-3-031-20233-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20232-2
Online ISBN: 978-3-031-20233-9
eBook Packages: Computer Science, Computer Science (R0)