Report For Face Mask Detection Using Python and Deep Learning

This project report describes a face mask detection system using Python and deep learning. The system was developed by three students as part of their Bachelor of Technology degree in Electronics and Communication Engineering. It uses pre-trained deep learning models like ResNet50 and detects faces in images and determines if the faces are wearing masks or not. The system aims to help enforce mask wearing in public places and thereby help control the spread of COVID-19 when effective antiviral treatments are limited. It achieves high accuracy of 98.2% for mask detection when using ResNet50 and outperforms other baseline models in precision and recall.

Uploaded by

Tharun Kumar

Project Report

on

FACE MASK DETECTION USING PYTHON AND DEEP LEARNING

Submitted in partial fulfilment of the requirements for the award of the degree of

BACHELOR OF TECHNOLOGY
in
Electronics and Communication Engineering [ECE]
by
J. TANVI [Roll No.18311A04L0]
K. SREEJA [Roll No.18311A04L2]
K. LIKHITHA [Roll No.18311A04L8]

UNDER THE GUIDANCE OF


Dr.P. VIKRAM
Assistant professor, Dept. of ECE

UNDER THE SUPERVISION OF


Dr. S. RAMANI
Assistant professor, Dept. of ECE

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING


SREENIDHI INSTITUTE OF SCIENCE & TECHNOLOGY
Yamnampet, Ghatkesar, Hyderabad – 501 301
2021

SREENIDHI INSTITUTE OF SCIENCE & TECHNOLOGY

Yamnampet, Ghatkesar, Hyderabad – 501 301

CERTIFICATE

This is to certify that the Project Work entitled “FACE MASK DETECTION USING PYTHON AND
DEEP LEARNING” being submitted by

J. TANVI [Roll No.18311A04L0]
K. SREEJA [Roll No.18311A04L2]
K. LIKHITHA [Roll No.18311A04L8]

in fulfilment for the award of Bachelor of Technology in Electronics and Communication
Engineering [ECE], Sreenidhi Institute of Science and Technology, an autonomous
institute under Jawaharlal Nehru Technological University, Telangana, is a record of
bonafide work carried out by them under our guidance and supervision. The results embodied
in the report have not been submitted to any other University or Institution for the award of
any degree or diploma.

Dr.P. Vikram Dr.S. Ramani


Internal guide Project co-ordinator
Assistant prof, ECE Assistant prof, ECE
Head of the department

Dr.S.P.V. SUBBA RAO

Professor, Department of ECE

Signature of the External Examiner

DECLARATION

We hereby declare that the work described in this report, entitled “FACE MASK
DETECTION USING PYTHON AND DEEP LEARNING”, which is being submitted by
us in partial fulfilment for the award of Bachelor of Technology in Electronics and
Communication Engineering [ECE], Sreenidhi Institute of Science & Technology,
affiliated to Jawaharlal Nehru Technological University Hyderabad, Kukatpally, Hyderabad
(Telangana) - 500085, is the result of investigations carried out by us under the guidance of
Dr. P. Vikram, Assistant Professor, ECE Department, Sreenidhi Institute of Science and
Technology, Hyderabad. The work is original and has not been submitted for any
Degree/Diploma of this or any other university.

Place: Hyderabad

Date: 05/01/2022 Signature:

J. TANVI [Roll No.18311A04L0]


K. SREEJA [Roll No.18311A04L2]
K. LIKHITHA [Roll No.18311A04L8]

ACKNOWLEDGEMENT

We hereby declare that the work described in the Project report, entitled “FACE MASK
DETECTION USING PYTHON AND DEEP LEARNING”, which is being submitted by
us in partial fulfilment for the award of Bachelor of Technology in the Dept. of Electronics &
Communication Engineering, Sreenidhi Institute of Science & Technology, affiliated to
Jawaharlal Nehru Technological University Hyderabad, Kukatpally, Hyderabad (Telangana),
is the result of our own effort and has not been submitted elsewhere.

We are very thankful to MR.P. VIKRAM, ECE Dep., Sreenidhi Institute of Science and
Technology, Ghatkesar for providing the necessary guidance to this group project and giving
valuable timely suggestions over the work.

We are very thankful to DR.S. RAMANI, ECE Dep., Sreenidhi Institute of Science and
Technology, Ghatkesar for providing an initiative to this group project and giving valuable
timely suggestions over the work.

We convey our sincere thanks to Dr. S.P.V. SUBBA RAO, Head of the Department (ECE),
Sreenidhi Institute of Science and Technology, Ghatkesar, for his kind cooperation in the
completion of this work.
We also convey our sincere thanks to Dr. CHAKKALAKAL TOMY, Executive Director,
and Dr. T.CH. SHIVA REDDY, Principal, Sreenidhi Institute of Science and Technology,
Ghatkesar, for their kind cooperation in the completion of the group project.
Finally, we extend our sense of gratitude to all our friends, teaching and non-teaching faculty,
who directly or indirectly helped us in this endeavor.

Place: Hyderabad Name of candidates

Date: 26-02-2022 J. TANVI [Roll No.18311A04L0]


K. SREEJA [Roll No.18311A04L2]
K. LIKHITHA [Roll No.18311A04L8]

ABSTRACT
Effective strategies to restrain the COVID-19 pandemic need close attention in order to
mitigate its negative impact on public health and the global economy, with the full extent of
that impact yet to unfold. In the absence of effective antivirals and with limited medical
resources, WHO recommends many measures to control the infection rate and avoid
exhausting those resources. Wearing a mask is among the non-pharmaceutical interventions
that can be used to cut the primary source of SARS-CoV-2 droplets expelled by an infected
individual. Regardless of the discourse on medical resources and the diversity of masks, many
countries are mandating coverings over the nose and mouth in public. To contribute towards
public health, this paper aims to devise a highly accurate, real-time technique that can
efficiently detect non-masked faces in public and thus help enforce mask wearing. The proposed
technique is an ensemble of one-stage and two-stage detectors to achieve low inference time and
high accuracy. We start with ResNet50 as a baseline and apply the concept of transfer
learning to fuse high-level semantic information in multiple feature maps. In addition, we
propose a bounding box transformation to improve localization performance during
mask detection. The experiment is conducted with three popular baseline models, viz.
ResNet50, AlexNet and MobileNet. We explored the possibility of plugging these models into
the proposed model so that highly accurate results can be achieved with less inference
time. The proposed technique achieves high accuracy (98.2%) when
implemented with ResNet50. Besides, the proposed model yields 11.07% and 6.44%
higher precision and recall, respectively, in mask detection when compared to the recent
public baseline model published as the RetinaFaceMask detector. This performance makes the
proposed model highly suitable for video surveillance devices.

CONTENTS

CHAPTER 1: INTRODUCTION
1.1 Motivation
1.2 Flow
1.3 Image processing

CHAPTER 2: BACKGROUND OF FACE MASK DETECTION
2.1 Incorporated packages
2.2 TensorFlow
2.3 Keras
2.4 OpenCV

CHAPTER 3: BLOCK DIAGRAM OF FACE MASK DETECTION

CHAPTER 4: SOFTWARE REQUIRED
4.1 OpenCV
4.2 Features of OpenCV
4.3 Face detection using OpenCV

CHAPTER 5: DESIGN IMPLEMENTATION DETAILS
5.1 Data pre-processing
5.2 Conversion of RGB image to gray image
5.3 Image reshaping

CHAPTER 6: SIMULATION AND DESIGN VERIFICATIONS

CHAPTER 7: CONCLUSION AND FUTURE WORKS

CHAPTER 8: REFERENCES

INTRODUCTION
COVID-19, or coronavirus disease, has produced an atmosphere of fear because it is
transmitted through the respiratory system. Currently, there is neither a medicine nor a vaccine to
fight this virus. Therefore, the only options people have are to maintain social
distancing, wash hands regularly, and wear a mask. According to the World Health
Organization (WHO)’s official Situation Report – 205, coronavirus disease 2019 (COVID-19)
has globally infected over 20 million people, causing over 0.7 million deaths. Individuals
with COVID-19 have reported a wide range of symptoms, such as shortness of breath or
difficulty in breathing. Elderly people with lung disease are at higher risk from the
coronavirus than most. The importance of wearing masks lies in reducing the risk of exposure to an
infected individual during the “pre-symptomatic” period, which restrains the spread of the virus.
WHO stresses prioritizing medical masks and respirators for health care workers.
Therefore, face mask detection has become a crucial task in the present situation. Face mask
detection involves detecting the location of a face and then determining whether it has a
mask on it or not. The problem is closely related to general object detection, which detects
classes of objects. Face identification deals with distinguishing one specific class of entities,
i.e., faces. It has numerous applications, such as autonomous driving, education, surveillance,
and so on. In this work, deep learning, in the form of a convolutional neural network (CNN),
is used to find out who is not wearing a face mask.

FIGURE 1: Faces with and without mask

RELATED WORK
In face detection, a face is located in an image that contains several other attributes.
Research into face detection also involves expression recognition, face tracking, and
pose estimation. Given a single image, the challenge is to identify the face in the picture.
Face detection is a difficult task because faces vary in size, shape, colour, etc. and
are not invariant. It becomes harder still when the face is occluded by some other
object, is not facing the camera, and so forth. Prior work on occluded face detection
identifies two major challenges: first, the unavailability of large datasets containing
both masked and unmasked faces, and second, the absence of facial expression cues in the
covered area. Using the locally linear embedding (LLE) algorithm and dictionaries trained on a
very large pool of masked faces and synthesized normal faces, several lost
expressions can be recovered and the dominance of facial cues can be reduced to a great
extent. Convolutional neural networks (CNNs) in computer vision come with a strict
constraint on the size of the input image. The prevalent practice is to resize the images
before feeding them into the network to overcome this restriction. In other work, a robust
and efficient technique for liveness detection was proposed, in which the
authors used the deep learning Deb Net approach for feature extraction and classification.
Elsewhere, SVM was used to build a machine learning based face detection and recognition
system. That model was used to detect the faces of students in order to monitor their
activities during online examinations, and it used feature vectors from the
input images to detect faces faster. A multi-task deep learning method
called F-DR Net has also been used for recognizing and detecting faces.

FIGURE 2: Faces with and without mask

Image Processing

Image processing is a method of performing operations on an image in order to get an
enhanced image or to extract some useful information from it.

Steps to Perform Image Processing:
● Load images using Python.
● Convert the images into arrays.
● Finally, apply some algorithm to those arrays.
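
The three steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the image is generated synthetically (make_test_image is a hypothetical helper that stands in for reading a file from disk), and the threshold function is just one example of "some algorithm".

```python
import numpy as np

def make_test_image(height=4, width=6):
    """Stand-in for loading a file: a small grayscale gradient as an array."""
    row = np.linspace(0, 255, width)
    return np.tile(row, (height, 1)).astype(np.uint8)

def threshold(image, cutoff=128):
    """Example algorithm: 255 where the pixel is >= cutoff, otherwise 0."""
    return np.where(image >= cutoff, 255, 0).astype(np.uint8)

image = make_test_image()   # steps 1 and 2: an image held as an array
binary = threshold(image)   # step 3: apply an algorithm to the array
print(image.shape, binary[0])
```

In the rest of the report the array would come from OpenCV rather than from a generator function, but the later steps operate on it in the same way.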

BACKGROUND OR LITERATURE REVIEW
Two datasets have been used in the model. Dataset 1 consists of 1376 images, of which 690
show people wearing face masks and the remaining 686 show people without face masks. It mostly
contains frontal poses with a single face and the same type and colour of mask (white only).
Dataset 2, from Kaggle, consists of 853 images whose faces are annotated as either with
or without a mask. Some of its faces show head turn, tilt and slant, with multiple faces in
the frame and different types and colours of masks.
INCORPORATED PACKAGES
TensorFlow
TensorFlow, an interface for expressing machine learning (ML) algorithms, is used to
implement ML systems in various areas of computer science, including sentiment
analysis, voice recognition, geographic information extraction, computer vision, text
summarization, information retrieval, computational drug discovery and flaw detection.
In the proposed model, the whole Sequential CNN architecture (which consists of
several layers) uses TensorFlow as its backend. It is also used to reshape the data during
data processing.
Keras
Keras provides fundamental abstractions and building blocks for creating and shipping ML
solutions with high iteration velocity. It takes full advantage of the scalability and
cross-platform capabilities of TensorFlow. The core data structures of Keras are layers and models.
All the layers used in the CNN model are implemented using Keras. Along with the conversion of the
class vector to a binary class matrix in data processing, it helps to compile the overall model.
OpenCV
OpenCV (Open-Source Computer Vision Library), an open-source computer vision and
ML software library, is used to detect and recognize faces, identify objects, classify
movements in videos, track moving objects, follow eye movements, track camera
motion, remove red eyes from pictures taken with flash, find similar pictures in an
image database, recognize scenery and establish markers to overlay it with augmented reality,
and so forth. The proposed method makes use of these features of OpenCV for resizing and
colour conversion of the data images.

BLOCK DIAGRAM

OpenCV (Open-Source Computer Vision Library)
It is an open-source computer vision and machine learning software library.
● The library has more than 2500 optimized algorithms.
● It has C++, Python, Java and MATLAB interfaces and supports Windows, Linux, Android
and Mac OS.
● It helps us load images in Python and convert them into arrays.

Features of OpenCV
● Face Detection
● Geometric Transformations
● Image Thresholding
● Smoothing Images
● Canny Edge Detection
● Background Removals
● Image Segmentation

FACE DETECTION USING OPENCV

The face detection algorithm used here was introduced by Viola and Jones in 2001. All human
faces share some similar properties, and these regularities may be matched using Haar features.
A few properties common to human faces:
● The eye region is darker than the upper cheeks.
● The nose bridge region is brighter than the eyes.

Composition of properties forming matchable facial features:
● Location and size: eyes, mouth, bridge of nose
● Value: oriented gradients of pixel intensities
FIGURE 3: Features for face detection

FIGURE 4: An example for face detection

The Proposed Model


The proposed method consists of a cascade classifier and a pre-trained CNN which contains
two 2D convolution layers connected to layers of dense neurons. The algorithm for face mask
detection is as follows:

Data Pre-Processing
Data pre-processing involves converting data from a given format into a more user-friendly,
desired and meaningful format. The data can be in any form, such as tables, images, videos
or graphs. This organized information fits an information model and captures the relationships
between different entities [6]. The proposed method deals with image and video data using NumPy
and OpenCV.

Conversion of RGB image to gray image


Modern descriptor-based image recognition systems regularly work on gray scale images,
and the particular method used to convert from color to gray scale is of little consequence
when robust descriptors are used. Introducing nonessential
information could increase the size of the training data required to achieve good performance. As
gray scale simplifies the algorithm and reduces the computational requirements, it is used
for extracting descriptors instead of working on color images.

FIGURE 5: Conversion of RGB image to gray scale


We use the function cv2.cvtColor(input_image, flag) to change the color space, where flag
determines the type of conversion. In this case, the flag cv2.COLOR_BGR2GRAY is used for
gray conversion. Deep CNNs require a fixed-size input image, so all the images in the dataset
need a common fixed size. Using cv2.resize(), the gray scale image is
resized to 100 x 100.
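
To make the two operations concrete, here is a NumPy-only sketch of what they compute. It is an approximation, not the report's code: the weights are the standard luma weights that OpenCV documents for BGR-to-gray conversion, while resize_nearest is a simplified nearest-neighbour stand-in for cv2.resize(), which interpolates by default, so resized pixel values will differ slightly from OpenCV's. Both function names are chosen for this example.

```python
import numpy as np

def bgr_to_gray(image_bgr):
    """Weighted sum of the B, G, R channels (Y = 0.114 B + 0.587 G + 0.299 R)."""
    weights = np.array([0.114, 0.587, 0.299])  # channel order is B, G, R
    return np.rint(image_bgr.astype(np.float64) @ weights).astype(np.uint8)

def resize_nearest(gray, size):
    """Nearest-neighbour resize of a 2D array to size x size."""
    rows = np.arange(size) * gray.shape[0] // size
    cols = np.arange(size) * gray.shape[1] // size
    return gray[rows][:, cols]

# A pure-red 240 x 320 frame in BGR order.
frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[..., 2] = 255

gray = bgr_to_gray(frame)
resized = resize_nearest(gray, 100)
print(gray.shape, resized.shape)
```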

IMAGE RESHAPING
The input for image classification is a three-dimensional tensor, where each channel
has a prominent unique pixel. All the images must have the same size, corresponding to the 3D
feature tensor. However, neither the images nor their corresponding
feature tensors are usually of the same size [10]. Most CNNs can only accept fixed-size images. This
introduces several problems throughout data collection and model implementation, which can be
solved by resizing the input images before feeding them into the network [11]. The images
are normalized so that the pixel values lie between 0 and 1. They are then converted to
4-dimensional arrays using data = np.reshape(data, (data.shape[0], img_size, img_size, 1)),
where 1 indicates the single channel of a gray scale image. As the final layer of the neural
network has 2 outputs, with mask and without mask, i.e., a categorical representation, the
target labels are converted to categorical form.
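
A minimal sketch of these three steps follows. The random images and the targets array are placeholders invented for the example, and np.eye is used here as a stand-in for whatever utility the authors used to build the binary class matrix.

```python
import numpy as np

img_size = 100
rng = np.random.default_rng(0)

# Placeholder data: five random 100 x 100 grayscale images and their labels.
images = rng.integers(0, 256, size=(5, img_size, img_size), dtype=np.uint8)
targets = np.array([0, 1, 1, 0, 1])

# Normalize pixel values to the range [0, 1].
data = images.astype(np.float32) / 255.0

# Add the trailing channel axis: (samples, height, width, 1).
data = np.reshape(data, (data.shape[0], img_size, img_size, 1))

# One-hot encode the labels: class 0 -> [1, 0], class 1 -> [0, 1].
one_hot = np.eye(2, dtype=np.float32)[targets]

print(data.shape, one_hot.shape)
```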

TRAINING OF MODEL
Data Mapping
Data visualization is the process of transforming abstract data into meaningful representations,
using encodings for knowledge communication and insight discovery. It is helpful for
studying particular patterns in the dataset [7]. The total number of images in the dataset is
visualized in both categories, ‘with mask’ and ‘without mask’. The statement
categories = os.listdir(data_path) lists the directories in the specified data path. The variable
categories now looks like: [‘with mask’, ‘without mask’]. Then, to find the number of labels,
we need to distinguish those categories using labels = [i for i in range(len(categories))]. It sets
the labels as: [0, 1]. Now, each category is mapped to its respective label using
label_dict = dict(zip(categories, labels)). Here zip first returns an iterator of tuples, in which
the items of the passed iterables are paired together in order, and dict turns those pairs into a
mapping. The mapped variable label_dict looks like: {‘with mask’: 0, ‘without mask’: 1}
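
The mapping can be reproduced as below. The category list is hard-coded so the snippet runs without a dataset on disk, and build_label_dict is a name chosen for this example. One assumption goes beyond the report: os.listdir returns entries in arbitrary order, so the sketch sorts the names to keep the label assignment stable between runs.

```python
def build_label_dict(categories):
    """Map each category name to an integer label, in sorted order."""
    ordered = sorted(categories)
    labels = [i for i in range(len(ordered))]
    return dict(zip(ordered, labels))

# Stands in for categories = os.listdir(data_path).
categories = ["without mask", "with mask"]
label_dict = build_label_dict(categories)
print(label_dict)
```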
Splitting the data and training the CNN model
After setting the blueprint to analyse the data, the model needs to be trained on one
dataset and then tested against a different dataset. A proper model and an optimized
train-test split help to produce accurate predictions. The test size is set to 0.1,
i.e., 90% of the dataset is used for training and the remaining 10% for testing.
The validation loss is monitored using ModelCheckpoint. Next, the images in the training set
and the test set are fitted to the Sequential model. Here, 20% of the training data is used as
validation data. The model is trained for 20 epochs (iterations), which maintains a trade-off
between accuracy and the chance of overfitting. Fig. 4 depicts a visual representation of the
proposed model.
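
The split sizes can be checked with a short sketch. It shuffles indices with NumPy instead of calling a library splitter, so the rounding may differ by one sample from what the authors' tooling produces; split_indices and the seed are assumptions made for the example, and 1376 is the size of dataset 1.

```python
import numpy as np

def split_indices(n_samples, test_size=0.1, seed=0):
    """Shuffle the sample indices and hold out a test fraction."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_size))
    return order[n_test:], order[:n_test]

train_idx, test_idx = split_indices(1376, test_size=0.1)

# 20% of the training portion is then set aside for validation.
n_val = int(len(train_idx) * 0.2)

print(len(train_idx), len(test_idx), n_val)
```

Keeping the test indices disjoint from the training indices is what makes the reported test accuracy meaningful, since the model never sees those images during training.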

BUILDING THE MODEL USING CNN ARCHITECTURE


The current method makes use of a Sequential CNN. The first convolution layer is followed
by Rectified Linear Unit (ReLU) and MaxPooling layers. The convolution layer learns
200 filters. The kernel size is set to 3 x 3, which specifies the height and width of the 2D
convolution window. As the model needs to know the shape of the input to expect, the
first layer in the model must be given information about the input shape. The
following layers can infer their shapes automatically [13]. In this case, input shape is
specified as data.shape[1:], which returns the dimensions of the data array from index 1
onwards. The default padding is “valid”, where the spatial dimensions are allowed to shrink
and the input volume is not zero-padded. The activation parameter of the Conv2D class is set
to “relu”. It represents an approximately linear function that retains many of the properties
that make linear models easy to optimize with gradient-descent methods. Considering
performance and generalization in deep learning, it generally compares well with other
activation functions. Max pooling is used to reduce the spatial dimensions of the output
volume. The pool size is set to 3 x 3, and with “valid” padding the resulting output has a
shape (number of rows or columns) of: shape of output = floor((input shape - pool size) /
strides) + 1, where strides defaults to the pool size when it is not specified. As shown in
Fig. 5, the second convolution layer has 100 filters and its kernel size is set to 3 x 3. It is
followed by ReLU and MaxPooling layers. To pass the data on to the classifier, the output of
the convolutional layers goes through a Flatten layer, which transforms the matrix of features
into a vector that can be fed into a fully connected neural network classifier. To reduce
overfitting, a Dropout layer with a 50% chance of setting inputs to zero is added to the model.
Then a Dense layer of 64 neurons with a ReLU activation function is added. The final layer,
with two outputs for the two categories, uses the softmax activation function.
The learning process needs to be configured first with the compile method. Here the “adam”
optimizer is used, and categorical cross-entropy, also known as multiclass log loss, is used
as the loss function (the objective that the model tries to minimize). As the problem is a
classification problem, metrics is set to “accuracy”.
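
The layer-size arithmetic for the architecture described in this section can be traced with a few lines of Python. The sketch assumes the 100 x 100 input produced during pre-processing, “valid” padding throughout, unit stride for the convolutions, and a pooling stride equal to the pool size; conv_output and pool_output are helper names chosen for this example.

```python
def conv_output(size, kernel, stride=1):
    """Output rows/columns of a 'valid' convolution."""
    return (size - kernel) // stride + 1

def pool_output(size, pool, stride=None):
    """Output rows/columns of 'valid' max pooling; stride defaults to pool."""
    stride = pool if stride is None else stride
    return (size - pool) // stride + 1

size = 100                    # input is 100 x 100 x 1
size = conv_output(size, 3)   # first convolution, 3 x 3 kernel
size = pool_output(size, 3)   # first max pooling, 3 x 3 pool
size = conv_output(size, 3)   # second convolution, 3 x 3 kernel
size = pool_output(size, 3)   # second max pooling, 3 x 3 pool

flattened = size * size * 100  # the second convolution has 100 filters
print(size, flattened)
```

Under these assumptions the Flatten layer hands a vector of 10,000 values to the Dense layer of 64 neurons.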

MACHINE LEARNING
Machine learning is a method of data analysis that automates analytical model building. It is a
branch of artificial intelligence based on the idea that systems can learn from data, identify
patterns and make decisions with minimal human intervention.
The types of machine learning algorithms are mainly divided into four categories:
● Supervised learning
● Un-supervised learning
● Semi-supervised learning
● Reinforcement learning

FIGURE 6: Algorithms for machine learning

SCIKIT-LEARN
Scikit-learn is one of the most widely used and robust libraries for machine learning in Python.
It features various algorithms such as support vector machines, random forests, and k-nearest
neighbours, and it is built to work with the Python numerical and scientific libraries NumPy and SciPy.

FIGURE 7: An output with mask

FIGURE 9: An output without mask

Let’s dive into the code for face mask detector project:

We are going to build this project in two parts. In the first part, we will write a python script
using Keras to train face mask detector model. In the second part, we test the results in a real-
time webcam using OpenCV.

Make a python file train.py to write the code for training the neural network on our dataset.
Follow the steps:

1. Imports:

Import all the libraries and modules required.

from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
import cv2
from keras.models import Sequential
from keras.layers import (Conv2D, Input, ZeroPadding2D, BatchNormalization,
                          Activation, MaxPooling2D, Flatten, Dense, Dropout)
from keras.models import Model, load_model
from keras.callbacks import TensorBoard, ModelCheckpoint
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from sklearn.utils import shuffle
import imutils
import numpy as np

2. Build the neural network:

This convolutional network consists of two pairs of Conv and MaxPool layers to extract
features from the dataset. These are followed by a Flatten layer to convert the data to 1D
and a Dropout layer to reduce overfitting.

Two Dense layers then perform the classification.

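model = Sequential([
    # Two Conv + MaxPool pairs extract features from 150 x 150 RGB inputs.
    Conv2D(100, (3,3), activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D(2,2),
    Conv2D(100, (3,3), activation='relu'),
    MaxPooling2D(2,2),
    # Flatten to 1D, then drop half the activations to reduce overfitting.
    Flatten(),
    Dropout(0.5),
    Dense(50, activation='relu'),
    # Two softmax outputs, one per class.
    Dense(2, activation='softmax')
])
# With two softmax outputs and one-hot labels, binary cross-entropy gives
# the same value as categorical cross-entropy.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])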
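
To get a feel for the size of the network defined above, its trainable parameters can be counted by hand. The sketch assumes “valid” padding, a pooling stride equal to the pool size, and a bias term in every layer (the Keras defaults); the helper names are chosen for this example, and the totals are derived from the layer definitions rather than taken from the report.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

side = 150
side = side - 3 + 1   # Conv2D 3 x 3
side = side // 2      # MaxPooling2D 2 x 2
side = side - 3 + 1   # Conv2D 3 x 3
side = side // 2      # MaxPooling2D 2 x 2
flat = side * side * 100

counts = {
    "conv1": conv2d_params(3, 3, 100),
    "conv2": conv2d_params(3, 100, 100),
    "dense1": dense_params(flat, 50),
    "dense2": dense_params(50, 2),
}
total = sum(counts.values())
print(flat, counts, total)
```

Almost all of the parameters sit in the first Dense layer, which is where the Dropout layer placed just before it has the most effect.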

3. Image Data Generation/Augmentation:

TRAINING_DIR = "./train"
# Augment the training images; rescale maps pixel values to [0, 1].
train_datagen = ImageDataGenerator(rescale=1.0/255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(TRAINING_DIR,
                                                    batch_size=10,
                                                    target_size=(150, 150))
VALIDATION_DIR = "./test"
# Validation images are only rescaled, not augmented.
validation_datagen = ImageDataGenerator(rescale=1.0/255)
validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              batch_size=10,
                                                              target_size=(150, 150))

4. Initialize a callback checkpoint to keep saving the best model after each epoch while training:

checkpoint = ModelCheckpoint('model2-{epoch:03d}.model', monitor='val_loss', verbose=0, save_best_only=True, mode='auto')

5. Train the model:

# fit_generator is deprecated in newer Keras releases, where model.fit
# accepts generators directly.
history = model.fit_generator(train_generator,
                              epochs=10,
                              validation_data=validation_generator,
                              callbacks=[checkpoint])
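
The effect of save_best_only=True can be illustrated without Keras. The sketch below mimics the rule the callback applies when monitoring val_loss: a checkpoint is written only for epochs whose validation loss is lower than every earlier epoch. The function name and the loss values are invented for the example.

```python
def epochs_saved(val_losses):
    """Return the 1-based epochs at which a best-only checkpoint is written."""
    saved, best = [], float("inf")
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # only a strict improvement triggers a save
            best = loss
            saved.append(epoch)
    return saved

# Hypothetical validation losses for five epochs.
print(epochs_saved([0.90, 0.70, 0.80, 0.60, 0.65]))
```

One practical consequence is that a file for a given epoch exists only if that epoch improved on all earlier ones, so any script that loads a checkpoint by a fixed epoch number should be checked against the files actually written.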

Now we will test the results of face mask detector model using OpenCV.

Make a python file “test.py” and paste the below script.

import cv2
import numpy as np
from keras.models import load_model

# The path must point to a checkpoint written during training.
model=load_model("./model-010.h5")
# The label order must match train_generator.class_indices from training.
results={0:'without mask',1:'mask'}
GR_dict={0:(0,0,255),1:(0,255,0)}  # BGR colours: red and green
rect_size = 4
cap = cv2.VideoCapture(0)
haarcascade = cv2.CascadeClassifier('/home/user_name/.local/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_default.xml')

while True:
    (rval, im) = cap.read()
    if not rval:
        break  # no frame could be read from the webcam
    im=cv2.flip(im,1)  # mirror the frame horizontally
    # Detect faces on a downscaled copy to speed up the cascade.
    rerect_size = cv2.resize(im, (im.shape[1] // rect_size, im.shape[0] // rect_size))
    faces = haarcascade.detectMultiScale(rerect_size)
    for f in faces:
        # Scale the box back to full-frame coordinates.
        (x, y, w, h) = [v * rect_size for v in f]
        face_img = im[y:y+h, x:x+w]
        rerect_sized=cv2.resize(face_img,(150,150))
        normalized=rerect_sized/255.0
        reshaped=np.reshape(normalized,(1,150,150,3))
        reshaped = np.vstack([reshaped])
        result=model.predict(reshaped)
        label=np.argmax(result,axis=1)[0]
        cv2.rectangle(im,(x,y),(x+w,y+h),GR_dict[label],2)
        cv2.rectangle(im,(x,y-40),(x+w,y),GR_dict[label],-1)
        cv2.putText(im, results[label], (x, y-10),cv2.FONT_HERSHEY_SIMPLEX,0.8,(255,255,255),2)
    cv2.imshow('LIVE', im)
    key = cv2.waitKey(10)
    if key == 27:  # Esc key
        break

cap.release()
cv2.destroyAllWindows()
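
Two small steps in the script are easy to check in isolation: mapping a box found on the downscaled frame back to full-frame coordinates, and turning the model's output into a label. The sketch below isolates them with NumPy only; scale_box and decode are helper names chosen for this example, and the probabilities are made up.

```python
import numpy as np

RESULTS = {0: "without mask", 1: "mask"}  # mapping copied from the script

def scale_box(box, factor):
    """Scale an (x, y, w, h) box from the small frame to the full frame."""
    return tuple(int(v) * factor for v in box)

def decode(probabilities):
    """Pick the most likely class from a (1, 2) array of softmax outputs."""
    label = int(np.argmax(probabilities, axis=1)[0])
    return label, RESULTS[label]

box = scale_box((10, 20, 30, 40), 4)
label, text = decode(np.array([[0.2, 0.8]]))
print(box, label, text)
```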

FIGURE 10: A sample video with and without mask

RESULTS AND ANALYSIS


The model is trained, validated and tested on two datasets. On dataset 1, the
method attains an accuracy of up to 95.77%, as shown in Fig. 7. Fig. 6 depicts how this optimized
accuracy mitigates the cost of error. Dataset 2 is more varied than dataset 1, as it has
multiple faces in the frame and different types of masks with different colours. As a result, the
model attains a slightly lower accuracy of 94.58% on dataset 2, as shown in Fig. 9. Fig. 8 depicts the
contrast between training and validation loss for dataset 2. One of the main reasons
for achieving this accuracy lies in MaxPooling. It provides rudimentary translation invariance to
the internal representation and reduces the number of parameters the model has to
learn. This sample-based discretization process down-samples the input representation
by reducing its dimensionality. The number of neurons has the optimized value of 64, which is not
too high. A much higher number of neurons and filters can lead to worse performance. The optimized
filter values and pool size help to filter out the face in order to detect the existence of a mask correctly
without causing over-fitting.
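
The down-sampling and the translation invariance mentioned above can be seen in a small NumPy sketch of 2 x 2 max pooling. It is a simplified illustration with a fixed window and stride of 2, not the Keras layer itself, and max_pool_2x2 is a name chosen for this example.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 on a 2D array."""
    h, w = feature_map.shape
    trimmed = feature_map[: h - h % 2, : w - w % 2]  # drop odd edge rows/cols
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 2, 0, 0],
                 [3, 4, 0, 0],
                 [0, 0, 5, 6],
                 [0, 0, 7, 8]])
pooled = max_pool_2x2(fmap)  # 4 x 4 input becomes 2 x 2

# A strong response shifted by one pixel inside the same window pools equally.
a = np.zeros((4, 4)); a[0, 0] = 9
b = np.zeros((4, 4)); b[1, 1] = 9
same = np.array_equal(max_pool_2x2(a), max_pool_2x2(b))
print(pooled, same)
```

The invariance only holds for shifts that stay within one pooling window, which is why it is described as rudimentary.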

The system can efficiently detect faces that are partially occluded (either with a mask or hair
or hand). Based on the occlusion degree of four regions (nose, mouth, chin and eye) it
differentiates between annotated mask and face covered by hand. Therefore, a mask covering
the face fully including nose and chin will only be treated as “with mask” by the model.

The main challenges faced by the method are varying angles and lack of
clarity. The movement of indistinct faces in the video stream makes detection more difficult.
However, following the trajectories across several frames of the video helps to reach a better
decision: “with mask” or “without mask”.
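
The report does not describe how information from several frames is combined. One simple possibility, shown here purely as an assumption, is a majority vote over the most recent per-frame predictions, so that a single misclassified frame does not flip the displayed label. LabelSmoother and its window size are hypothetical.

```python
from collections import Counter, deque

class LabelSmoother:
    """Majority vote over the last `window` per-frame labels."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, label):
        """Record the newest label and return the current majority label."""
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]

smoother = LabelSmoother(window=5)
frames = [1, 1, 0, 1, 1, 0, 0, 0]  # made-up per-frame predictions
smoothed = [smoother.update(label) for label in frames]
print(smoothed)
```

The isolated 0 in the third frame is suppressed, while the sustained run of 0s at the end eventually changes the decision. In a multi-face stream, one smoother per tracked face would be needed, which in turn requires associating detections across frames.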

BENEFITS

Manual monitoring makes it very difficult for officers to check whether people are wearing
masks or not. So, in our technique, we use a webcam to detect people’s faces and help
prevent virus transmission.

● It is fast and has high accuracy.
● The system can be implemented in ATMs, banks, etc.
● Our technique can help keep people safe.
● It provides a buzzer sound as a reminder to wear a mask.

FUTURE SCOPE
In this work, a deep learning-based approach for detecting masks over faces in public places
to curtail the community spread of Coronavirus is presented. The proposed technique
efficiently handles occlusions in dense situations by making use of an ensemble of single and
two-stage detectors at the pre-processing level.
The ensemble approach not only helps in achieving high accuracy but also improves
detection speed considerably. Furthermore, the application of transfer learning on pre-trained
models, with extensive experimentation over an unbiased dataset, resulted in a highly robust
and low-cost system. The identity detection of faces violating the mask norms further
increases the utility of the system for public benefit.
Finally, the work opens interesting future directions for researchers. Firstly, the proposed
technique can be integrated into any high-resolution video surveillance device and is not
limited to mask detection only. Secondly, the model can be extended to detect facial
landmarks with a face mask for biometric purposes.

CONCLUSION
Wearing a face mask all the time is a difficult and exhausting task, but it has become obligatory
since the Covid-19 crisis because face masks can help control the spread of the virus. Many
public service providers require customers to wear masks in order to be served. In
this paper, we first briefly explained the motivation of the work. Then, we illustrated the
learning and performance tasks of the model. Using basic ML tools and simplified techniques,
the method has achieved reasonably high accuracy. In future, the model can be extended to
detect whether a person is wearing the mask properly (as instructed by WHO) and also to detect the
type of mask.

REFERENCES
1. World Health Organization et al. Coronavirus disease 2019 (covid-19): situation report, 96.
2020. (n.d.).
https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200816-covid-19-sitrep-209.pdf?sfvrsn=5dde1ca2_2.
2. Social distancing, surveillance, and stronger health systems as keys to controlling COVID-
19 Pandemic, PAHO Director says - PAHO/WHO | Pan American Health Organization.
(n.d.).
https://www.paho.org/en/news/2-6-2020-social-distancing-surveillance-and-stronger-health-systems-keys-controlling-covid-19.
3. Garcia Godoy L.R. Facial protection for healthcare workers during pandemics: a scoping
review, BMJ. Glob. Heal. 2020;5(5) doi: 10.1136/bmjgh-2020-002553.
4. Eikenberry S.E. To mask or not to mask: Modeling the potential for face mask use by the
general public to curtail the COVID-19 pandemic. Infect. Dis. Model. 2020;5:293–308.
doi: 10.1016/j.idm.2020.04.001.
5. Wearing surgical masks in public could help slow COVID-19 pandemic’s advance: Masks
may limit the spread of diseases including influenza, rhinoviruses and coronaviruses --
ScienceDaily. (n.d.). https://www.sciencedaily.com/releases/2020/04/200403132345.htm.
6. Nanni L., Ghidoni S., Brahnam S. Handcrafted vs. non-handcrafted features for computer
vision classification. Pattern Recogn. 2017;71:158–172.
doi: 10.1016/j.patcog.2017.05.025.
7. Y. Jia et al., Caffe: Convolutional architecture for fast feature embedding, in: MM 2014 -
Proceedings of the 2014 ACM Conference on Multimedia, 2014, doi:
10.1145/2647868.2654889.
8. P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. Lecun, OverFeat:
Integrated Recognition, Localization and Detection using Convolutional Networks, 2014.
9. Erhan D., Szegedy C., Toshev A., Anguelov D. Scalable Object Detection using Deep Neural
Networks. Proceedings of the IEEE conference on computer vision and pattern recognition.
2014; pp. 2147–2154.
10. J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time
object detection, in: Proceedings of the IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, 2016, vol. 2016-Decem, pp. 779–788, doi:
10.1109/CVPR.2016.91.
11. M. Jiang, X. Fan, and H. Yan, RetinaMask: A Face Mask detector, 2020,
http://arxiv.org/abs/2005.03950.
12. Inamdar M., Mehendale N. Real-Time Face Mask Identification Using Facemasknet
Deep Learning Network. SSRN Electron. J. 2020
doi: 10.2139/ssrn.3663305.
13. Qiao S., Liu C., Shen W., Yuille A. Few-Shot Image Recognition by Predicting
Parameters from Activations. Proceedings of the IEEE Computer Society Conference on
Computer Vision and Pattern Recognition. 2018.

14. Kumar A., Zhang Z.J., Lyu H. Object detection in real time based on improved single
shot multi-box detector algorithm. J. Wireless Com. Netw. 2020;2020:204.
doi: 10.1186/s13638-020-01826-x.
15. Morera Á., Sánchez Á., Moreno A.B., Sappa Á.D., Vélez J.F. SSD vs. YOLO for
detection of outdoor urban advertising panels under multiple variabilities. Sensors
(Switzerland) 2020 doi: 10.3390/s20164587.
16. Girshick R., Donahue J., Darrell T., Malik J. Region-based Convolutional Networks for
Accurate Object Detection and Segmentation. IEEE Trans. Pattern Anal. Mach.
Intell. 2015;38(1):142–158. doi: 10.1109/TPAMI.2015.2437384.
17. He K., Zhang X., Ren S., Sun J. Spatial Pyramid Pooling in Deep Convolutional
Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015
doi: 10.1109/TPAMI.2015.2389824.
18. R. Girshick, Fast R-CNN, in: Proc. IEEE Int. Conf. Comput. Vis., vol. 2015 Inter, 2015,
pp. 1440–1448, doi: 10.1109/ICCV.2015.169.
19. Nguyen N.D., Do T., Ngo T.D., Le D.D. An Evaluation of Deep Learning Methods for
Small Object Detection. J. Electr. Comput. Eng. 2020;2020
doi: 10.1155/2020/3189691.
20. Cai Z., Fan Q., Feris R.S., Vasconcelos N. A unified multi-scale deep convolutional
neural network for fast object detection. Lect. Notes Comput. Sci. (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2016
doi: 10.1007/978-3-319-46493-0_22.
21. C.-Y. Fu, W. Liu, A. Ranga, A. Tyagi, A.C. Berg, DSSD : Deconvolutional Single Shot
Detector, 2017, arXiv preprint arXiv:1701.06659 (2017).
22. A. Shrivastava, R. Sukthankar, J. Malik, A. Gupta, Beyond Skip Connections: Top-Down
Modulation for Object Detection, 2016, arXiv preprint arXiv:1612.06851  (2016).
23. N. Dvornik, K. Shmelkov, J. Mairal, C. Schmid, BlitzNet: A Real-Time Deep Network
for Scene Understanding, in: Proceedings of the IEEE International Conference on Computer
Vision, 2017, doi: 10.1109/ICCV.2017.447.
24. Z. Liang, J. Shao, D. Zhang, L. Gao, Small object detection using deep feature pyramid
networks, in: Lecture Notes in Computer Science (including subseries Lecture Notes in
Artificial Intelligence and Lecture Notes in Bioinformatics), 2018, vol. 11166 LNCS, pp.
554–564, doi: 10.1007/978-3-030-00764-5_51.
25. K. He, G. Gkioxari, P. Dollar, R. Girshick, Mask R-CNN, in: Proc. IEEE Int. Conf.
Comput. Vis., vol. 2017-Octob, 2017, pp. 2980–2988, doi: 10.1109/ICCV.2017.322.
