

Real-time social distancing detector using SocialdistancingNet-19 deep learning network
Rinkal Keniya · Ninad Mehendale

Received: date / Accepted: date

Abstract Without doubt, the COVID-19 pandemic has brought the world to a halt. The world we lived in a few months ago is completely different from what it is now. The virus is spreading quickly and is a danger to the human race. Given the urgency of the situation, one must take certain precautions, one of which is social distancing. Maintaining social distancing during COVID-19 is a must to slow the growth rate of new cases. Our manuscript focuses on detecting whether the people around are maintaining social distancing. Our self-developed model, named SocialdistancingNet-19, detects the frame of each person and displays labels: people are marked as unsafe if the distance between them is less than a certain value, and as safe otherwise. This system can be used for monitoring people via CCTV video surveillance. Our model achieved an accuracy of 92.8 %.

Keywords Social distancing · Object detection · COVID

1 Introduction

Coronavirus disease (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The disease was first detected in Wuhan, China in December 2019 and has since spread across the world. The virus spreads mainly between individuals in close contact, including through tiny droplets formed when sneezing or coughing. Droplets, rather than falling to the ground, can pass through the air and into the body of a person. The infection is most contagious during the first three days. Typical symptoms include nausea, dry cough, and fatigue. Its severe and harmful consequences for people have contributed to a worldwide halt. Other signs may include sore throat and headache. It takes a fortnight for a person with mild symptoms to recover. The duration of recovery for individuals with severe symptoms depends on the severity, along with the individual's immune capability. The main diagnostic approach is real-time reverse transcription polymerase chain reaction (rRT-PCR) on a nasopharyngeal swab. Chest CT imaging is also useful for diagnosing people with an elevated probability of infection based on signs and risk factors. Seeing the devastating spread of the disease, the World Health Organization (WHO) suggested favoring the term physical distancing over social distancing. To slow down the rate of spread of the disease, it is necessary to maintain physical distance. Maintaining a distance of two meters between two individuals is a must to remain safe and get back to the world we lived in a few months ago. With the COVID-19 pandemic, the CDC defined social distancing as staying out of congregate settings, avoiding public gatherings, and maintaining, when possible, a gap of around six feet or two meters from everyone. Recent findings have shown that droplets from a sneeze or a deep breath during exercise can travel more than six meters. Hence, maintaining the norm of social distancing is a necessity and is also to our benefit in living a safer and healthier life. Our work proposes to determine whether or not an individual is following the rule of social distancing. The findings are verified using both a live stream and a video feed. By measuring the gap between the centroids of the frames of two people, we can determine whether or not they are maintaining social distancing. They are then labelled as safe or unsafe.

* Corresponding author
N. Mehendale
B-412, K. J. Somaiya College of Engineering, Mumbai, India
Tel.: +91-9820805405
E-mail: ninad@somaiya.edu

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=3669311

Fig. 1 A video stream or an image is fed as input to our self-developed model, SocialdistancingNet-19. People are classified as maintaining social distancing or not depending on the distance between two individuals. They are marked with frames of different colours, and a label is shown for each of them.
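
The distance rule summarised in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `label_people`, the example centroids, and the 75-pixel threshold are hypothetical, and the comparison is made in image pixels, as the methodology describes.

```python
import math
from itertools import combinations

def label_people(centroids, min_distance_px):
    """Label each person 'safe' or 'unsafe' from pairwise centroid distances.

    centroids: list of (x, y) pixel coordinates, one per detected person.
    min_distance_px: hypothetical pixel threshold; a pair closer than this
    is treated as violating social distancing.
    """
    unsafe = set()
    for (i, a), (j, b) in combinations(enumerate(centroids), 2):
        # Euclidean distance between the two centroids, in pixels.
        if math.dist(a, b) < min_distance_px:
            unsafe.update((i, j))
    return ["unsafe" if i in unsafe else "safe" for i in range(len(centroids))]

# Made-up centroids: the first two people stand 50 px apart.
labels = label_people([(100, 200), (150, 200), (400, 220)], min_distance_px=75)
print(labels)  # ['unsafe', 'unsafe', 'safe']
```

Because the threshold is expressed in pixels, it would have to be chosen per camera; the sketch makes no attempt to convert pixels to meters.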

Fig. 2 The model is first trained by loading the dataset into it. Later, the trained model is loaded and objects are detected in the image or video stream. Depending on the distance, frames are then drawn around the people, along with labels indicating whether they are maintaining or violating social distancing.
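
Before the training step shown in Fig. 2, Section 3 reports that the image set is divided 60 % / 10 % / 30 % into training, validation and test data. The sketch below shows one way such a split could be done. The function name `split_dataset`, the file names, the fixed seed and the rounding of fractional counts are assumptions; the paper does not state how the split was implemented.

```python
import random

def split_dataset(samples, train_pct=60, val_pct=10, seed=0):
    """Shuffle samples and split them into train, validation and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # fixed seed: reproducible split
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # the remainder, roughly 30 %
    return train, val, test

# Hypothetical (image path, label) rows standing in for the 295 loaded images.
rows = [(f"img_{i:03d}.jpg", "person") for i in range(295)]
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 177 29 89
```

Integer arithmetic is used for the subset sizes so that the counts do not depend on floating-point rounding.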


Fig. 3 SocialdistancingNet-19 has an architecture of 19 layers. The network is fed with an input image, which is passed through convolution, batch normalization and ReLU (Rectified Linear Unit) layers. It then passes through a single max pooling layer, two convolution layers, two batch normalization layers, two ReLU layers and a single addition layer, followed by single convolution, batch normalization and ReLU layers. Finally, it passes through a fully connected layer and a softmax layer, which yield the classification output.
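
The layer sequence of Fig. 3 can be written out as plain data together with the output sizes quoted in Section 3, which makes it easy to check that the description adds up to 19 layers. This is only a bookkeeping sketch, not a trainable network. Several points are assumptions: the ordering of layers inside the two-convolution block, counting the image input and classification output as layers, including the global average pooling layer (mentioned in Section 3 but not in the caption), and the size given for the classification output.

```python
from collections import Counter

# Each entry: (layer type, output size as height x width x channels).
LAYERS = [
    ("image input",            (224, 224, 3)),
    ("convolution",            (112, 112, 64)),
    ("batch normalization",    (112, 112, 64)),
    ("ReLU",                   (112, 112, 64)),
    ("max pooling",            (56, 56, 64)),
    # Two-convolution block with a skip connection (ordering assumed).
    ("convolution",            (56, 56, 64)),
    ("batch normalization",    (56, 56, 64)),
    ("ReLU",                   (56, 56, 64)),
    ("convolution",            (56, 56, 64)),
    ("batch normalization",    (56, 56, 64)),
    ("addition",               (56, 56, 64)),
    ("ReLU",                   (56, 56, 64)),
    ("convolution",            (56, 56, 32)),
    ("batch normalization",    (56, 56, 32)),
    ("ReLU",                   (56, 56, 32)),
    ("global average pooling", (1, 1, 32)),
    ("fully connected",        (1, 1, 10)),
    ("softmax",                (1, 1, 10)),
    ("classification output",  (1, 1, 10)),  # size assumed equal to softmax
]

counts = Counter(name for name, _ in LAYERS)
print(len(LAYERS))            # 19
print(counts["convolution"])  # 4
```

Under these counting assumptions the total matches the 19 layers that give the network its name.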

Model                                        Accuracy (%)

Yadav et al. [1]                             91
Sener et al. [2]                             93.3
Liu et al. [3] (SSD300)                      74.3
Liu et al. [3] (SSD512)                      76.8
ResNet-50                                    86.5
ResNet-18                                    85.3
SocialdistancingNet-19 (Proposed method)     92.8

Table 1 Comparison of the accuracy values of the different methodologies. SocialdistancingNet-19 gave the highest accuracy among the networks we trained (ResNet-50 and ResNet-18) and the second-highest overall, after Sener et al. [2].
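
The ranking implied by Table 1 can be checked with a short script. The dictionary name `accuracy` is hypothetical; the values are copied from the table as reported, and the script says nothing about whether the figures were measured on comparable tasks or datasets.

```python
# Accuracy values (%) copied from Table 1.
accuracy = {
    "Yadav et al. [1]": 91.0,
    "Sener et al. [2]": 93.3,
    "Liu et al. [3] (SSD300)": 74.3,
    "Liu et al. [3] (SSD512)": 76.8,
    "ResNet-50": 86.5,
    "ResNet-18": 85.3,
    "SocialdistancingNet-19": 92.8,
}

# Sort from highest to lowest accuracy.
ranked = sorted(accuracy.items(), key=lambda item: item[1], reverse=True)
best, worst = ranked[0], ranked[-1]
proposed_rank = [name for name, _ in ranked].index("SocialdistancingNet-19") + 1

print(best)           # ('Sener et al. [2]', 93.3)
print(worst)          # ('Liu et al. [3] (SSD300)', 74.3)
print(proposed_rank)  # 2
```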

2 Literature review

Various research work has been carried out on social distancing using different techniques. Yadav et al. [1] proposed a system that used a Raspberry Pi 4 with a camera to automatically monitor public spaces in real time to prevent the spread of COVID-19. The model, trained on a custom data set, was installed on the Raspberry Pi 4, and the camera was attached to it. The camera feeds real-time video of public places to the model on the Raspberry Pi 4, which continuously and automatically monitors public places, detects whether people keep a safe social distance, and also checks whether or not those people wear masks. Their method operates in two stages: first, when a person is identified without a mask, his photo is taken and sent to a control center at the State Police Headquarters; and second, when a social distance violation by individuals is detected continuously for a threshold time, an alarm rings that instructs people to maintain social distance, and a critical alert is sent to the control center of the State Police Headquarters for further action. They achieved an accuracy of 91 %. Singh Punn et al. [4] proposed a real-time deep learning framework to monitor social distancing using object detection and tracking approaches. The number of violations was given by computing the number of groups formed, and the violation index term was computed as the ratio of the number of people to the number of groups. Different object detection models were used, such as Faster RCNN, SSD, and YOLO v3, of which YOLO v3 showed balanced performance in terms of FPS and mAP score. An AI monocular camera-based real-time system to monitor social distancing was proposed by Yang et al. [5]. The proposed method uses a


Fig. 4 Results when the input video and images were given to the model. In (a) and (b), people were classified as maintaining social distancing or not depending on the distance between two individuals, and were marked with frames of different colours. A green frame marks people violating social distancing, labelled as unsafe. A purple frame marks those maintaining social distancing, labelled as safe. In (c) and (d), people are detected and frames are drawn according to the distance between two individuals, and the number of violations is also counted.
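
The per-frame annotation and violation count shown in Fig. 4 could be produced along the following lines. This is a sketch under several assumptions, not the authors' code: the detector is assumed to return each box as (center x, center y, width, height) in pixels, the function names and the 75-pixel threshold are hypothetical, and a violation is counted per person rather than per pair, which the paper does not specify. The colour mapping follows the Fig. 4 caption.

```python
import math

# Frame colours as described in the Fig. 4 caption.
COLOURS = {"unsafe": "green", "safe": "purple"}

def top_left(box):
    """Convert a (cx, cy, w, h) box to its top-left corner (x, y)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2)

def annotate_frame(boxes, min_distance_px):
    """Return per-person annotations and the number of people in violation."""
    centroids = [(cx, cy) for cx, cy, _, _ in boxes]
    annotations = []
    for i, box in enumerate(boxes):
        # A person is unsafe if any other centroid lies within the threshold.
        too_close = any(
            math.dist(centroids[i], centroids[j]) < min_distance_px
            for j in range(len(boxes)) if j != i
        )
        label = "unsafe" if too_close else "safe"
        annotations.append(
            {"top_left": top_left(box), "label": label, "colour": COLOURS[label]}
        )
    violations = sum(a["label"] == "unsafe" for a in annotations)
    return annotations, violations

# Made-up detections for one frame.
boxes = [(100, 200, 40, 120), (150, 200, 40, 120), (400, 220, 50, 130)]
annotations, violations = annotate_frame(boxes, min_distance_px=75)
print(violations)                  # 2
print(annotations[0]["top_left"])  # (80.0, 140.0)
```

In a live stream the same function would be called once per frame, so the displayed count updates as people move.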

critical social density to avoid overcrowding by modulating inflow to the region of interest. The method was verified using three different pedestrian crowd datasets. There were some missed detections in the train station dataset, because in some areas the density of pedestrians is very high and occlusion occurs. However, after some analysis, they concluded that most pedestrians were captured and that the idea of social density is valid. In the method proposed by Sener et al. [2], the motion of the interacting people was extracted from the region of each detected individual. Then, visual descriptors for two persons were created. As the relative spatial positions of interacting people are likely to complement the visual descriptors, they proposed to use spatial multiple instance embedding, which implicitly integrates the distances between people into the multiple instance learning process. Experimental findings on two benchmark datasets validate that the use of two-person visual descriptors along with spatial multiple instance embedding provides an efficient way to infer the type of interaction. They achieved an accuracy of 93.3 %. Bielecki et al. [6] studied 508 male soldiers with an average age of 21 years. They divided the soldiers into two groups. Of the 354 soldiers infected before social distancing was introduced, 30 % became sick with COVID-19, while no soldier fell sick in the group of 154 in which infections occurred after social distancing had been introduced. An innovative localization method was proposed by Nadikattu et al. [7] to track humans' positions in the surroundings based on sensors. This AI smart device is not only handy for maintaining social distancing but also detects


symptoms of COVID in a person, if any. The system warns the user if anyone comes within the vital six-foot radius. Ghorai et al. [8] proposed a deep learning solution that alerts a person as soon as he or she violates social distancing. A video stream is captured from a CCTV camera, the people are detected with the PoseNet model, and the number of people present in the video stream is tracked. If the distance between the frames of two people falls below the threshold, the authorities in charge are alerted. Using deep learning techniques, a drone was proposed by Ramadass et al. [9] for inspecting social distancing and for checking whether a person is wearing a mask. The YOLOv3 algorithm, trained on a custom data set, is installed in the camera of the drone. The drone camera runs the YOLOv3 algorithm and determines whether or not social distance is maintained and whether the individuals in the crowd are wearing masks. The drone is made fit to operate automatically. Reluga [10] proposed a differential game for determining whether persons during an outbreak can use social distancing and associated self-protective behaviors. The differential game is used to study the potential value of social distancing as a mitigation tool by calculating the equilibrium behaviors under several cost functions. Numerical methods are used to calculate the total cost of an epidemic under equilibrium behaviors as a function of the time until mass vaccination, following outbreak detection. The main parameters in the study are the basic reproduction number and the baseline efficacy of social distancing. To slow the spread of the COVID-19 virus via airborne transmission, a "social distancing" approach of around 1.83 m (6 feet) has been recommended; Feng et al. [11] studied its effectiveness numerically. It was found that the wind effect on droplet transport and deposition is dynamic and highly dependent on the localized wake flow patterns and the secondary flow intensities between the two simulated humans, as well as on calm currents. A high relative humidity (RH) of 99.5 % leads to higher deposition fractions on both human bodies and the ground, which is not necessarily related to higher exposure risks. High RH = 99.5 % can enhance the condensation effect, and the cough droplet sizes keep growing during their transport in the air until the partial pressure at the droplet surface is equal to the saturation pressure of water vapor. In contrast, RH = 40 % triggers the evaporation of the water in cough droplets, thereby leading to droplet size reduction, which may lead to a long suspension time in the air. Venkateswaran et al. [12] proposed a System Dynamics (SD) model of the COVID-19 pandemic spread in India. The model is an age-structured, compartment-based approach to explore different modes of disease propagation, greatly extending the traditional SEIR approach. The model was adapted for India using the appropriate population pyramid, contact rate matrices, and external arrivals. They also specifically modelled the effects of interventions such as contact tracing, isolation of COVID-positive patients, quarantining, use of masks, better hygiene practices, and social distancing through contact rates at home, college, school, and other locations. Results of the simulation suggest that if any non-trivial number of infections is left even after a prolonged lockdown, the pandemic will resurface. Liu et al. [3] presented a method for detecting objects in images using a single deep neural network. The model, named single shot multibox detector (SSD), discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. The results on the PASCAL VOC, COCO, and ILSVRC datasets showed that SSD has competitive accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. The accuracy for SSD300 was 74.3 % and for SSD512 was 76.8 %.

3 Methodology

We loaded 295 images from the dataset, where each image had single or multiple labels inside it which were used for training the model. Further images and labels were generated using an auxiliary dataset. The auxiliary dataset consists of variations of the images in terms of rotation (+5, -5), scaling (0.95 to 1), and cropping (0.95 to 1). The dataset was then stored in two columns: the first holds the image file path and the second the corresponding label. The dataset was then split so that 60 % was used for training, 10 % for validation, and the remaining 30 % for testing the trained detectors. We used the SocialdistancingNet-19 architecture for training. Box labels were used to create the data for training and evaluation; a rectangular box was used to mark each object. The network comprises two subnetworks: feature extraction and feature detection. The feature extraction was carried out by a pre-trained convolutional neural network (CNN) model. We also used reduced ResNet-50, MobileNet-V2 and ResNet-18 networks. The detection of sub-networks of


small CNN is compared to feature extraction and is composed of a few convolutional layers specific to the YOLO object detection model. The YOLO detection model is a single-stage detector. This algorithm treats object recognition as a regression problem, taking a given input image or video stream and simultaneously learning the bounding box coordinates and the corresponding class label probabilities. YOLO has three tuning parameters: network input size, anchor boxes, and feature extraction network. First, the frame is detected. We then compute the bounding box coordinates and derive the center of the bounding box. The top-left coordinates are derived from the box coordinates. The frame is then pre-processed, giving three results: the confidence, the bounding box, and the centroid of each person. The Euclidean distance between the centroids is then calculated. The distance between the centroids of two individuals is compared with the minimum distance in terms of pixels. The pairs are marked as red or green depending on whether or not they have violated social distancing. The user specifies the input size and number of classes when choosing a network. With the minimum input size for a network, the size of the training images and the computational cost were optimized. We tried to find the best model for the input size and set of training images and to optimize it to handle larger data sets than the current one. SocialdistancingNet-19 has an architecture of 19 layers. The network is fed with an input image of dimension 224x224x3. It is then passed through convolution, batch normalization and ReLU (Rectified Linear Unit) layers, each of dimension 112x112x64. After that, it is passed through a single max pooling layer, two convolution layers, two batch normalization layers, two ReLU layers and a single addition layer, each of dimension 56x56x64. Next, it is passed through single convolution, batch normalization and ReLU layers, each of dimension 56x56x32. Then it is passed through a global average pooling layer of dimension 1x1x32. Finally, it is passed through a fully connected layer and a softmax layer, each of dimension 1x1x10, which yield the classification output. To reduce the computational cost, we used an input size of 224x224x3, which was the bare minimum size required to run any of the networks. Image resizing was the only pre-processing operation required before training. The anchor boxes used for training were then estimated so as to account for this resizing; that is, the number and sizes of the anchor boxes were estimated on the resized images. The pre-processed data was then stored directly. Activation layer 40 of ReLU (Rectified Linear Unit) is generally selected as the feature extraction layer, and we replaced the layers after this activation layer with the detection sub-network. The feature extraction layer outputs feature maps that are downsampled by a factor of 16. This amount of downsampling is a good trade-off between spatial resolution and the strength of the extracted features, since features extracted further down the network encode stronger image features at the cost of spatial resolution. Data augmentation was carried out to improve accuracy by randomly transforming the data during training. Data augmentation adds more variety during training and effectively increases the number of labelled samples in the training data. The transform augmentation used during training randomly flips the images horizontally; the associated box labels are flipped as well. Augmentation is not performed on the validation and test data, so the evaluation is unbiased because that data is unmodified. An NVIDIA 1660 GPU with 1408 CUDA cores, 6 GB of GDDR5 memory and a 192-bit memory bus was used to train the network.

4 Results and discussion

The accuracy of the developed model, SocialdistancingNet-19, was 92.8 %. The accuracy of the ResNet-50 network was 86.5 %, and that of ResNet-18 was 85.3 %. We tested our model using a video stream and images, in which people were detected properly according to the distance between each pair. The frames were labelled as safe or unsafe accordingly. The violations were also counted, and the count was updated continuously. When using the webcam, people need to be moving continuously, otherwise the detection becomes incorrect. This may be due to the detection method, in which the entire frame is detected first and the distance between the centroids is calculated and compared afterwards. The results obtained by the model are displayed in Fig. 4. The purple and green frames displayed along with the labels indicate whether or not the person is maintaining social distancing. Table 1 compares the accuracies of the different models that we tested or found in the literature. The maximum accuracy was 93.3 % and the minimum was 74.3 %.

5 Conclusions

Our work distinguishes the social distancing pattern and classifies them as a violation of social distancing or


maintaining the social distancing norm. Additionally, it also displays labels as per the object detection. The classifier was then implemented for live video streams and images also. This system can be used in CCTV for surveillance of people during pandemics. Mass screening is possible and hence can be used in crowded places like railway stations, bus stops, markets, streets, mall entrances, schools, colleges, etc. By monitoring the distance between two individuals, we can make sure that an individual is maintaining social distancing in the right way, which will enable us to curb the virus.

6 Acknowledgement

Authors would like to thank all colleagues from COVID research group.

Compliance with Ethical Standards

Conflicts of interest

Authors R. Keniya and N. Mehendale declare that they have no conflict of interest.

Involvement of human participant and animals

This article does not contain any studies with animals or humans performed by any of the authors. All the necessary permissions were obtained from the Institute Ethical Committee and concerned authorities.

Information about informed consent

No informed consent was required as the studies do not involve any human participant.

Funding information

No funding was involved in the present work.

References

1. S. Yadav, Deep learning based safe social distancing and face mask detection in public areas for covid-19 safety guidelines adherence
2. F. Sener, N. Ikizler-Cinbis, Two-person interaction recognition via spatial multiple instance embedding, Journal of Visual Communication and Image Representation 32, 63 (2015)
3. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.Y. Fu, A.C. Berg, in European conference on computer vision (Springer, 2016), pp. 21–37
4. N. Singh Punn, S.K. Sonbhadra, S. Agarwal, Monitoring covid-19 social distancing with person detection and tracking via fine-tuned yolo v3 and deepsort techniques, arXiv pp. arXiv–2005 (2020)
5. D. Yang, E. Yurtsever, V. Renganathan, K.A. Redmill, U. Ozgüner, A vision-based social distancing and critical density detection system for covid-19, arXiv e-prints pp. arXiv–2007 (2020)
6. M. Bielecki, R. Züst, D. Siegrist, D. Meyerhofer, G.A.G. Crameri, Z.G. Stanga, A. Stettbacher, T.W. Buehrer, J.W. Deuel, Social distancing alters the clinical course of covid-19 in young adults: A comparative cohort study, Clinical Infectious Diseases (2020)
7. R.R. Nadikattu, S.M. Mohammad, P. Whig, Novel economical social distancing smart device for covid-19, International Journal of Electrical Engineering and Technology (IJEET) (2020)
8. A. Ghorai, S. Gawde, D. Kalbande, Digital solution for enforcing social distancing, Available at SSRN 3614898 (2020)
9. L. Ramadass, S. Arunachalam, Z. Sagayasree, Applying deep learning algorithm to maintain social distance in public place through drone technology, International Journal of Pervasive Computing and Communications (2020)
10. T.C. Reluga, Game theory of social distancing in response to an epidemic, PLoS Comput Biol 6(5), e1000793 (2010)
11. Y. Feng, T. Marchal, T. Sperry, H. Yi, Influence of wind and relative humidity on the social distancing effectiveness to prevent covid-19 airborne transmission: A numerical study, Journal of aerosol science p. 105585 (2020)
12. J. Venkateswaran, O. Damani, Effectiveness of testing, tracing, social distancing and hygiene in tackling covid-19 in india: A system dynamics model, arXiv preprint arXiv:2004.08859 (2020)
