International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 06 | June 2020 www.irjet.net p-ISSN: 2395-0072
Human Suspicious Activity Detection using Deep Learning

Rachana Gugale¹, Abhiruchi Shendkar², Arisha Chamadia³, Swati Patra⁴, Deepali Ahir⁵
¹,²,³,⁴Student, Department of Computer Engineering, M. E. S. College of Engineering, Pune, India
⁵Assistant Professor, Department of Computer Engineering, M. E. S. College of Engineering
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Detecting suspicious activities in public places has become an important task due to the increasing number of shootings, knife attacks, terrorist attacks, etc. happening in public places all around the world. This paper focuses on a deep learning approach to detect suspicious activities from images and videos using Convolutional Neural Networks (CNNs). We analyze different CNN architectures and compare their accuracy. We present the architecture of our system, which can process video footage from cameras in real time and predict whether the activity is suspicious or not. We also propose future developments which can be made in this area of suspicious activity detection.

Key Words: Suspicious Activity Detection, Convolutional Neural Networks, FastAI, Deep Learning

1. INTRODUCTION

Suspicious human activity recognition from surveillance video is an active research area of image processing and computer vision. Through visual surveillance, human activities can be monitored in sensitive and public areas such as bus stations, railway stations, airports, banks, shopping malls, schools and colleges, parking lots, roads, etc. to prevent terrorism, theft, accidents, illegal parking, vandalism, fighting, chain snatching, crime and other suspicious activities. It is very difficult to watch public places continuously; therefore, an intelligent video surveillance system is required that can monitor human activities, categorize them as usual or unusual, and generate an alert.

1.1 Previous Approaches

For detecting suspicious human activity, it is important for the model to learn suspicious human poses. Human pose estimation is one of the key problems in computer vision and has been studied for more than 15 years. It involves identifying human body parts and possibly tracking their movements. It is used in AR/VR, gesture recognition, gaming consoles, etc. Initially, low-cost depth sensors (motion sensors) were used to track human movement in gaming consoles. However, these sensors are limited to indoor use, and their low resolution and noisy depth information make it difficult to estimate the ongoing human activity from depth images. Hence, they are not a suitable option for suspicious activity detection.

Models like OpenPose[1] and PoseNet[2] output the keypoint coordinates of the people in an image or video in real time. But keypoints alone, without any information about the background or surrounding objects, are not enough to decide if an activity is suspicious. So, we use a CNN approach in our system instead of a keypoint-based approach.

2. METHODOLOGY

The first step was to decide which suspicious activities to focus on. We selected 5 suspicious activities to classify: shooting, punching, kicking, knife attack and sword fight. These 5 activities formed 5 classes for our classifier model. The non-suspicious activities were put in a 6th class.

The next step was to collect data for each of the classes. Images were scraped from Google Images using a JavaScript code snippet. Once we had collected enough images, we manually filtered out the irrelevant ones. This process was repeated for each of the 6 classes. The total number of images in our dataset is 17,716.

Once we had our data, we started the process of model selection. After researching neural network architectures and which ones suit real-time tasks, we decided to use ResNet. We decided to experiment with ResNet-18, ResNet-34 and ResNet-50. The numbers here stand for the number of layers in the model architecture.

For training and evaluating the model, we used the deep learning framework FastAI[3],[4], which is based on PyTorch[5]. FastAI is organized around two main design goals: to be approachable and rapidly productive, while also being deeply hackable and configurable. It has the clarity and development speed of Keras[6] and the customizability of PyTorch. This goal of getting the best of both worlds has motivated the design of a layered architecture for FastAI. A high-level API powers ready-to-use functions to train models in various applications, offering customizable models with sensible defaults. The FastAI APIs choose intelligent default values and behaviors based on all available information. For instance, FastAI provides a single Learner class which brings together architecture, optimizer, and data, and automatically chooses an appropriate loss function where possible. The use of intelligent defaults, based on the FastAI creators’ experience or best practices, extends to incorporating state-of-the-art research wherever possible. For instance, transfer learning is critically important for training models quickly, accurately, and cheaply, but the details matter a great deal. FastAI
automatically provides transfer learning, optimized batch-normalization training, layer freezing, and discriminative learning rates. In general, the library’s use of integrated defaults means it requires fewer lines of code from the user to re-specify information or merely to connect components. As a result, every line of user code tends to be more likely to be meaningful, and easier to read.

FastAI APIs were used to divide the dataset into training and validation sets. 20% of the data was used for validation while the remaining 80% was used for training.

3. SYSTEM ARCHITECTURE

The system is divided into 2 phases: training and deployment. During the training phase, the ResNet model is trained with our custom dataset. We divide the dataset into training and validation sets. The validation set contains 20% of the images, chosen randomly from the dataset.

After the training phase, the model is deployed on computer systems used by the security teams in public places. Our system is a desktop application which can take as input a live feed from a camera or a video already stored on the computer. This video is then preprocessed (which involves breaking the video into frames) and fed into the ResNet-50 model. The model outputs whether the video contains any suspicious activity. If a suspicious activity is detected in the video, an alert is immediately generated on the system and an email alert is sent to the registered email address, along with pictures from the video in which the suspicious activity is ongoing.
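
The deployment loop described above (frames in, a class label per frame, an alert when a suspicious class appears) can be sketched roughly as follows. This is an illustration under stated assumptions, not the deployed application: the label strings, the function name `scan_frames`, and the `classify` and `on_alert` callables are hypothetical stand-ins for the trained ResNet-50 and for the desktop and email alerts.

```python
from typing import Callable, Iterable, List, Tuple

# Class names follow Section 2; the exact label strings are an assumption.
SUSPICIOUS = {"shooting", "punching", "kicking", "knife_attack", "sword_fight"}
NORMAL = "non_suspicious"


def scan_frames(
    frames: Iterable[object],
    classify: Callable[[object], str],
    on_alert: Callable[[int, str], None],
) -> List[Tuple[int, str]]:
    """Classify every frame and report each one labelled as suspicious.

    `classify` stands in for the trained model (hypothetical interface:
    frame in, class label out). `on_alert` stands in for the on-screen
    alert and the email notification.
    """
    flagged = []
    for index, frame in enumerate(frames):
        label = classify(frame)
        if label in SUSPICIOUS:
            flagged.append((index, label))
            on_alert(index, label)
    return flagged


if __name__ == "__main__":
    # Toy stand-ins: each "frame" is already its label,
    # so the classifier is simply the identity function.
    video = [NORMAL, NORMAL, "kicking", "kicking", NORMAL]
    hits = scan_frames(
        video,
        classify=lambda frame: frame,
        on_alert=lambda i, label: print(f"ALERT at frame {i}: {label}"),
    )
    print(hits)
```

Keeping the classifier and the alert as plain callables means the same loop could be fed by a live camera or a stored video, and the flagged frame indices identify which pictures to attach to the email.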
Figure 1: System Architecture
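
For the training phase, the random 80/20 split can be illustrated with a short standard-library sketch. FastAI's data loaders expose this through a `valid_pct` argument; the code below does not call FastAI and only mirrors the idea. The function name `split_dataset`, the seed, and the file names are hypothetical; only the image count (17,716) and the 20% validation share come from the text.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[str], valid_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[str], List[str]]:
    """Randomly split items into (training, validation) lists."""
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be between 0 and 1")
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


if __name__ == "__main__":
    # Hypothetical file names; only the count matches the dataset size.
    images = [f"img_{i:05d}.jpg" for i in range(17716)]
    train, valid = split_dataset(images)
    print(len(train), len(valid))
```

With 17,716 images and truncating rounding, this sketch assigns 3,543 images to validation and 14,173 to training. The rounding used in the actual experiments is not stated, so the real counts may differ by one.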
4. RESULTS

Suspicious activity detection has become an important area of study due to the increasing number of crimes. We studied the previous approaches and offered an alternative approach to detect suspicious activities happening in public places. Our approach used a CNN to determine whether an activity was suspicious. The ResNet architecture was used to build the CNN model. We tried ResNet-18, ResNet-34 and ResNet-50. We also trained our model both with the default learning rate and with a learning rate in the range of 3×10^(-5) to 3×10^(-4). The following results were obtained for each architecture:

Figure 2: Accuracy Comparison Graph

Figure 3: Obtained Accuracies

Figure 4: Obtained Confusion Matrix

5. CONCLUSION

According to the results obtained by the above architectures, we conclude that ResNet-50 works the best for this task. A learning rate in the range of 3×10^(-5) to 3×10^(-4) also works the best.

ACKNOWLEDGEMENTS

It gives us great pleasure and satisfaction to have worked on “Human Suspicious Activity Detection using Deep Learning”. We are thankful for, and fortunate to have received, constant encouragement, support, and guidance from our guide Prof. Deepali Ahir. She encouraged us to keep moving forward under her guidance and vigilant support. We would also like to extend our sincere thanks to all friends and family for their motivation in times when we hit a wall. We would like to thank all those who have directly or indirectly helped us complete the work during this project.

FUTURE AREAS OF RESEARCH

Our model currently targets only 5 suspicious activities. It can be further improved by targeting a greater number of suspicious activities. More images can be added to the current dataset, especially images extracted from CCTV footage of suspicious activity. Such footage is currently difficult for us to obtain as students, but if this project is supported by the civic administration, they could provide footage of criminal activities which have happened in the past. This will vastly help in improving the model.

REFERENCES

[1] Cao, Zhe, Gines Hidalgo, Tomas Simon, Shih-En Wei, and Yaser Sheikh. 2018. "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields." arXiv.
[2] Kendall, Alex, Matthew Grimes, and Roberto Cipolla. 2015. "PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization." arXiv.
[3] Howard, Jeremy, and Sylvain Gugger. 2020. Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD. O’Reilly Media, Inc.
[4] Howard, Jeremy, and Sylvain Gugger. 2020. "fastai: A Layered API for Deep Learning." arXiv.
[5] Paszke, Adam, Sam Gross, and Others. 2019. "PyTorch: An Imperative Style, High-Performance Deep Learning Library." In Advances in Neural Information Processing Systems 32, 8024-8035. Curran Associates, Inc.
[6] Chollet, François. 2015. "Keras." GitHub repository. https://github.com/fchollet/keras.
