Human Activity Recognition Using CNN

Submitted By:
Kaniz Fatema
ID:153402325
A final year project report submitted to the City University in partial fulfillment of
the requirements of the Degree of Bachelor of Computer Science and Engineering
Supervised By:
Sadia Islam
Lecturer Dept. of CSE
City University
CITY UNIVERSITY
DHAKA, BANGLADESH
Dec 2020
Certification
This is to certify that the work presented in this project entitled “Human Activity
Recognition” is the outcome of the work done by Kaniz Fatema under the
supervision of Sadia Islam, Lecturer, Department of Computer Science and
Engineering, City University, Dhaka, Bangladesh during 15 July to 24 Dec 2020. It
is also declared that neither this project/report nor any part of it has been submitted
or is being currently submitted anywhere else for the award of any degree or
diploma.
Approved By:
--------------------------------- --------
Supervisor :
Sadia Islam
Lecturer Dept. of CSE
City University
Declaration
This is to certify that the work presented in this project entitled “Human Activity
Recognition using CNN” is the outcome of the work done by Kaniz Fatema under
the supervision of Sadia Islam, Lecturer, Department of Computer Science and
Engineering, City University, Dhaka, Bangladesh during 15 July to 24 Dec 2020. It
is also declared that neither this project/report nor any part of it has been submitted
or is being currently submitted anywhere else for the award of any degree or
diploma.
………………………..
Kaniz Fatema
ID:153402325
Batch: 40th
City University,City Campus
Dhaka, Bangladesh
Acknowledgement
Project development is not an easy task. It requires the co-operation and help of
various people. Words often run out when one is truly thankful and sincerely
wants to express gratitude towards those who helped in the completion of the
project. I am deeply indebted to my supervisor Sadia Islam, Lecturer, Department
of Computer Science & Engineering, City University, Dhaka, Bangladesh. Without
this help, guidance, sympathetic co-operation, stimulating suggestions and
encouragement, the planning and development of this project would have been very
difficult. My special thanks go to the Head of the Department of CSE, Md. Safaet
Hossain, who gave permission and encouraged me to go ahead. I am also indebted
to the Honorable Dean of the Science Faculty, Prof. Dr. Md. Shawkut Ali Khan, for
his endless support. I am very grateful to all my faculty members, who gave me
their valuable guidance to complete my graduation. I am also very grateful to all
those people who have helped me to complete my project.
Kaniz Fatema
ID:153402325
Batch: 40th
City University,City Campus
Dhaka, Bangladesh
Abstract
We construct a Convolutional Neural Network (CNN) to identify human activities
using data collected from Google Images. The daily human activities chosen for
recognition are walking, jogging, sitting, standing, going upstairs, and going
downstairs. The images are used directly as input for training the CNN without
any complex preprocessing. The data were collected through Google Images from
sites such as Getty Images, iStock Photos, and Dreamstime. The data are
preprocessed, labeled, and split into training and test sets. We then apply
classical algorithms (Random Forest, SVM, Decision Tree, Naïve Bayes, Logistic
Regression, and KNN), followed by the CNN, which produces the final output of
Human Activity Recognition.
Table of Contents
Certification
Declaration
Acknowledgement
Abstract
Table of Contents
Chapter 1: Introduction
1.1 Human Activity Recognition
1.2 Applications of HAR
1.3 Organization of this Report
1.4 Idea and Concept
1.5 Background and Motivation
1.6 Scope and Objective
1.7 Features
Chapter 2: Literature Review
2.1 Detection
2.2 Classification
2.3 Related work
2.4 Comparison with existing work
Chapter 3: Algorithm and Techniques
3.1 Introduction
3.2 Classical ML algorithm
3.3 Logistic Regression
3.3.1.1 Support Vector Machine
3.3.1.2 Naïve Bayes
3.3.1.3 K-Nearest Neighbor
3.3.1.4 Random Forest
3.3.1.5 Proposed CNN model
3.3.2 CNN process
3.3.2.1 CNN with multi label image classification
3.3.2.2 System model
3.3.2.3 Segmentation
3.3.3 Binarization
3.4 Trim & Segmentation
Chapter 4: Train & Test Reviews
4.1 Introduction
4.2 Human Activity Recognition algorithms training & testing
4.3 Classification algorithms Evaluation
4.3.1 Logistic Regression
4.3.2 Support Vector Machine
4.3.3 Naïve Bayes
4.4 K-nearest Neighbor
4.4.1 Random Forest
4.4.2 Decision Tree
4.4.3 Proposed CNN model
Chapter 5: Proposed Scheme
5.1 Introduction
5.2 Detection of HAR
5.3 Benefits
5.4 Limitations
Chapter 6: Implementation
6.1 Implementation
6.2 Implementation and Result
Chapter 7: Conclusion
7.1 Summary
7.2 Suggestions for future work
References
Chapter 1
Introduction
1.1 Human Activity Recognition
Human activity recognition plays a significant role in human-to-human interaction
and interpersonal relations. Because it provides information about the identity of a
person, their personality, and psychological state, it is difficult to extract. The
human ability to recognize another person’s activities is one of the main subjects of
study of the scientific areas of computer vision and machine learning. As a result
of this research, many applications, including video surveillance systems,
human-computer interaction, and robotics for human behavior characterization,
require a multiple activity recognition system.
To detect human activity, this project uses a convolutional neural network. In the
deep neural network, the extracted features pass through Flatten, Dense, and
Activation layers, which then produce the predicted activity as output.
Among various classification techniques two main questions arise: “What
action?” (i.e., the recognition problem) and “Where in the video?” (i.e., the
localization problem).
When attempting to recognize human activities, one must determine the
kinetic states of a person, so that the computer can efficiently recognize this
activity. Human activities, such as “walking” and “running,” arise very
naturally in daily life and are relatively easy to recognize. On the other hand,
more complex activities, such as “peeling an apple,” are more difficult to
identify. Complex activities may be decomposed into other simpler activities,
which are generally easier to recognize. Usually, the detection of objects in a
scene may help to better understand human activities as it may provide useful
information about the ongoing event (Gupta and Davis, 2007).
[Figure: example images of the six activities: (a) Downstairs, (b) Jogging, (c) Sitting, (d) Standing, (e) Upstairs, (f) Walking]
Most of the work in human activity recognition assumes a figure-centric scene of
uncluttered background, where the actor is free to perform an activity. The
development of a fully automated human activity recognition system, capable of
classifying a person’s activities with low error, is a challenging task due to
problems such as background clutter, partial occlusion, changes in scale,
viewpoint, lighting and appearance, and frame resolution. In addition, annotating
behavioral roles is time consuming and requires knowledge of the specific event.
Moreover, intra- and interclass similarities make the problem amply challenging.
That is, actions within the same class may be expressed by different people with
different body movements, and actions between different classes may be difficult
to distinguish as they may be represented by similar information. The way that
humans perform an activity depends on their habits, and this makes the
underlying activity quite difficult to identify. Also, the construction of a visual
model for learning and analyzing human movements in real time, with inadequate
benchmark datasets for evaluation, is a challenging task.
To overcome these problems, a task is required that consists of three components,
namely: (i) background subtraction (Elgammal et al., 2002; Mumtaz et al., 2014),
in which the system attempts to separate the parts of the image that are invariant
over time (background) from the objects that are moving or changing (foreground);
(ii) human tracking, in which the system locates human motion over time (Liu et
al., 2010; Wang et al., 2013; Yan et al., 2014); and (iii) human action and object
detection (Pirsiavash and Ramanan, 2012; Gan et al., 2015; Jainy et al., 2015), in
which the system is able to localize a human activity in an image.
1.2 Applications of HAR
During the last decade, there was significant growth in the number of
publications in the field of HAR; in particular, many researchers have proposed
application domains to identify specific activity types or behaviors to reach
specific goals in these domains. This section focuses on state-of-the-art
applications that use HAR methodologies to assist humans. This review, in
particular, discusses the application of current activity recognition (AR)
approaches to ambient assisted living (AAL) for smart homes, healthcare
monitoring solutions, security and surveillance applications, and tele-immersion
(TI) applications; these approaches are further classified by the observation
methodology used for recognizing human behavior, namely, into approaches based
on visual, non-visual, and multimodal sensor technology.
Fig 1.2: Human Activity Recognition: (a) Downstairs (b) Jogging
Similarly, HERMES (http://www.fp7-hermes.eu/) aimed at providing cognitive
care for people who are suffering from mild memory problems. This is achieved
by combining the functional skills of a particular person with his or her
age-related cognitive incapabilities and assisting them when necessary. HERMES
used visual and audio processing techniques, speech recognition, speaker
identification, and face detection techniques to guide people. In a similar
manner, the universAAL (http://www.universaal.org/index.php/en/) open
platform and reference specification for AAL was introduced to technically
produce cheaper products that are simple, configurable, and easily deployable in
a smart home to provide useful “AAL services.”
Moreover, the smart home system proposed by the Center for Advanced Studies in
Adaptive Systems (CASAS) used machine learning and data mining
techniques to analyze the daily activity patterns of a resident and to generate
automation policies based on the identified patterns to support the residents.
Automation policies were used to assist elderly individuals in their urgent needs.
1.3 Idea and Concept
• Human activity recognition can support human-computer interaction.
• It can also be used in video surveillance systems.
• Investigators can use it to detect criminal offenses.
• It can be used for monitoring in healthcare systems.
1.4 Background and Motivation

Our main motivation for building this project is to help both investigators and
police officers. This project can also be important in healthcare systems, where
activity monitoring may help reduce the risk of diseases such as cardiovascular
disease, obesity, and diabetes. Due to the recent outstanding performance of
artificial neural networks in human activity recognition, this work aims to
investigate the role of movement activity and its combination for automatic human
activity detection, analysis, and recognition using a convolutional neural network.
The system detects human physical activity and recognizes a person’s actual
behavior.
1.5 Scope & Objectives
• Build a model based on user activities that helps detect criminal offenses.
• Support business applications for targeted investigators.
• Support healthcare systems for targeted patients, to help reduce the risk of
many diseases.
• Determine the kinetic states of a person.
1.6 Features
• Human-human interaction and interpersonal relations.
• Identifying a person, their personality, and psychological state.
• Video surveillance systems.
• Human-computer interaction to identify the behaviors of a person who
encounters the system.
• Robotics for human behavior characterization.
• Helping to stop criminal offenses based on physical activity.
• Enhancing healthcare systems for targeted patients.
• Large amounts of data prepared manually for training and testing.
1.7 Organization of this Report

The rest of the report is organized as follows:
• In Chapter 2, we review the literature related to HAR.
• In Chapter 3, we briefly describe the supporting technologies and algorithms
that we use in our proposed scheme for HAR.
• In Chapter 4, we train and test all the algorithms and evaluate them.
• In Chapter 5, we propose the scheme that we use for our final work.
• In Chapter 6, we present the implementation of this project.
• In Chapter 7, we give some concluding remarks and suggestions for future
work.
Chapter 2
Literature review
2.1 Detection
Human activity recognition, or HAR for short, is a broad field of study concerned
with identifying the specific movement or action of a person based on sensor data.
Movements are often typical activities performed indoors, such as walking, talking,
standing, and sitting. They may also be more focused activities such as those types
of activities performed in a kitchen or on a factory floor.
The sensor data may be remotely recorded, such as video, radar, or other wireless
methods. Alternately, data may be recorded directly on the subject such as by
carrying custom hardware or smart phones that have accelerometers and
gyroscopes.
FNN-based Human Activity Recognition: the algorithms the authors use include Sobel
edge detection, highlighting the licence plate region using image morphology,
horizontal and vertical extraction, and the Otsu method. Detection accuracy: 89.7% [4].
2.2 Classification
Given a training set of labeled observation sequences (features extracted from the
acceleration readings in the x, y, and z axes from three accelerometers placed on
different parts of the body), corresponding to each of the activities that we want to
classify, we first want to estimate the model parameters $\lambda = (A, B, \pi)$,
where $A = \{a_{ij}\}$ is the state transition probability distribution,
$B = \{b_j(k)\}$ is the observation symbol probability distribution in state $j$,
and $\pi = \{\pi_i\}$ is the initial
state distribution for each activity. Given a new set of observations we would like
to classify each sequence according to the model that gives the maximum
likelihood for that particular sequence.
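The maximum-likelihood decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the cited work: it assumes discrete observation symbols, and the two toy models and the names `sequence_likelihood` and `classify` are hypothetical.

```python
def sequence_likelihood(obs, A, B, pi):
    """Forward algorithm: P(obs | lambda) for a discrete HMM lambda = (A, B, pi)."""
    n_states = len(pi)
    # alpha[j] = P(o_1 .. o_t, state at time t is j)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][symbol]
            for j in range(n_states)
        ]
    return sum(alpha)


def classify(obs, models):
    """Return the label whose model gives the maximum likelihood for obs."""
    return max(models, key=lambda label: sequence_likelihood(obs, *models[label]))


# Two hypothetical 2-state models over a binary symbol alphabet {0, 1}.
# Each entry is (A, B, pi); the numbers are made up for illustration.
models = {
    "sitting": ([[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]], [0.5, 0.5]),
    "walking": ([[0.5, 0.5], [0.5, 0.5]], [[0.2, 0.8], [0.3, 0.7]], [0.5, 0.5]),
}

print(classify([0, 0, 0, 0, 0], models))
print(classify([1, 1, 1, 1, 1], models))
```

For long sequences a practical implementation would work with log-likelihoods to avoid numerical underflow; the plain probabilities are kept here for readability.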
We modeled each observation sequence as a 5-state left-to-right HMM with
continuous Gaussian observation vectors and two hidden states. Each observation
vector was formed by combining the mean and variance in the x, y, and z axes from
each accelerometer. These features were previously used in [2]. An HMM was
trained for each class $(\lambda_1, \lambda_2, \ldots, \lambda_C)$, where
$\lambda_c$ indicates the learned HMM model for class $c$, and $C = 8$ is the total
number of classes, using the labeled data from eight datasets as training dataset
$T_k$. The ninth data set was used as the validation dataset $V_k$. The HMM
toolbox for Matlab developed by [7] was used to train and test the different
models. The log likelihood of each model was calculated for each observation
sequence in the ninth dataset. Each observation sequence
$O^l = \{O^l_1 O^l_2 \ldots O^l_T\}$ (with $T = 5$ time slices) in the validation
dataset $V_k = \{O^l\}_{l=1}^{L}$ was classified according to the model that gave
the maximum likelihood.
Comparing classification accuracy when a single accelerometer was used for
activity classification, we are able to discern activities such as walking (65.68%)
and performing hand movements (56.30%) using only accelerometer A1 (right
wrist). Accelerometer A2 (left hip) played the most important role when
classifying activities such as sitting (66.05%), running (97.78%), crawling
(69.26%), and lying down (87.04%). Accelerometer A3 (chest) was best for
classifying activities such as squatting (75.8%) and standing (77.78%).
2.3 Related Work

• Deep Learning based Human Activity Recognition
• LSTM and RNN based Human Activity Recognition
• FNN based Human Activity Recognition
• SVM based Human Activity Recognition
• Decision Tree based Human Activity Recognition
• Random Forest based Human Activity Recognition
• Logistic Regression based Human Activity Recognition
• Naïve Bayes based Human Activity Recognition
• KNN based Human Activity Recognition
2.4 Comparison with existing work
Name | Method used | Accuracy | Real-world application | My project | My project accuracy
1. Human Activity Recognition using FNN | FNN | 89.7% | Yes | No | No
2. Human Activity Recognition using SVM | SVM | 85.2% | Yes | Yes | 91.13%
3. Human Activity Recognition using DT | Decision Tree | 82.7% | Yes | Yes | 90.80%
4. Human Activity Recognition using KNN | KNN | 87.7% | Yes | Yes | 90.64%
5. Human Activity Recognition using Logistic Regression | Logistic Regression | 94.5% | Yes | Yes | 9.46%
6. Human Activity Recognition using Naïve Bayes | Naïve Bayes | 82.7% | Yes | Yes | 9.36%
Chapter 3
Algorithm & Techniques
3.1 Introduction
We try a number of existing techniques and algorithms in our project. For
activity recognition, we try KNN, Naïve Bayes, Decision Tree, SVM, Random
Forest, and Logistic Regression. We also use several techniques in the
classification process: the segmentation process has several stages, and for
classification we try a neural network architecture as well as the traditional
machine learning algorithm KNN.
3.2 Classical ML algorithm

In this section we discuss six classical classification ML algorithms.
3.3 Logistic Regression

Logistic Regression is a classification algorithm, not a regression algorithm. It
estimates discrete values (binary values like 0/1, yes/no, true/false) based on a
given set of independent variable(s). Simply put, it predicts the probability of
occurrence of an event by fitting data to a logit function. Hence, it is also known
as logit regression. Since it predicts a probability, its output values always lie
between 0 and 1.
$$\text{odds} = \frac{p}{1-p} = \frac{\text{probability of event occurrence}}{\text{probability of event non-occurrence}}$$
$$\ln(\text{odds}) = \ln\left(\frac{p}{1-p}\right) = \operatorname{logit}(p) = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \cdots + b_k X_k$$
In the equation given above, p is the probability of the presence of the
characteristic of interest. It chooses parameters that maximize the likelihood of
observing the sample values rather than that minimize the sum of squared
errors (like in ordinary regression).
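The relationship between the logit and the predicted probability can be checked numerically. The following is a minimal sketch, assuming made-up coefficients; the function names `logit` and `predict_probability` are hypothetical and are not taken from the project code.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def predict_probability(x, b0, b):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk)))."""
    z = b0 + sum(bk * xk for bk, xk in zip(b, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and feature values, for illustration only.
p = predict_probability([1.0, 2.0], b0=0.5, b=[0.25, -0.5])
print(p)         # a probability strictly between 0 and 1
print(logit(p))  # recovers the linear term b0 + b1*x1 + b2*x2
```

Applying `logit` to the predicted probability returns the linear combination of the inputs, which is exactly the equation given above.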
Fig 3.1: plotting graph of logistic regression
3.3.1.1 Support Vector Machine

In this algorithm, we plot each data item as a point in n-dimensional space
(where n is the number of features we have), with the value of each feature being
the value of a particular coordinate. For example, if we only had two features,
such as the height and hair length of an individual, we would first plot these two
variables in two-dimensional space, where each point has two coordinates (the
points that lie closest to the separating boundary are known as support vectors).
Fig 3.2: Hair data plotting(1)
Now, we find a line that splits the data between the two differently classified
groups of data. This is the line for which the distance to the closest point in
each of the two groups is as large as possible.
Fig 3.3: Hair data plotting(2)

In the example shown above, the line that splits the data into two differently
classified groups is the blue line, since the two closest points are the farthest
apart from the line. This line is our classifier. New test data are then assigned
to a class depending on which side of the line they fall.
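To make the geometric picture concrete, the sketch below classifies a point by the side of a line it falls on and measures the margin, that is, the distance from the line to the closest point. It does not train an SVM: the line parameters `w` and `b` and the sample points are hypothetical values chosen by hand.

```python
import math


def signed_distance(point, w, b):
    """Signed distance from a point to the line w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, point)) + b) / norm


def classify(point, w, b):
    """Assign class +1 or -1 depending on the side of the line."""
    return 1 if signed_distance(point, w, b) >= 0 else -1


def margin(points, w, b):
    """Distance from the line to the closest point (the quantity an SVM maximizes)."""
    return min(abs(signed_distance(p, w, b)) for p in points)


# Hypothetical separating line x1 = 5 and three sample points.
w, b = [1.0, 0.0], -5.0
points = [(8.0, 1.0), (2.0, 4.0), (6.0, 0.0)]
print([classify(p, w, b) for p in points])
print(margin(points, w, b))
```

A trained SVM would search over `w` and `b` for the line with the largest margin; here the line is fixed so that only the decision rule is shown.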
3.3.1.2 Naïve Bayes

This is a classification technique based on Bayes’ theorem with an assumption of
independence between predictors. In simple terms, a Naive Bayes classifier
assumes that the presence of a particular feature in a class is unrelated to the
presence of any other feature.
For example, a fruit may be considered to be an apple if it is red, round, and
about 3 inches in diameter. Even if these features depend on each other or upon
the existence of the other features, a Naive Bayes Classifier would consider all
of these properties to independently contribute to the probability that this fruit
is an apple. A Naive Bayes model is simple to build and particularly useful for
very large data sets. Despite its simplicity, Naive Bayes can sometimes
outperform more sophisticated classification methods. Bayes’ theorem provides a
way of calculating the posterior probability P(c|x) from P(c), P(x) and P(x|c).
The expression for the posterior probability is as follows:
$$P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}$$
Here,
• P(c|x) is the posterior probability of class (target) given predictor (attribute).
• P(c) is the prior probability of class.
• P(x|c) is the likelihood, which is the probability of predictor given class.
• P(x) is the prior probability of predictor.
Example: Let’s work through an example to understand this better. Here we have
a training data set of weather conditions (sunny, overcast, and rainy) and a
corresponding binary variable ‘Play’. We need to classify whether players will play
or not based on the weather condition. The steps are as follows.
Step 1: Convert the data set to a frequency table.
Step 2: Create a likelihood table by finding the probabilities, e.g., the probability
of Overcast is 0.29 and the probability of playing is 0.64.
Step 3: Use the Naive Bayes equation to calculate the posterior probability
for each class. The class with the highest posterior probability is the outcome of
the prediction.
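The three steps can be traced with a short script. The frequency counts below are an assumption: they are the counts commonly used with this weather example, chosen because they reproduce the two probabilities quoted in Step 2 (0.29 for Overcast and 0.64 for playing); the report's own frequency table is not reproduced in the text.

```python
# Step 1: frequency table (assumed counts, 14 observations in total).
counts = {
    "Sunny": {"Yes": 3, "No": 2},
    "Overcast": {"Yes": 4, "No": 0},
    "Rainy": {"Yes": 2, "No": 3},
}
total = sum(sum(row.values()) for row in counts.values())


# Step 2: likelihood table entries.
def prior_class(c):
    """P(c): fraction of all observations with Play == c."""
    return sum(row[c] for row in counts.values()) / total


def prior_weather(w):
    """P(x): fraction of all observations with this weather."""
    return sum(counts[w].values()) / total


def likelihood(w, c):
    """P(x | c): fraction of class-c observations with this weather."""
    return counts[w][c] / sum(row[c] for row in counts.values())


# Step 3: posterior P(c | x) = P(x | c) * P(c) / P(x).
def posterior(c, w):
    return likelihood(w, c) * prior_class(c) / prior_weather(w)


print(round(prior_weather("Overcast"), 2), round(prior_class("Yes"), 2))
print(posterior("Yes", "Sunny"), posterior("No", "Sunny"))
```

With these counts the posterior for playing on a sunny day is larger than the posterior for not playing, so the predicted class is "Yes".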
Table 3.1: Frequency table of weather data
3.3.1.3 KNN(K-Nearest Neighbor)

K-nearest neighbors is a simple algorithm used for both classification and
regression problems. It stores all available cases and classifies new cases by a
majority vote of their k neighbors: a case is assigned to the class most common
among its K nearest neighbors, as measured by a distance function (Euclidean,
Manhattan, Minkowski, or Hamming). The first three distance functions are used
for continuous variables, while the Hamming distance is used for categorical
variables. If K = 1, the case is simply assigned to the class of its nearest
neighbor. Choosing K can be a challenge when performing KNN modeling.
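The majority-vote rule and the distance functions mentioned above can be sketched as follows. This is an illustrative toy, not the project's classifier: the two-dimensional feature vectors and labels are made up, and `knn_predict` is a hypothetical name.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    """Number of positions that differ; suited to categorical features."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=euclidean):
    """train is a list of (features, label) pairs; returns the majority label."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up training cases with two numeric features each.
train = [
    ((1.0, 1.0), "sitting"), ((1.2, 0.8), "sitting"), ((0.9, 1.1), "sitting"),
    ((5.0, 5.0), "jogging"), ((5.2, 4.9), "jogging"), ((4.8, 5.1), "jogging"),
]
print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.0, 5.0), k=1, distance=manhattan))
```

With k = 1 the query simply takes the label of its single nearest neighbor, as described in the text.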
Fig 2.10: Data plotting on KNN
3.3.1.4 Random Forest
Random forests or random decision forests are an ensemble learning method for
classification, regression and other tasks that operate by constructing a multitude of
decision trees at training time and outputting the class that is the mode of the
classes (classification) or mean prediction (regression) of the individual trees.
Random decision forests correct for decision trees' habit of overfitting to their
training set. The first algorithm for random decision forests was created by Tin
Kam Ho using the random subspace method, which, in Ho's formulation, is a way
to implement the "stochastic discrimination" approach to classification proposed
by Eugene Kleinberg. An extension of the algorithm was developed by Leo
Breiman and Adele Cutler, who registered "Random Forests" as a trademark (as of
2019, owned by Minitab, Inc.). The extension combines Breiman's "bagging" idea
and random selection of features, introduced first by Ho and later independently by
Amit and Geman, in order to construct a collection of decision trees with controlled
variance.
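The ingredients named above (bagging, random feature selection, and aggregation by mode or mean) can be sketched without any tree-building code. The helper names below are hypothetical, and the per-tree predictions are supplied by hand rather than produced by trained trees.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Bagging: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, n_selected, rng):
    """Random subspace: pick a subset of feature indices for one tree."""
    return sorted(rng.sample(range(n_features), n_selected))


def aggregate_classification(tree_predictions):
    """Classification output: the mode of the individual tree predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]


def aggregate_regression(tree_predictions):
    """Regression output: the mean of the individual tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(bootstrap_sample(list(range(10)), rng))
print(random_feature_subset(8, 3, rng))
print(aggregate_classification(["walking", "jogging", "walking"]))
print(aggregate_regression([1.0, 2.0, 3.0]))
```

In a full random forest each tree would be trained on its own bootstrap sample and feature subset, and the two aggregation functions would combine the resulting predictions.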
Fig 3.11: Bunch of Random Forest
3.3.1.5 Decision Tree
Decision trees are a non-parametric supervised learning method used for
classification and regression. The goal is to create a model that predicts the value
of a target variable by learning simple decision rules inferred from the data
features. A tree can be seen as a piecewise constant approximation.
For instance, in the example below, decision trees learn from data to approximate a
sine curve with a set of if-then-else decision rules. The deeper the tree, the more
complex the decision rules and the more closely the model fits the data.
Fig 3.12: Decision tree curve
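The sine-curve example can be reproduced with a very small regression tree. The sketch below is a simplified, hypothetical implementation (greedy splits on one feature, leaves storing the mean target); it is not the library code that produced the figure.

```python
import math


def build_tree(xs, ys, depth):
    """Fit a 1-D regression tree on sorted xs; leaves store the mean target."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(xs) < 2:
        return mean
    best = None
    for i in range(1, len(xs)):  # candidate split between neighbouring points
        left, right = ys[:i], ys[i:]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, i)
    i = best[1]
    threshold = (xs[i - 1] + xs[i]) / 2
    return (threshold,
            build_tree(xs[:i], ys[:i], depth - 1),
            build_tree(xs[i:], ys[i:], depth - 1))


def predict(tree, x):
    """Follow the if-then-else rules down to a leaf value."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def mse(tree, xs, ys):
    return sum((predict(tree, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [i * 2 * math.pi / 79 for i in range(80)]
ys = [math.sin(x) for x in xs]
shallow = build_tree(xs, ys, depth=2)
deep = build_tree(xs, ys, depth=5)
print(mse(shallow, xs, ys), mse(deep, xs, ys))
```

The deeper tree has the lower training error, which illustrates the last sentence above; on noisy data a very deep tree may also start fitting the noise.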
3.3.2 Proposed CNN model
Advantages of using a CNN for image classification problems:
• The use of CNNs is motivated by the fact that they are able to learn relevant
features from an image or video at different levels, similar to a human brain.
This is feature learning. Conventional neural networks cannot do this.
• Another main feature of CNNs is weight sharing. As an example, say we have a
one-layer CNN with 10 filters of size 5x5. The number of parameters of such a
CNN is 5*5*10 weights plus 10 biases, i.e. 5*5*10 + 10 = 260 parameters.
• In terms of performance, CNNs outperform NNs and other classical ML
algorithms on conventional image recognition tasks and many other tasks; see
the Inception model and ResNet50, among many others.
• High statistical efficiency (needs few labels to learn reliably).
• High computational efficiency (needs fewer operations to learn).
• Convolutional Neural Networks are the basis of many modern computer vision
models. Fully connected networks do not scale up past toy problems,
because they use far too many parameters. CNNs are a much less flexible
model compared to a fully connected network, and are biased toward
performing well on images, because in images we would like to
extract location-invariant features.
• CNNs are also useful for 1D problems like time series, and for 3D image
classification, because they have the same structure where we would like
location-invariant features. Convolutions are technically location equivariant
because they preserve the location of extracted features, but the important
thing is that the same features are extracted over the entire image.
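Two of the points above, the parameter count under weight sharing and the location equivariance of convolution, can be checked with a few lines of plain Python. This is a sketch under stated assumptions: the single input channel and the 28x28 input used for the fully connected comparison are hypothetical choices, as are the function names.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a convolution layer."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


def conv1d(signal, kernel):
    """'Valid' 1-D cross-correlation, as used in CNN layers."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]


# Weight sharing: 10 filters of size 5x5 on a single-channel input.
print(conv_params(5, 5, 1, 10))
# For comparison, 10 fully connected units on a hypothetical 28x28 input.
print(dense_params(28 * 28, 10))

# Equivariance: shifting the input shifts the extracted feature map.
signal = [0, 0, 1, 2, 0, 0, 0]
shifted = [0] + signal[:-1]
kernel = [1, -1]
print(conv1d(signal, kernel))
print(conv1d(shifted, kernel))
```

The same kernel responds to the pattern wherever it appears, and the response moves by exactly the amount the input was shifted.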
Model Summary:
Fig 3.13: Summary of a proposed model
Fig 3.14: Epoch of a CNN proposed model
Fig 3.15: CNN Accuracy
Fig 3.16: CNN Loss
3.3.2.1 CNN Process
In deep learning, a convolutional neural network (ConvNet) is a class of deep
neural networks. CNNs are regularized versions of multilayer perceptrons, that is,
of fully connected networks in which each unit in one layer is connected to all
units in the next layer. A max pooling layer then reduces the dimension of the data
by combining the outputs of one layer before passing them to the next layer. A
dropout layer then randomly drops units of the neural network during training.
Finally, the resulting feature maps are used to produce the output of this project.
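The max pooling and dropout steps can be illustrated on small lists of numbers. This is a minimal sketch, assuming 2x2 pooling with stride 2 and "inverted" dropout (surviving units are rescaled); these settings and the function names are assumptions, not values taken from the project's model.

```python
import random


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


def dropout(values, rate, rng):
    """Zero each unit with probability `rate`; rescale the rest by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(max_pool_2x2(feature_map))  # a 4x4 map becomes 2x2
print(dropout([2.0, 2.0, 2.0, 2.0], rate=0.5, rng=random.Random(0)))
```

Pooling halves each spatial dimension, and dropout is applied only during training; at prediction time all units are kept.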
Fig 3.17: Process of CNN
3.3.2.2 CNN Architecture:

First, the input is processed by convolution layers of size 32 and 64. Then a max
pooling layer reduces the dimension of the data, the result is flattened and passed
to a dense layer, and a sigmoid activation function produces the output.
Fig 3.18: Architecture of CNN
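How the data shrinks as it moves through such an architecture can be traced by computing layer output shapes. The sketch below is an assumption-heavy illustration: it reads 32 and 64 as filter counts, and the 64x64x3 input, 3x3 kernels, and 2x2 pooling are hypothetical values, since the actual model summary is only available as a figure.

```python
import math


def conv_shape(h, w, kernel, filters):
    """Output shape of a 'valid' convolution with stride 1."""
    return h - kernel + 1, w - kernel + 1, filters


def pool_shape(h, w, c, size=2):
    """Output shape of non-overlapping max pooling."""
    return h // size, w // size, c


def sigmoid(z):
    """Squash a dense-layer output into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


h, w, c = 64, 64, 3                    # hypothetical input image size
h, w, c = conv_shape(h, w, 3, 32)      # first convolution block, 32 filters
h, w, c = pool_shape(h, w, c)
h, w, c = conv_shape(h, w, 3, 64)      # second convolution block, 64 filters
h, w, c = pool_shape(h, w, c)
flat = h * w * c                       # Flatten: one long feature vector
print((h, w, c), flat)
print(sigmoid(0.0))
```

The flattened vector is what the dense layer receives; its length depends directly on the input size and on how many pooling stages precede it.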
3.3.2.3 CNN with multi label image classification:

In multi-label image classification, the image is first segmented part by part.
The image is then resized, zoomed, and sheared before being passed from one layer
to the next. Next, the CNN is applied, with layers connected through a fully
connected layer. The data then pass through dense, activation, and dense layers,
which produce the human activity as output.
Fig 3.19: Image segmentation
Fig 3.20: CNN with multi label image classification
3.3.3 System Model

This section shows the whole system procedure of human activity recognition and
how the project works. The following diagram shows the HAR system and how it
performs:
Fig 3.21: System model of HAR
3.4 Segmentation
Segmentation plays a vital role in classification. The characters and digits
inside the activity recognition are segmented. Both binary and grayscale image
processing techniques are used to segment the characters. We perform
segmentation in two steps:
• Binarization
• Trim and segmentation
3.4.1 Binarization
In the binarization stage we convert the plate into a grayscale image and then apply
the binarization operation. We apply a certain threshold value: if a pixel of the
grayscale image has a value higher than the threshold, we convert that pixel to
white; otherwise we convert it to black.
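The thresholding rule can be written directly as code. This is a minimal sketch on nested lists of pixel values; the threshold of 127 and the luminance weights used for the grayscale conversion (the common ITU-R BT.601 weights) are assumptions, since the report does not state which values were used.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to single intensity values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def binarize(gray_image, threshold=127):
    """Pixels above the threshold become white (255), all others black (0)."""
    return [[255 if pixel > threshold else 0 for pixel in row]
            for row in gray_image]


rgb = [[(255, 255, 255), (0, 0, 0)],
       [(200, 200, 200), (30, 30, 30)]]
gray = to_grayscale(rgb)
print(gray)
print(binarize(gray))
```

A fixed threshold is the simplest choice; methods such as Otsu's, mentioned in Chapter 2, pick the threshold from the image histogram instead.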
3.4.2 Trim and Segmentation

In this stage we trim the border of the licence plate, and after trimming we get the
final activity.
Fig 3.22: Trim the border
After getting the final plate we segment the activity into two rows: the first row
contains all characters and the second row contains all types of activity.
Chapter 4
Train & Test Reviews
4.1 Introduction
In this chapter we show the training and testing results of all algorithms. Our
training platform is Google Colab. Training a deep learning model requires a
high-performance computer, and our local PC is not able to handle this kind of
heavy computation, which is why we chose Google Colab for training. To train and
test, we divide our data in an 85%/15% ratio: 85% of the data is used for training
and 15% for testing.
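An 85%/15% split of this kind can be sketched with the standard library alone. The function name `train_test_split` and the fixed seed are illustrative assumptions; the report does not say which tool or random seed was actually used.

```python
import random


def train_test_split(items, train_fraction=0.85, seed=0):
    """Shuffle a copy of the data and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 100 hypothetical sample indices stand in for the labeled images.
train, test = train_test_split(range(100))
print(len(train), len(test))
```

Shuffling before the cut matters when the data are stored class by class; otherwise the test set could contain only the last class.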
4.2 Human Activity Recognition algorithms training & testing

To predict activity recognition, I try six types of classical algorithm:
• Logistic Regression
• Support Vector Machine
• Naïve Bayes
• KNN (K-Nearest Neighbor)
• Random Forest
• Decision Tree
4.3 Classification algorithms Evaluation

For classification I try two types of algorithm: classical ML algorithms and an
algorithm based on the CNN architecture. In this section we evaluate all of these
algorithms.
4.3.1 Logistic Regression

In Logistic Regression algorithm, the training_data is 26117 and accuracy is 9.46%
Accuracy: 9.46%
Training_Data: 26117
4.3.2 Support Vector Machine

In SVM algorithm, the training_data is 33134 and accuracy is 91.13%
Accuracy: 91.13%
Training_Data:33134
4.3.3 Naïve Bayes

In Naïve Bayes algorithm, the training_data is 15593 and accuracy is 9.36%
Accuracy:9.36%
Training_Data:15593
4.4 KNN(K-Nearest Neighbor)
In KNN algorithm, the training_data is 23389 and accuracy is 90.64%
Accuracy: 90.64%
4.4.1 Random Forest

In Random Forest algorithm, the training_data is 35083 and accuracy is 90.41%
Accuracy: 90.41%, Training_Data: 35083
4.4.2 Decision Tree
In Decision Tree algorithm, the training_data is 33134 and accuracy is 90.80%
Accuracy: 90.80%
Chapter 5
Proposed Scheme
5.1 Introduction
In this chapter we compare all of the algorithms and give a final scheme, which
we implement in our project. In the final part of this chapter we show how we
integrate our own CNN-based classification algorithm to get the final output.
5.2 Prediction of HAR

To predict activity recognition I try different classical algorithms and also apply
a CNN. The classical algorithms are:
• Logistic Regression
• Support Vector Machine
• Naïve Bayes
• KNN (K-Nearest Neighbor)
• Random Forest
• Decision Tree
All of these are classical classification algorithms. We also apply different types
of CNN architecture.
5.2.1 Comparison
Having discussed the different classical algorithms, we now compare them.
Fig 3.21: Comparison logistic regression, random forest and CNN
5.3 All machine learning algorithm comparison
Fig 3.22: All machine learning algorithm comparison
5.3.1 Benefits
• Can bring good business profit.
• Helps in detecting criminal offenses.
• Great impact on investigators and police.
• Great impact on healthcare systems.
• Enhances the data security of the system.
5.4 Limitations
• The software has to be restarted whenever it disconnects, and the full
pipeline has to be run again each time.
• Existing activity recognition systems are constrained by practical
limitations such as the number, location, and nature of the sensors used. Other
issues include ease of deployment, maintenance, costs, and the ability to
perform daily activities unimpeded.
• No image dataset for human activity recognition was available.
• The image data had to be preprocessed manually.
• YOLO object detection could not be used because the training and testing data
had already been trained and tested.
• Other object detection algorithms such as Faster R-CNN and SSD could not be
used either.
• The software configuration must include a GPU and TPU.
• Sufficient RAM and disk support is required; without it the runtime did not
connect.
• Sensor signals might vary for the same activity across different subjects and
even for the same individual.
• Errors can also arise from variability in sensor signals caused by differences
in sensor orientation and placement.
• Only reported methods that address human activity recognition as a
classification problem were reviewed.
• No image dataset trained and tested in real time was found.
• Sometimes sensor problems meant that the activity could not be predicted.
Chapter 6
Implementation
6.1 Implementation
Some screenshots of this project are shown below:
Fig 6.1.1.1: Prediction Downstairs
My project is able to predict that a person is going downstairs.
Fig 6.1.1.2: Prediction Jogging
My project is able to predict that a person is jogging.
Fig 6.1.1.3: Prediction Sitting
My project is able to predict that a person is sitting.
Fig 6.1.1.4: Prediction Standing
My project is able to predict that a person is standing.
Fig 6.1.1.5: Prediction Upstairs
My project is able to predict that a person is going upstairs.
Fig 6.1.2.1: Prediction Walking
My project is able to predict that a person is walking.
6.2 Result
In this project, I tried to implement an ML model in real time. I applied several
classical classification algorithms: logistic regression, support vector machine,
random forest, decision tree, naïve Bayes and KNN (k-nearest neighbor). Four of
them, SVM, KNN, decision tree and random forest, reached an accuracy above 90%,
so they are a good fit for this project. Logistic regression and naïve Bayes,
however, did not give a satisfactory accuracy score. The project is also able to
predict a person's kinetic state: downstairs, jogging, sitting, standing, upstairs
or walking. The accuracies of support vector machine, random forest, decision tree
and k-nearest neighbor are 91.13%, 90.41%, 90.80% and 90.64%, respectively.
Chapter 7
Conclusion
7.1 Summary
I presented a CNN-based human activity recognition method. The method
outperformed the baseline random forest, decision tree, SVM, logistic regression,
KNN and naïve Bayes algorithms in ternary human activity classification and
exhibited the best classification accuracy when longer accelerometer data
segments were used for training the neural network. I found that the dimension of
the input vector affects the activity recognition performance, and that finding a
way to disambiguate the ‘walk’ signal in particular will likely lead to an
improvement in the activity recognition performance.
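To illustrate how the length of the accelerometer segment determines the dimension of the input vector, the sketch below cuts a stream of tri-axial readings into fixed-length windows and flattens each window into one vector. The function name sliding_windows, the window lengths and the sample values are assumptions chosen for this example; they are not the settings used in this project.

```python
def sliding_windows(samples, window_length, step):
    """Cut a sequence of (x, y, z) samples into fixed-length windows."""
    windows = []
    for start in range(0, len(samples) - window_length + 1, step):
        window = samples[start:start + window_length]
        # Flatten [(x, y, z), ...] into one input vector [x0, y0, z0, x1, ...].
        windows.append([value for sample in window for value in sample])
    return windows


# Hypothetical stream of 10 tri-axial accelerometer readings.
stream = [(0.1 * i, 0.2 * i, 0.3 * i) for i in range(10)]

short_windows = sliding_windows(stream, window_length=2, step=2)
long_windows = sliding_windows(stream, window_length=5, step=5)

print(len(short_windows), "windows of dimension", len(short_windows[0]))
print(len(long_windows), "windows of dimension", len(long_windows[0]))
```

A longer window gives the network more context per example but yields fewer training examples from the same recording, which is one reason the choice of input dimension matters.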
Deep learning models such as the convolutional neural network, together with the
classical classification algorithms, gave accuracies of up to 90%, which is
satisfactory for the new results of this project.
In addition, enriching the data labels at high frequency gave results of up to
97%, which is close to 100% accuracy on the training and testing datasets.
7.2 Suggestions for future work
In our project, activity recognition is performed with almost 100% accuracy. The
main problem we faced was in the classification stage: we were able to classify
with very high accuracy overall, but when we tried to classify characters we got
many misclassifications. We assume this happens because of a lack of data
variation and an insufficient amount of data. In future we will therefore collect
more data and evaluate our work further.
We also plan to implement object detection algorithms such as Faster R-CNN, SSD
and YOLOv3 for human activity recognition.
8.1 References
[1] Human Activity Recognition From Accelerometer Data Using Convolutional
Neural Network, 2017, Author (Song-Mi Lee, Sang Min Yoon, Heeryon Cho).
[2] Human Activity Recognition Based On Convolutional Neural Network, 2018,
Author (Wenchao Xu, Yuxin Pang, Yanqin Yang).
[3] Comparing CNN and Human Crafted Features for Human Activity Recognition,
2017, Author (Federico Cruciani).
[4] A Review of Human Activity Recognition Methods, 2017, Author (Michalis
Vrigkas).