

Face Detection Using Machine Learning

PROJECT-2 REPORT

Submitted By
Aryan Masih

ADMISSION NO.: 18GPTC4060113
ENROLMENT NO.: 18014060096

Diploma in Computer Science and Engineering

UNIVERSITY POLYTECHNIC
GREATER NOIDA - 201306 (APRIL 2021)


DECLARATION

Project Title: Face Detection Using Machine Learning. Degree for which the
project work is submitted: Diploma in Computer Science and Engineering.

I declare that the presented project largely represents my own ideas and work
in my own words. Where others' ideas or words have been included, I have
adequately cited and listed them in the reference materials. The report has been
prepared without resorting to plagiarism. I have adhered to all principles of
academic honesty and integrity. No falsified or fabricated data have been
presented in the report. I understand that any violation of the above will be
cause for disciplinary action by the Institute, including revoking the conferred
degree, if conferred, and can also invoke penal action from sources that have
not been properly cited or from whom proper permission has not been taken.

Date: 5/4/2021

Aryan Masih
Admission No.: 18GPTC4060113


CERTIFICATE

It is certified that the work contained in this project entitled Face Detection
Using Machine Learning, submitted by Aryan Masih for the degree of
Diploma in Computer Science and Engineering, is based on his/her own
work carried out under my supervision, and this project work has not been
submitted elsewhere for any degree.

GUIDE

Ms. Nutan Gusain


Assistant Professor
Galgotias University


ABSTRACT

Machine learning has been gaining momentum over the last decades: self-driving
cars, efficient web search, speech and image recognition. Its successful results
gradually propagate into our daily lives. Machine learning is a class of artificial
intelligence methods that allow a computer to operate in a self-learning mode,
without being explicitly programmed. It is a very interesting and complex topic
that could drive the future of technology.
Face detection is an important step in face recognition and emotion recognition,
which are among the more representative and classic applications of computer
vision. The face is a physiological biometric based on stable features, and face
detection by computer systems has become a major field of interest. Face
detection algorithms are used in a wide range of applications, such as security
control, video retrieval, biometric signal processing, human-computer
interfaces, emotion detection, face recognition and image database
management. Face detection is a challenging task because faces in images are
uncontrolled: illumination conditions, poses and facial expressions all vary.


ACKNOWLEDGEMENT

I wish to express my profound and sincere gratitude to Ms. Nutan Gusain
(Assistant Professor), Computer Science Engineering, University Polytechnic,
Galgotias University, Uttar Pradesh, who guided me through the intricacies of
this project nonchalantly and with matchless magnanimity.

TABLE OF CONTENTS
• Declaration
• Certificate
• Abstract

• Acknowledgement

• List of Figures

• List of Tables

1 Introduction
1.1 Overall Description
1.2 Purpose
1.3 Motivation And Scope
2 Literature Survey
2.1 Face Detection Methods
2.2 Image Processing Stages
3 Proposed Model
4 Module Split-Up
5 Implementation
6 Results and Discussions
7 Conclusions and Future Works
References


List of Figures

2.1 Facial Expression Recognition System
3.1 Overall procedure of emotion detection algorithm
3.2 System Architecture of Face Detection
3.3 Use Case Diagram
3.4 Activity Diagram of Face Detection
3.5 Sequence Diagram - Face Detection
3.6 Sequence Diagram - Preparation Sample
4.1 Neural Network
4.2 Procedure of retrieval module and data management module in a large-scale face detection
4.3 Block Diagram of Face Detection
4.4 Face Detector Module
4.5 Feature Extraction Module
4.6 Emotion Detection Module
5.1 Trainable vs Non-trainable params
5.2 Final Epoch
6.1 Accuracy graph while training the dataset
6.2 Result using input images

List of Tables

2.1 Universal Emotion Identification
3.1 Features in eye region
3.2 Features in mouth region


Chapter 1

Introduction

Face detection is a computer technology, used in a variety of applications, that
identifies human faces in digital images. Face detection also refers to the
psychological process by which humans locate and attend to faces in a visual scene.
Face detection involves separating image windows into two classes: one containing
faces (targets) and the other containing background (clutter). It is difficult because,
although commonalities exist between faces, they can vary considerably in terms
of age, skin colour and facial expression. The problem is further complicated by
differing lighting conditions, image qualities and geometries, as well as the
possibility of partial occlusion and disguise.
An ideal face detector would therefore be able to detect the presence of any face
under any set of lighting conditions, upon any background. The face detection task
can be broken down into two steps. The first step is a classification task that takes
some arbitrary image as input and outputs a binary value of yes or no, indicating
whether any faces are present in the image. The second step is the face
localization task, which takes an image as input and outputs the location of any
face or faces within that image as bounding boxes (x, y, width, height).
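As an illustrative sketch (not the report's own code), OpenCV's bundled Haar cascade performs both steps at once: detectMultiScale returns zero or more bounding boxes, so classification is simply whether the list is non-empty.

import cv2

# Load OpenCV's bundled frontal-face Haar cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

img = cv2.imread('photo.jpg')                 # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print('face present:', len(boxes) > 0)        # step 1: classification
for (x, y, w, h) in boxes:                    # step 2: localization
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)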

The face detection system can be divided into the following steps: [4]

1. Pre-Processing: To reduce the variability in the faces, the images are processed
before they are fed into the network. All positive examples, that is, the face
images, are obtained by cropping images with frontal faces to include only the
front view. All the cropped images are then corrected for lighting through
standard algorithms (a sketch follows after this list).

2. Classification: Neural networks are implemented to classify the images as faces
or non-faces by training on these examples. We use our own implementation of
the neural network for this task. Different network configurations are
experimented with to optimize the results.

3. Localization: The trained neural network is then used to search for faces in an
image and, if present, localize them in a bounding box. The work focuses on
various features of the face.
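As a minimal sketch of the step-1 lighting correction (histogram equalization is one standard choice; the report does not name its exact algorithm):

import cv2

def preprocess_face(img, box, size=(48, 48)):
    # Crop the detected frontal face, equalize lighting, and standardize size.
    x, y, w, h = box
    face = cv2.cvtColor(img[y:y+h, x:x+w], cv2.COLOR_BGR2GRAY)
    face = cv2.equalizeHist(face)          # standard lighting correction
    return cv2.resize(face, size)          # uniform input size for the network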

1.1 Overall Description

Face detection is the most important step of emotion recognition. Beyond emotion
recognition, face detection is also the first step in Human Computer Interaction
(HCI) systems, e.g. expression recognition. Unlike traditional HCI devices such as
the keyboard, mouse and display, it provides more effective methods to improve
the user's experience with the computer, and as a result it speeds up human work.
It conveys information from the physical world into logical form to control the
computer system. In addition, face detection is a type of object detection, which is
used to classify a desired object in given images or video and locate it.
License plate detection is another example of object detection. In biometric
approaches, human faces are unique objects, like fingerprints and irises, which are
widely used for security. Many personal authentication systems have been
developed from these approaches, taking advantage of such unique and special
characteristics. Systems can search effectively to screen out useful information
(faces) from dozens of video feeds or photos from the internet. For example, in
video surveillance in the UK there is one CCTV camera for every 14 people, and
all of these videos need to be analysed; face detection is used to extract any useful
information and store it for further use.
Human faces are non-rigid objects and appear at different scales, poses, angles and
with different facial expressions. Human faces also show variations, for example
glasses, and images differ in brightness and contrast. These factors are the
challenges of face detection.

1.2 Purpose
The aim of this project is to develop and propose a system that detects human faces
in digital images effectively and recognizes their facial expression regardless of the
person's ethnicity, pose, etc. Input images may vary in face size, background
complexity and illumination conditions. Face detection is widely used in
biometrics, photography, etc.
Analysis of facial expressions plays a fundamental role in applications based on
emotion recognition, such as Human Computer Interaction (HCI), social robots,
animation, alert systems and pain monitoring for patients, movie recommendation
by mood, mental state identification, etc.


1.3 Motivation And Scope

Faces form a class of fairly similar objects: each face consists of the same
components in the same geometrical configuration. This is the main reason for the
success of frontal face detection systems. However, the problem of pose invariance
is still unsolved; detecting faces that are rotated in depth remains a challenging
task.
The motivation behind this project is that face detection has a multitude of
possible applications, from common household objects like digital cameras that
automatically focus on human faces to security cameras that match a face to a
person's identity. Webcams are often used as a security measure for locking a
personal computer. With the rapid development of technology, it is desirable to
build an intelligent system that can understand human emotion. Cameras can also
use this technology to track human faces and keep a count of the number of people
in a shot, in a certain location, or coming in through an entrance. The technology
can be further narrowed down to the recognition and tracking of eyes, which
would save power by dimming a screen when the viewer is not looking. For this
project, we hope to use an already existing algorithm as a basis for face detection
and emotion recognition and build upon it to create improvements and explore
more data.

Chapter 2

Literature Survey

Face detection is a computer technology that determines the location and size of
human faces in an arbitrary (digital) image. The facial features are detected, and
any other objects such as trees, buildings and bodies are ignored. It can be regarded
as a specific case of object class detection, where the task is to find the locations
and sizes of all objects in an image that belong to a given class. Face detection can
also be regarded as a more general case of face localization, in which the task is to
find the locations and sizes of a known number of faces (usually one). Basically,
there are two types of approaches to detecting the facial part of a given image:
feature-based and image-based. The feature-based approach tries to extract
features of the image and match them against knowledge of face features, while
the image-based approach tries to find the best match between training and
testing images.

2.1 Face Detection Methods


Some of the main face detection methods are discussed here.
1. Knowledge-based Methods: Knowledge-based methods are built on rules
derived from researchers' knowledge of human faces. The problem with this
approach is the difficulty of translating human knowledge into well-defined
rules.
2. Feature-based Methods: Invariant features of faces, such as texture and skin
colour, are used for detection. But features from such algorithms can be
severely corrupted by illumination, noise and occlusion.
3. Template Matching: The input image is compared with predefined face
templates. But the performance here suffers due to variations in scale, pose
and shape.

4. Appearance-based Methods: In template matching, the templates are
predefined by experts, whereas the templates in appearance-based methods
are learned from example images. Statistical analysis and machine learning
techniques can be used to find the relevant characteristics of face and
non-face images.

A system that recognizes facial expressions is called a facial expression
recognition system. Image processing is used for facial expression recognition:
it converts an image into digital form and performs operations on it to extract
useful information. We begin by reviewing how facial expressions are produced,
how they can be analyzed objectively, and what the main problems are when
working with emotions. Facial expressions are produced by face muscle
movements that result in temporary wrinkles in the facial skin and the temporary
deformation or displacement of facial features such as the eyebrows, eyelids,
nose and mouth. In most cases a facial expression persists only briefly, usually
no more than a few seconds. Table 2.1 summarizes the universal emotions and
the facial motions associated with each.
Anger
  Definition: Anger is one of the most dangerous emotions. This emotion may be harmful, so humans try to avoid it. Secondary emotions of anger are irritation, annoyance, frustration, hate and dislike.
  Facial motion: eyebrows pulled down, open eyes, teeth shut and lips tightened, upper and lower lids pulled up.

Fear
  Definition: Fear is the emotion of danger, whether of physical or psychological harm. Secondary emotions of fear are horror, nervousness, panic, worry and dread.
  Facial motion: outer eyebrow down, inner eyebrow up, mouth open, jaw dropped.

Happiness
  Definition: Happiness is the expression most desired by humans. Secondary emotions are cheerfulness, pride, relief, hope, pleasure and thrill.
  Facial motion: open eyes, mouth edges up, open mouth, lip corners pulled up, cheeks raised, wrinkles around the eyes.

Sadness
  Definition: Sadness is the opposite emotion of happiness. Secondary emotions are suffering, hurt, despair, pity and hopelessness.
  Facial motion: outer eyebrow down, inner corners of eyebrows raised, mouth edges down, closed eyes, lip corners pulled down.

Surprise
  Definition: This emotion arises when something unexpected happens. Secondary emotions of surprise are amazement and astonishment.
  Facial motion: eyebrows up, open eyes, mouth open, jaw dropped.

Disgust
  Definition: Disgust is a feeling of dislike. A human may feel disgust at any taste, smell, sound or touch.
  Facial motion: lip corner depressor, nose wrinkle, lower lip depressor, eyebrows pulled down.

Table 2.1: Universal Emotion Identification.

2.2 Image Processing Stages


There are five stages in any digital image processing application. They are broadly
classified as: [3]

• Image Acquisition

• Image Pre-processing

• Image Segmentation

• Features Extraction

• Classification and Prediction

1. Image Acquisition:
Static images or image sequences are used for facial expression recognition. 2-D
grayscale facial images are most popular for facial image recognition, although
colour images can convey more information about emotion, such as blushing. In
the future, colour images will be preferred for this purpose because of the low-cost
availability of colour imaging equipment. For image acquisition, a camera, cell
phone or other digital device is used.

2. Image Pre-processing:
Pre-processing plays a key role in the overall process. The pre-processing stage
enhances the quality of the input image and locates the data of interest by
removing noise and smoothing the image. It removes redundancy from the image
without losing image detail. Pre-processing also includes filtering and
normalization, which produce an image of uniform size and rotation.

3. Image Segmentation:
Segmentation separates the image into meaningful regions. Segmentation of an
image is a method of dividing it into homogeneous, self-consistent regions
corresponding to different objects in the image, on the basis of texture, edges and
intensity.

4. Features Extraction:
Feature extraction can be considered the "interest" part of the image. It includes
information about the shape, motion, colour and texture of the facial image, and it
extracts this meaningful information from the image. Compared to the original
image, feature extraction significantly reduces the amount of information, which
is an advantage for storage.

5. Classification and Prediction:
The classification stage follows the output of the feature extraction stage. It
identifies facial images and groups them into certain classes, helping in their
proficient recognition. Classification is a complex process because it may be
affected by many factors. The classification stage, also called the feature
selection stage, deals with the extracted information and groups it according to
certain parameters.
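As a minimal sketch of how the five stages chain together (function bodies and names are our hypothetical placeholders, not the report's code):

import cv2

def acquire(path):                        # 1. Image Acquisition
    return cv2.imread(path, cv2.IMREAD_GRAYSCALE)

def preprocess(img):                      # 2. Image Pre-processing
    img = cv2.GaussianBlur(img, (3, 3), 0)   # remove noise / smooth
    return cv2.equalizeHist(img)             # normalize intensities

def segment(img, cascade):                # 3. Image Segmentation
    # Assumes at least one face is found; keep the first region.
    x, y, w, h = cascade.detectMultiScale(img, 1.1, 5)[0]
    return img[y:y+h, x:x+w]

def extract_features(face):               # 4. Features Extraction
    return cv2.resize(face, (48, 48)).astype('float32') / 255.0

def classify(features, model):             # 5. Classification and Prediction
    return model.predict(features.reshape(1, 48, 48, 1))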


Figure 2.1: Facial Expression Recognition System

Chapter 3

Proposed Model

The overall procedure of the emotion detection algorithm is shown below:

Figure 3.1: Overall procedure of emotion detection algorithm.

We present an image processing stage to extract fundamental information from
the facial image. Figure 3.1 shows the overall procedure of the emotion detection
algorithm. In the image processing stage, the facial region is extracted and then
the facial components are extracted.
The feature vector extraction method is the key point in the emotion recognition
problem; in particular, a good feature vector is necessary for better recognition
accuracy. In the facial feature extraction stage, we propose a new feature vector
extraction method. The proposed method divides the whole image into three
feature regions: the eye region, the mouth region, and an auxiliary region.
Geometric and shape information is extracted from each region.
Feature   Description                                Size
Xe1       Distance between the two eyebrows.         1x1
Xe2       Distance between eye and eyebrow.          1x1
Xe3       Distance between nose and eye (left).      1x1
Xe4       Distance between nose and eye (right).     1x1
Xe5       Error between eye and template.            4x1

Table 3.1: Features in eye region.

Feature   Description                                Size
Xm1                                                  1x1
Xm2       Distance between nose and mouth.           1x1
Xse       Error between mouth and template.          6x1

Table 3.2: Features in mouth region.

Table 3.1 shows the specific features of the eye region: the four geometric features
describe the eye and eyebrow. Table 3.2 shows the features of the mouth region,
with two geometric features. Since the size of the facial image is not a static value,
the feature vector must be normalized; in this paper, all features are normalized by
the width of the facial image. Comparing images directly is not easy and takes
much time to compute. To overcome this difficulty, a new method is used to
compare a facial component image with a template. Let Xw, Xh, and Xp be the
width, height, and number of pixels of the image. The similarity S can be
calculated as

where Tw, Th, and Tp are the width, height, and number of pixels of the template.
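Since S is defined purely in terms of these six size quantities, a toy stand-in (our assumption, not the report's exact formula) could compare the component and template dimensions ratio-wise:

def size_similarity(xw, xh, xp, tw, th, tp):
    # Illustrative only, NOT the report's exact S: agreement of width,
    # height and pixel count, each ratio in (0, 1], averaged into one score.
    s_w = min(xw, tw) / max(xw, tw)
    s_h = min(xh, th) / max(xh, th)
    s_p = min(xp, tp) / max(xp, tp)
    return (s_w + s_h + s_p) / 3.0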
The system architecture for face detection is given below:

Figure 3.2: System Architecture Of Face Detection


Figure 3.3: Use case Diagram


Figure 3.4: Activity Diagram of Face Detection


Figure 3.5: Sequence Diagram - Face Detection

Figure 3.6: Sequence Diagram - Preparation Sample

Chapter 4

Module Split-Up

The entire project is divided into three modular designs. They are:

• Face Detection,

• Feature Extraction,

• Emotion Recognition.

For the face detection module we used a Haar cascade classifier, which easily
detects faces in an image or a video frame. For the emotion recognition module we
wrote a Python script to train a custom supervised machine learning model, using
TensorFlow and Keras, that is able to recognize the emotions of a face. We used a
5-layered Convolutional Neural Network in which the first layer is the input layer
and the last layer is the output layer.
Neural networks consist of individual units called neurons. Neurons are arranged
in a series of groups called layers, and neurons in each layer are connected to
neurons of the next layer. Data flows from the input layer to the output layer along
these connections. Each individual node performs a simple mathematical
calculation and then transmits its result to all the nodes it is connected to. [1]
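As a small illustration of that calculation (our sketch, not from the report), a single neuron computes a weighted sum of its inputs plus a bias and passes it through an activation:

import numpy as np

def neuron(inputs, weights, bias):
    # Weighted sum of incoming data, plus bias, through a ReLU activation.
    return max(0.0, float(np.dot(weights, inputs) + bias))

# A neuron with three incoming connections:
print(neuron(np.array([0.5, -1.0, 2.0]), np.array([0.4, 0.3, 0.1]), 0.05))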


Figure 4.1: Neural Network

The last wave of neural networks came with the increase in computing power and
the accumulation of experience. That brought deep learning, in which the
structures of neural networks have become more complex and able to solve a wide
range of tasks that could not be effectively solved before. Image classification is a
prominent example.
Let us consider the use of a CNN for image classification in more detail. The main
task of image classification is to accept an input image and determine its class.
This is a skill people learn from birth: we can easily determine that the image in a
picture is an elephant. But the computer sees pictures quite differently:
Instead of the image, the computer sees an array of pixels. For example, if the image
size is 300 x 300, the size of the array will be 300x300x3, where 300 is the width, the
next 300 is the height and 3 is the number of RGB channels. Each of these numbers
holds a value from 0 to 255, describing the intensity of the pixel at that point. To
solve the classification problem the computer looks for base-level characteristics. In
human understanding such characteristics are, for example, the trunk or large ears;
for the computer, they are boundaries or curvatures. Then, through groups of
convolutional layers, the computer constructs more abstract concepts. In more
detail: the image is passed through a series of convolutional, nonlinear, pooling and
fully connected layers, and then generates the output.
The convolution layer is always first. The image (a matrix of pixel values) is
entered into it. Imagine that the reading of the input matrix begins at the top left
of the image. Next, the software selects a smaller matrix there, which is called a
filter (or neuron, or kernel). The filter then produces a convolution, i.e. it moves
along the input image. The filter's task is to multiply its values by the original
pixel values; all these multiplications are summed to obtain one number. Since
the filter has read the image only in the upper left corner, it moves one unit
further and further to the right, performing a similar operation. After the filter
has passed across all positions, a matrix is obtained that is smaller than the
input matrix.
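A small NumPy sketch of this sliding-filter computation (illustrative only; the project itself uses Keras convolution layers):

import numpy as np

def convolve2d(image, kernel):
    # Slide the filter over the image with stride 1 and no padding; each
    # output entry is the sum of elementwise products, so the result is
    # smaller than the input.
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

img = np.arange(25, dtype=float).reshape(5, 5)
print(convolve2d(img, np.array([[1.0, -1.0]])).shape)   # (5, 4): smaller output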

Figure 4.2: Procedure of retrieval module and data management module in a
large-scale face detection.


Figure 4.3: Block Diagram of Face Detection


Figure 4.4: Face Detector Module


Figure 4.5: Feature Extraction Module

Figure 4.6: Emotion Detection Module

Chapter 5

Implementation

In the face and emotion recognition system we used the following libraries:
• cv2
• numpy
• keras
• pandas
• TensorFlow
• imutils
We downloaded the dataset from Kaggle; it can be downloaded easily by
registering on the Kaggle website [5]. The data consists of 48x48-pixel grayscale
images of faces. The faces have been automatically registered so that each face is
more or less centered and occupies about the same amount of space in each image.
The task is to categorize each face, based on the emotion shown in the facial
expression, into one of seven categories (0=Angry, 1=Disgust, 2=Fear, 3=Happy,
4=Sad, 5=Surprise, 6=Neutral).
First we load and process the dataset. The process.py module first sets the path of
the dataset to 'fer2013/fer2013.csv' (fer2013.csv is our dataset) and sets the image
size to (48, 48). There are two methods in process.py: load_fer2013 and
preprocess_input. The code snippet for load_fer2013 and preprocess_input is
below:

import cv2
import numpy as np
import pandas as pd

dataset_path = 'fer2013/fer2013.csv'   # set at module level in process.py
image_size = (48, 48)

def load_fer2013():
    data = pd.read_csv(dataset_path)
    pixels = data['pixels'].tolist()
    width, height = 48, 48
    faces = []
    for pixel_sequence in pixels:
        # Each row stores a 48x48 image as space-separated pixel values.
        face = [int(pixel) for pixel in pixel_sequence.split(' ')]
        face = np.asarray(face).reshape(width, height)
        face = cv2.resize(face.astype('uint8'), image_size)
        faces.append(face.astype('float32'))
    faces = np.asarray(faces)
    faces = np.expand_dims(faces, -1)            # add the channel dimension
    emotions = pd.get_dummies(data['emotion']).to_numpy()  # one-hot labels
    return faces, emotions

def preprocess_input(x, v2=True):
    x = x.astype('float32')
    x = x / 255.0          # scale pixels to [0, 1]
    if v2:
        x = x - 0.5        # shift to [-0.5, 0.5]
        x = x * 2.0        # stretch to [-1, 1]
    return x
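A brief usage sketch of these two helpers (our own illustration):

faces, emotions = load_fer2013()      # faces: (N, 48, 48, 1); emotions: (N, 7)
faces = preprocess_input(faces)       # pixel values now in [-1, 1]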

We then train the model on the dataset for emotion recognition using 106 epochs;
that is, training passes over the dataset 106 times.
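A hedged sketch of how such a model is compiled and trained for 106 epochs (the optimizer, batch size and validation split are our assumptions, not taken from train.py; img_input and output refer to the Keras functional graph defined below):

from keras.models import Model

model = Model(img_input, output)              # assemble the CNN defined below
model.compile(optimizer='adam',               # assumed optimizer
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(faces, emotions,
          batch_size=32,                      # assumed batch size
          epochs=106,                         # one epoch = one pass over the data
          validation_split=0.2)               # assumed hold-out fraction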
Below are the snippets of the system while training the dataset:

Figure 5.1: Trainable vs Non-trainable params.

Figure 5.2: Final Epoch

The train.py module is used to train the model. We used a 5-layered
Convolutional Neural Network: the first layer is the input layer to the neural
network, then there are 3 hidden layers, and then the output layer. The code
snippet for the layers of the CNN is below:

# base
img_input = Input(input_shape)
x = Conv2D(8, (3, 3), strides=(1, 1), kernel_regularizer=regularization,
           use_bias=False)(img_input)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Conv2D(8, (3, 3), strides=(1, 1), kernel_regularizer=regularization,
           use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)

# module 1
residual = Conv2D(16, (1, 1), strides=(2, 2), padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = SeparableConv2D(16, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = SeparableConv2D(16, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
x = layers.add([x, residual])

# module 2
residual = Conv2D(32, (1, 1), strides=(2, 2), padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = SeparableConv2D(32, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = SeparableConv2D(32, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
x = layers.add([x, residual])

# module 3
residual = Conv2D(64, (1, 1), strides=(2, 2), padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = SeparableConv2D(64, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = SeparableConv2D(64, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
x = layers.add([x, residual])

# module 4
residual = Conv2D(128, (1, 1), strides=(2, 2), padding='same', use_bias=False)(x)
residual = BatchNormalization()(residual)
x = SeparableConv2D(128, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = SeparableConv2D(128, (3, 3), padding='same',
                    kernel_regularizer=regularization, use_bias=False)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
x = layers.add([x, residual])

# output
x = Conv2D(num_classes, (3, 3), padding='same')(x)
x = GlobalAveragePooling2D()(x)
output = Activation('softmax', name='predictions')(x)
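Note the repeating pattern in modules 1-4: on the main branch, two depthwise-separable convolutions process the features, while on the shortcut branch a strided 1x1 convolution halves the spatial resolution and matches the channel count so that layers.add can sum the two branches. The final Conv2D projects to num_classes (the 7 emotions here), and global average pooling followed by softmax turns the feature maps into class probabilities.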

The next module is video.py, which is used to detect faces and emotions in a video
frame. First we use the Haar cascade classifier for face detection in the video
frame, and then the detected face is passed as input to the emotion recognition
module. The video frame is first converted to grayscale, then the image is scaled
and the target face is marked with a rectangle drawn around it.
Then we load the trained model and create a list of target emotions: angry,
disgust, scared, happy, sad, surprised, neutral. The trained model then predicts the
emotion of the faces in a video frame or in an image. The code snippet for video
capture, loading the Haar cascade classifier and Keras model, and the target
emotion list is as follows:
face_detection = cv2.CascadeClassifier(detection_model_path)
emotion_classifier = load_model(emotion_model_path, compile=False)
EMOTIONS = ["angry", "disgust", "scared", "happy",
            "sad", "surprised", "neutral"]

# starting video streaming
cv2.namedWindow('your_face')
camera = cv2.VideoCapture(0)
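A hedged sketch of the per-frame loop these objects feed, following the steps described above (our illustration, not the verbatim video.py; assumes numpy is imported as np and preprocess_input is available):

while True:
    grabbed, frame = camera.read()
    if not grabbed:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)        # convert to grayscale
    faces = face_detection.detectMultiScale(gray, 1.1, 5)
    for (x, y, w, h) in faces:
        roi = cv2.resize(gray[y:y+h, x:x+w], (48, 48))    # scale to model input
        roi = preprocess_input(roi)
        preds = emotion_classifier.predict(roi.reshape(1, 48, 48, 1))[0]
        label = EMOTIONS[int(np.argmax(preds))]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, label, (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow('your_face', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):                 # quit on 'q'
        break
camera.release()
cv2.destroyAllWindows()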

Chapter 6

Results and Discussions

The accuracy graph while training the dataset is given below:

Figure 6.1: Accuracy graph while training the dataset.

The model was trained on a machine with low computing power. The accuracy
achieved in 4 epochs is 48.33%; accuracy can be increased to about 70% by
increasing the number of epochs to 100 on a machine with high computing power.
The face detection and emotion recognition system works better under bright
lighting conditions and with a good-quality web camera. The system is able to
accurately detect sad, angry, fear, happy and neutral faces in a video frame, and
fear, happy, angry, disgust, scared, surprised and neutral faces in an image file.

The images for testing were downloaded from Shutterstock (a stock-image
website) [8]. The results using an input image are below:
Figure 6.2: Result using input images

Chapter 7

Conclusions and Future Works

In this project on face and emotion detection, I have studied various techniques
for face and emotion recognition. The techniques included the Viola-Jones
algorithm, which detects the various parts of the face; histogram equalization,
which is used to adjust image intensities; and thresholding, which is used to create
a binary image from a grayscale image. Using all these techniques, key points of
the face are extracted and supplied as a training data set for classification.
The following two techniques are used for the respective tasks in the face
recognition system.
Haar feature-based cascade classifier: It detects frontal faces in an image
well, runs in real time, and is faster in comparison to other face detectors. We
used the implementation from OpenCV.
CNN model: We trained a classification CNN architecture that takes a bounded
face (48x48 pixels) as input and predicts the probabilities of 7 emotions in the
output layer. The face detection module is not able to correctly detect tilted faces
or faces with glasses, and the emotion recognition module is not able to detect
surprised faces correctly through the webcam, though it works well for input
images.
Future areas of development include detection of tilted faces as well as faces with
glasses, and correctly identifying disgusted and surprised faces in a video stream.
We will also try to speed up the system and to identify emotions with accuracy of
up to 90% by increasing the hidden layers of the CNN and by using or creating a
better dataset.

References

[1] A Beginner's Guide To Understanding Convolutional Neural Networks. https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/. Accessed: 2019-03-30.
[2] Andoni Beristain and Manuel Graña. Emotion Recognition Based on the Analysis of Facial Expressions: A Survey. 2018. URL: https://www.researchgate.net/publication/46510584_EMOTION_RECOGNITION_BASED_ON_THE_ANALYSIS_OF_FACIAL_EXPRESIONS_A_SURVEY.
[3] Anjali R. and Lavanya M. Facial Emotions Recognition System for Autism. 2014. URL: https://technicaljournalsonline.com/ijeat/VOL%20V/IJAET%20VOL%20V%20ISSUE%20II%20APRIL%20JUNE%202014/Article%2011%20V%20II%202014.pdf.
[4] Ms. Sk. Ayesha. Face Detection System with Face Recognition. 2017. URL: http://www.pace.ac.in/documents/ece/FACE%20RECOGNITION%20SYSTEM%20WITH%20FACE%20DETECTION.pdf.
[5] Challenges in Representation Learning: Facial Expression Recognition Challenge. https://www.kaggle.com/c/challenges-in-representationlearningfacial-expression-recognition-challenge/data. Accessed: 2019-01-17.
[6] Kanika Bhatia, Prof. Umesh Kumar Lilhore and Prof. Nitin Agrawal. Review of Different Face Detection and Recognition Methods. 2017. URL: http://ijsrcseit.com/paper/CSEIT1725137.pdf.
[7] Monika Dubey and Lokesh Singh. Automatic Emotion Recognition Using Facial Expression. 2016. URL: https://www.irjet.net/archives/V3/i2/IRJETV3I284.pdf.
[8] Stock assets to power your creativity. https://www.shutterstock.com/. Accessed: 2019-03-24.
