

International Journal of Emerging Science and Engineering (IJESE)
ISSN: 2319–6378, Volume-1, Issue-7, May 2013

Real Time Emotion Recognition through Facial Expressions for Desktop Devices

P. M. Chavan, Manan C. Jadhav, Jinal B. Mashruwala, Aditi K. Nehete, Pooja A. Panjari

Abstract—Recognizing emotion through facial expressions is a central element in human interactions. By creating machines that can detect and understand emotion, we can enhance human-computer interaction. In this paper, we discuss a framework for the classification of emotional states based on still images of the face, together with the implementation details of a real-time facial feature extraction and emotion recognition application. The application automatically detects frontal faces in the captured image and codes them in real time with respect to 7 dimensions: neutral, anger, disgust, fear, joy, sadness, and surprise. Most interestingly, the outputs of the classifier change smoothly as a function of time, providing a potentially valuable representation of facial expression dynamics in a fully automatic and unobtrusive manner. The main objective of the paper is the real-time implementation of a facial emotion recognition system. The system has been deployed on a Microsoft Windows desktop.

Keywords—Real time, facial expression, emotion recognition.

Manuscript Received on May 2013.
Prof. P. M. Chavan, Professor, Department of Computer Technology and Information Technology, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019, India.
Mr. Manan C. Jadhav, Student, Department of Computer Technology and Information Technology, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019, India.
Ms. Aditi K. Nehete, Student, Department of Computer Technology and Information Technology, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019, India.
Ms. Jinal B. Mashruwala, Student, Department of Computer Technology and Information Technology, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019, India.
Ms. Pooja A. Panjari, Student, Department of Computer Technology and Information Technology, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019, India.

I. INTRODUCTION

According to Charles Darwin, facial expressions contribute to communicating one's emotions, opinions, and intentions to others in an effective way. In addition, his study of human behaviour explicitly states that such expressions also provide information about the cognitive state of a person, including states such as boredom, stress, interest, and confusion. Along similar lines, considering the importance of and the escalating need for advanced Human-Computer Interaction, this paper revolves around recognizing the mood or emotion of humans from their facial expressions. Furthermore, it explains the necessity of real-time systems for achieving high levels of interaction with machines. It is of prime importance that the interaction of humans with computers be latency-free, taking it to the level of face-to-face communication.

Hence, we propose the development of a robust real-time perceptive system which takes facial expressions into account, detects the human face, and codes the expression dynamics. Systems working on these lines have a wide range of applications in basic and applied research areas, including man-machine communication, security, law enforcement, psychiatry, education, and telecommunications. As a result, attempts are being made to elevate the levels of usability and ease of interaction with minimum latency (fast response time) as well as a minimum error coefficient in the output. In an attempt to make the system more efficient, we suggest an application wherein 7 dimensions of facial expressions can be mapped, a high level of accuracy can be obtained, and, last but not least, processing time can be saved.

II. SYSTEM ARCHITECTURE

There are 10 basic building blocks of the emotion recognition pipeline, as listed below (a minimal code sketch of the full pipeline follows the list):

Live Streaming: Image acquisition.

Skin Color Segmentation: Discriminates between face and non-face parts.

Face Detection: Locates the position of the face in the given image frame using skin pixels.

Eye Detection: Identifies the position of the eyes in the image frame.

Lip Detection: Determines the lip coordinates on the face.

Longest Binary Pattern: Skin and non-skin pixels are converted to white and black pixels, respectively.

Bezier Curve Algorithm: Applies the Bezier curve equation to the facial feature points.

Emotion Detection: Emotion is detected by pattern matching against values from the database.

Database: Stores values derived from the Bezier curve equation that are used for comparison in the emotion detection phase.

Output Display: Renders the output from the detection phase on screen.
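
To make the data flow concrete, here is a minimal pipeline skeleton, assuming Python as the implementation language (the paper targets a Windows desktop but does not prescribe a language). Every function name is a hypothetical placeholder; the detailed sections below describe what each block does.

```python
# Hypothetical skeleton of the ten building blocks; each stub stands
# for one block and is fleshed out conceptually in the later sections.

def segment_skin(frame):           raise NotImplementedError  # block ii
def detect_face(frame, skin):      raise NotImplementedError  # block iii
def detect_eyes(face):             raise NotImplementedError  # block iv
def detect_lips(face, eyes):       raise NotImplementedError  # block v
def longest_binary_pattern(rois):  raise NotImplementedError  # block vi
def fit_bezier_curves(patterns):   raise NotImplementedError  # block vii
def match_emotion(curves, db):     raise NotImplementedError  # blocks viii + ix

def recognize(frame, database):
    """One pass of the pipeline for a single acquired frame (block i)."""
    skin     = segment_skin(frame)
    face     = detect_face(frame, skin)
    eyes     = detect_eyes(face)
    lips     = detect_lips(face, eyes)
    patterns = longest_binary_pattern((eyes, lips))
    curves   = fit_bezier_curves(patterns)
    return match_emotion(curves, database)   # rendered on screen by block x
```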


Figure 1. System Architecture

i. Live Streaming

Live streaming is the fundamental step of image acquisition in real time, where image frames are received as streaming media. In this stage, the application receives images from a built-in webcam or an external video camera device. Streaming continues until an input image frame is acquired.
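
As a minimal sketch of this stage, the frame grab could be done with OpenCV's VideoCapture (one common choice; the paper does not name a capture API):

```python
import cv2

def acquire_frame(device_index: int = 0):
    """Grab a single frame from a built-in or external camera."""
    cap = cv2.VideoCapture(device_index)   # 0 = default webcam
    try:
        ok, frame = cap.read()             # blocks until a frame arrives
        if not ok:
            raise RuntimeError("no frame acquired from camera")
        return frame                       # BGR image as a numpy array
    finally:
        cap.release()
```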

ii. Skin Color Segmentation

Skin color segmentation discriminates between skin and non-skin pixels. It permits face detection to focus on reduced skin areas rather than the entire image frame, and has proved instrumental because regions containing skin can be located quickly by examining pixels with skin color. Before performing skin color segmentation, we first increase the contrast of the image, which separates the brightest and darkest image areas. Further, the largest continuous region is spotted to identify a potential candidate face. The probability of this largest connected area being a face is then determined and used in the subsequent steps of emotion detection.

iii. Face Detection

Face detection can be considered a specific case of object-class detection, which discovers the size and location of objects of a given class within an input image. In this step, the face is detected by identifying facial features while ignoring all non-face elements. To perform face detection, the image is first converted into binary form and scanned for the forehead. We then search for the maximum width of continuous white pixels until we reach the eyebrows. Finally, we cut the face such that its height is 1.5 times its width.

iv. Eye Detection

For eye detection too, we convert the RGB image to a binary image and then scan the image from W/4 (where W is the image width) across the remaining width to find the middle position between the eyes. Next, we determine the upper position of the two eyebrows. For the left eye, the search ranges from W/8 to mid (the middle position between the eyes), and for the right eye from mid to W - W/8. We then place continuous black pixels vertically in order to connect each eye with its eyebrow: for the left eye, vertical pixel lines are placed between mid/4 and mid/2, and for the right eye between mid + (W-mid)/4 and mid + 3*(W-mid)/4. We further search black pixel lines horizontally from the mid position towards the start so as to locate the right side of the left eye. To detect the left side of the right eye, we scan from the start to the mid position of black pixels within the upper and lower positions of the right eye. The left side of the left eye is simply the starting width of the image, and the right side of the right eye is the ending width. Using this, we cut the left side, the right side, and the upper and lower positions of each eye from the original RGB image.

v. Lip Detection

Another important feature to be detected for analyzing emotions is the lip. It helps us to know whether the shape of the lips is plain, slightly curved, broadly curved, pouted, or parted. This module outputs a lip box whose dimensions contain the entire lip along with some part of the nose.

vi. Longest Binary Pattern

It is extremely important that the area of interest be made smooth so as to detect the exact emotions. We therefore convert the skin pixels in that box to white pixels and the rest to black. Then, to find a particular region, say the lip, we find the biggest connected region of monotone color, here the longest sequence of black pixels. This results in the longest binary pattern.
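
A sketch of this step: find the largest connected region of black pixels in a binary region of interest. The 4-connected BFS traversal below is an assumption, since the paper does not specify how the region is traced.

```python
from collections import deque
import numpy as np

def largest_black_region(binary: np.ndarray) -> np.ndarray:
    """binary: 2-D uint8 array, 0 = black (feature), 255 = white (skin)."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    best: list = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] != 0 or seen[sy, sx]:
                continue
            region, queue = [], deque([(sy, sx)])
            seen[sy, sx] = True
            while queue:                       # flood-fill one black region
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if 0 <= ny < h and 0 <= nx < w and \
                       binary[ny, nx] == 0 and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):        # keep the biggest region
                best = region
    mask = np.full((h, w), 255, dtype=np.uint8)
    for y, x in best:
        mask[y, x] = 0                         # only the pattern survives
    return mask
```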

vii. Bezier Curve Algorithm

The most notable applications of Bezier curves include interpolation, approximation, curve fitting, and object representation. The aim of the algorithm is to find points midway between two nearby points and to repeat this until no iterations remain; the new point values trace out a curve. The well-known Bezier equation is the exact formulation of this idea:

B(t) = sum_{i=0}^{n} [n! / (i! (n-i)!)] (1-t)^(n-i) t^i P_i

Cubic Bezier:

B(t) = (1-t)^3 P1 + 3(1-t)^2 t P2 + 3(1-t) t^2 P3 + t^3 P4
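
The general Bernstein form above translates directly into code; for four control points (n = 3) it reduces to the cubic formula just given:

```python
from math import comb
import numpy as np

def bezier_point(control_points, t: float) -> np.ndarray:
    """Evaluate B(t) = sum_i C(n,i) (1-t)^(n-i) t^i P_i."""
    n = len(control_points) - 1
    point = np.zeros_like(np.asarray(control_points[0], dtype=float))
    for i, p in enumerate(control_points):
        point += comb(n, i) * (1 - t) ** (n - i) * t ** i * np.asarray(p, float)
    return point

# Example: sample a cubic curve defined by four control points.
pts = [[0, 0], [1, 2], [3, 2], [4, 0]]
curve = [bezier_point(pts, t) for t in np.linspace(0.0, 1.0, 20)]
```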

105
International Journal of Emerging Science and Engineering (IJESE)
ISSN: 2319–6378, Volume-1, Issue-7, May 2013

viii. Emotion Detection

The next step after detecting the features is to recognize the emotion. For this, we convert all the Bezier curves into larger regions. These static curves are then compared with the values already present in the database, the nearest matching pattern is picked, and the related emotion is given as output. If the input does not match the entries in the database, the average height for each emotion in the database is computed and a decision is made on that basis.

ix. Database

The database contains existing values that are later used in comparisons to find the closest matching emotion to be displayed as output.

x. Output Display

Once the comparison is done, the output contains a graphical form as well as an animated figure describing the emotion of the portrayed input.

III. KNOWLEDGE BASE

Our database has two relations. The first relation, "Input", contains the name of the poser and their index for each of the 7 kinds of emotion; the indexed values are stored in the second relation, "Orientation". In the "Orientation" relation, for each index, there are 6 control points for the lip Bezier curve, 6 control points each for the left-eye and right-eye Bezier curves, the lip height and width, and the left-eye and right-eye heights and widths. Applying this approach, the system understands and maps the emotion of the captured input.
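
One plausible rendering of these two relations as an in-memory SQLite schema is sketched below. The column names are assumptions; the paper lists the stored quantities but not an exact schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orientation (
    idx              INTEGER PRIMARY KEY,
    lip_cp           TEXT,   -- 6 control points of the lip Bezier curve
    left_eye_cp      TEXT,   -- 6 control points of the left-eye curve
    right_eye_cp     TEXT,   -- 6 control points of the right-eye curve
    lip_width        REAL,
    lip_height       REAL,
    left_eye_width   REAL,
    left_eye_height  REAL,
    right_eye_width  REAL,
    right_eye_height REAL
);
CREATE TABLE Input (
    poser_name       TEXT,
    emotion          TEXT,   -- one of the 7 emotion kinds
    orientation_idx  INTEGER REFERENCES Orientation(idx)
);
""")
```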

IV. SKIN COLOR SEGMENTATION

The human face comprises varied features, namely the eyes, nose, lips, forehead, hair, eyebrows, etc. For detecting emotions, some features such as the eyes and lips are required, while others such as the hair and ears are not. We need to divide the face into fragments that separate face and non-face elements. This is achieved using the image processing technique called image segmentation, which generates segments of adjacent pixels sharing similar visual characteristics and thus makes it easier to analyse the facial features.

These segments are then mapped, and the largest connected region whose pixels share similar characteristics, such as intensity or RGB value, is taken. If this region has a dimension ratio (height : width) between 1 and 2, it can be speculated that the region is a face. Explicit detection of facial features is done thereafter.
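
A sketch of this test follows. The RGB skin rule used here is a common heuristic, not taken from the paper; the height:width check (ratio between 1 and 2) is the paper's criterion. In a full implementation, a connected-component pass like the one in Section II would isolate the largest skin region before this check.

```python
import numpy as np

def skin_mask(rgb: np.ndarray) -> np.ndarray:
    """Boolean mask of likely skin pixels (standard RGB heuristic)."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b) & (np.abs(r - g) > 15)

def could_be_face(mask: np.ndarray) -> bool:
    """Apply the paper's dimension-ratio test to the masked region."""
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return False
    height = ys.max() - ys.min() + 1
    width  = xs.max() - xs.min() + 1
    return 1.0 <= height / width <= 2.0
```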

V. FACE DETECTION

Once segmentation is done, it becomes easier to analyze pixels having similar and different visual characteristics. The emotions depend on the shape of the eyebrows, the lip curves, eye movement, and so on, so explicit detection of such features is required. Prior to their detection, we must recognize a face frame so that we can generate its dimensions and work with them in the further mappings.

To detect such a frame, the first step is to convert the RGB image into a binary image containing a series of black and white pixels. The RGB values of all the pixels are taken, and an average RGB value is calculated for each pixel. If it is below 110, the corresponding pixel is substituted by a black pixel; if it is greater than 110, it is substituted by a white pixel. Thus a binary image is generated.

Now, we start scanning the binary image from the midpoint of the top edge of the image. Since we are bound to come across hair first, we get a series of continuous black pixels. The moment the forehead appears, the scan switches to white pixels, and this point is marked. Next, vertical scanning from the middle axis of the image towards both sides, left and right, is done to find the maximum width of our face frame. When we come across the eyebrows while scanning, we find black pixels; a vertical scan along the middle line contains no eyebrow portion, hence only white pixels are encountered there. We keep scanning towards both sides until the width drops to half of the previously mapped width, which marks the end of the eyebrow portion. Once we get the final width, we calculate the length of the frame as 1.5 times the width. From the starting point of the forehead marked earlier, the length of the frame is measured and a rectangular frame is created. This is the face region we need to work on from the full image.
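
A condensed sketch of this scan: binarize at the paper's average-RGB threshold of 110, find the forehead point on the middle column, then take the face height as 1.5 times the measured width. The width measurement here is simplified to the longest white run through the forehead row; the paper refines it by scanning until the width halves at the eyebrows.

```python
import numpy as np

def binarize(rgb: np.ndarray) -> np.ndarray:
    avg = rgb.mean(axis=2)                    # average of R, G, B per pixel
    return np.where(avg < 110, 0, 255).astype(np.uint8)

def face_frame(rgb: np.ndarray):
    binary = binarize(rgb)
    h, w = binary.shape
    col = binary[:, w // 2]
    forehead = int(np.argmax(col == 255))     # first white pixel below the hair
    row = binary[forehead, :] == 255
    best = cur = best_start = 0
    for x in range(w):                        # longest run of white pixels
        cur = cur + 1 if row[x] else 0
        if cur > best:
            best, best_start = cur, x - cur + 1
    width  = best
    height = int(1.5 * width)                 # paper: frame height = 1.5 * width
    return best_start, forehead, width, height   # x, y, w, h of the face frame
```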


VI. EYE DETECTION

We now have the rectangular frame with width w and length l = 1.5w. We know that the face exhibits symmetry; hence, finding the middle (central) axis is of utmost importance. We cannot simply place the axis at w/2, because we need to avoid any discrepancy in our detection. Hence, we scan the frame from w/4 to w - w/4. As mentioned in the Face Detection module, the central part of the face gives a series of continuous white pixels; whenever such a series is obtained (around w/2), we mark it as the central axis (mid).

We proceed by finding the upper starting point of both eyebrows. This is done by finding black pixels while scanning vertically from w/8 to mid for the left eyebrow and from mid to w - w/8 for the right eyebrow.

Eyes and eyebrows, when combined, point to the exact eye movement. Hence, while detecting an eye, the eyebrow and eye should be kept connected. This is done by placing black pixels in the white space between an eye and its eyebrow. Since this space may differ between faces, we pick a standard function of the frame width and the central axis: for the left eye, we place the black pixels from mid/4 to mid/2, and for the right eye from mid + (w-mid)/4 to mid + 3*(w-mid)/4. The height over which the pixels are placed runs from the position of the eyebrow to (h - position of eyebrow)/4, where h stands for the height of the image.

Next, we find the lower positions of both eyes. Again, this is done by searching for black pixels vertically, scanning from mid/4 to mid - mid/4 for the left eye and, symmetrically, from the lower end of the image to the starting point of the eyebrow between mid + (w-mid)/4 and mid + 3*(w-mid)/4 for the right eye.

Thereafter, the end points, i.e. the left side of the right eye and the right side of the left eye, are plotted. This is achieved by searching black pixels horizontally from the middle point to the starting point of black pixels between the upper and lower positions of the left eye, and likewise from the middle point to the starting point of black pixels between the upper and lower positions of the right eye.

Thus, we get a rectangular frame comprising the eye and eyebrow features. The frame starts at the left side of the left eye and ends at the right side of the right eye, and the corresponding area is cut from the RGB image. In this way, the eyes are detected in a formulated manner that does not suffer from dimensional issues across different types of faces.
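
The scan bounds above are pure bookkeeping, so they can be written out directly. The variable names below are mine, not the paper's; the pixel scans themselves follow the same pattern as the face-frame scan.

```python
def eye_scan_bounds(w: int, mid: int) -> dict:
    """Column ranges used by the eye-detection scans, given the frame
    width w and the central axis mid."""
    return {
        # vertical search ranges for the upper eyebrow points:
        "left_eyebrow_search":  (w // 8, mid),
        "right_eyebrow_search": (mid, w - w // 8),
        # columns blacked out to keep each eye joined to its eyebrow:
        "left_connect_cols":  (mid // 4, mid // 2),
        "right_connect_cols": (mid + (w - mid) // 4,
                               mid + 3 * (w - mid) // 4),
    }
```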

VII. LIP DETECTION

Just as for eye detection, a similar frame is to be carved out for the lips. The shape of the lips, whether plain, slightly curved, broadly curved, pouted, or parted (revealing teeth), is one of the important features for detecting emotion. Happiness, anger, excitement, sadness, concealment, nervousness, and the like can be judged if the movement of the lips is known.

Thus, we need to determine a lip box with dimensions such that the entire lip is contained in the box itself. Again, this must be purely dimensional, custom-made for each face, and hence it becomes a function of the distances between the different features.

Firstly, we determine the distance between the forehead and the eyes, and add this distance to the lower point of the eye; these dimensions rely on the symmetry that the face exhibits. This final sum gives us the upper edge of the lip box. The lower edge of the box is the lower end of the face. The width is calculated as the distance between the 1/4 position of the left eye box and the 3/4 position of the right eye box.

The rectangular lip box thus obtained contains the entire lip and may contain some part of the nose.
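
The lip-box arithmetic, as a direct sketch. Inputs are pixel coordinates already recovered by the earlier stages; the parameter names are assumptions.

```python
def lip_box(forehead_y: int, eye_top_y: int, eye_bottom_y: int,
            face_bottom_y: int, left_eye_x0: int, left_eye_x1: int,
            right_eye_x0: int, right_eye_x1: int):
    """Return (left, top, right, bottom) of the lip box."""
    forehead_to_eye = eye_top_y - forehead_y
    top    = eye_bottom_y + forehead_to_eye   # lower eye point + that distance
    bottom = face_bottom_y                    # lower end of the face
    # width: from 1/4 into the left eye box to 3/4 into the right eye box
    left  = left_eye_x0 + (left_eye_x1 - left_eye_x0) // 4
    right = right_eye_x0 + 3 * (right_eye_x1 - right_eye_x0) // 4
    return left, top, right, bottom
```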

VIII. APPLY BEZIER CURVE ON LIP

The rectangular lip box contains the lip, which is arguably the most varied feature of the face; it may have variations in tone and evenness. It is of utmost importance that the lip area be made sufficiently smooth so as to detect the exact movements of the lips and the trenches they make in the face.

We convert the skin pixels in the lip box to white pixels and the rest to black pixels. The tone of the skin, as we know, changes unevenly in this area (upper lip, chin, dents in the cheek, etc.); hence, pixels that are similar to skin pixels are also converted to white. If the average RGB values of two pixels differ by at most 10, we consider them similar pixels.

While finding similar pixels, however, 10 cannot always be the standard comparison value, since the RGB values vary with the quality of the image. To tackle this issue, we first use a histogram to find the lowest and the highest RGB values. If their difference is less than 70, the quality of the image is high, and to judge similar pixels we consider 7 as the standard for comparison instead of 10. However, if the histogram gives a difference between the lowest and highest RGB values greater than 70, then, owing to the low quality of the image, we consider 10 as the comparison factor.
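
This adaptive threshold translates into a few lines; the per-pixel average RGB value stands in for the histogram extremes described above.

```python
import numpy as np

def similarity_threshold(rgb: np.ndarray) -> int:
    """7 when the image's RGB spread is under 70 (high quality), else 10."""
    avg = rgb.mean(axis=2)                  # per-pixel average RGB value
    spread = float(avg.max() - avg.min())   # histogram extremes
    return 7 if spread < 70 else 10

def similar(p1: np.ndarray, p2: np.ndarray, thresh: int) -> bool:
    """Two pixels count as skin-alike if their average RGB values differ
    by at most the chosen threshold."""
    return abs(float(p1.mean()) - float(p2.mean())) <= thresh
```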

To detect just the lip, we find the biggest connected region. This can be cross-checked using the fact that, in our rectangular lip box, the lip forms the largest area of monotone color that differs from the skin.

Thereafter, we apply a Bezier curve to the black pixelated lip region obtained in the above steps. To apply it, we find the start and end points of the lip in the horizontal direction by scanning for continuous black pixels and stopping when a white pixel is encountered.

We proceed by drawing a tangent from each end pixel of the lip, for the upper and lower lip respectively. We then find 2 points on each tangent that do not lie on the lip, and gradually adjust the tangents and the 4 points on them (2 from the upper-lip tangent and 2 from the lower-lip tangent) to give an outline curve of the lip area. Explicit use of cubic Bezier curves is made in this process.

For instance, if the points are P1, P2, P3, P4, then the explicit form of the curve is given by:

B(t) = (1-t)^3 P1 + 3(1-t)^2 t P2 + 3(1-t) t^2 P3 + t^3 P4

From these 4 distinct points, we can create a cubic Bezier curve that passes through all 4 points in order and gives us the outline curve for mapping our values to the movement of the lips.

Figure 2. Bezier Curve on Lips.
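
The paper states that the cubic curve should pass through the four marked points. One standard way to achieve that (an assumption here; the paper does not give the fitting math) is to treat the points as on-curve at t = 0, 1/3, 2/3, 1 and solve for the two inner control points:

```python
import numpy as np

def cubic_through(q0, q1, q2, q3):
    """Control points c0..c3 of a cubic Bezier interpolating the points
    q0..q3 at parameters t = 0, 1/3, 2/3, 1."""
    q0, q1, q2, q3 = (np.asarray(q, float) for q in (q0, q1, q2, q3))
    c0, c3 = q0, q3                     # endpoints are on-curve by definition
    # From B(1/3) = q1 and B(2/3) = q2 (multiplied through by 27):
    #   12*c1 +  6*c2 = 27*q1 - 8*c0 -   c3
    #    6*c1 + 12*c2 = 27*q2 -   c0 - 8*c3
    a   = np.array([[12.0, 6.0], [6.0, 12.0]])
    rhs = np.stack([27 * q1 - 8 * c0 - c3,
                    27 * q2 - c0 - 8 * c3])
    c1, c2 = np.linalg.solve(a, rhs)    # solve the 2x2 system per coordinate
    return c0, c1, c2, c3
```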

IX. APPLY BEZIER CURVE ON EYES

Emotions like anger or sadness can be judged directly from the size of the eyes. Wide eyes might point to surprise or anger, while deep-set eyes may point to sadness. Hence, an outline curve of the eyes also needs to be computed.

While applying a Bezier curve to the eyes, we need to remove the eyebrows from the eye region. To do so, we search the binary image of the detected eyes for continuous black pixels, then continuous white pixels, followed again by continuous black pixels. By removing the first series of black pixels, we eliminate the eyebrows.

Similar to what we did when applying the Bezier curve to the lips, we find the largest connected region to segregate the eyes and eliminate skin and skin-like pixels owing to the monotone effect. Once again, we mark the end pixels and obtain four points on the tangents so that, for each eye, we can represent the area by a cubic Bezier curve.

Figure 3. Bezier Curve on Eye.
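
A sketch of the eyebrow-removal step: scanning each column top-down, blank out the first run of black pixels (the eyebrow) and keep the second (the eye). Per-column processing is an assumption; the paper only describes the black/white/black pattern.

```python
import numpy as np

def remove_eyebrow(binary: np.ndarray) -> np.ndarray:
    """binary: 2-D uint8 eye image, 0 = black feature pixel, 255 = white."""
    out = binary.copy()
    h, w = out.shape
    for x in range(w):
        y = 0
        while y < h and out[y, x] != 0:   # skip any leading white pixels
            y += 1
        while y < h and out[y, x] == 0:   # first black run = the eyebrow
            out[y, x] = 255               # erase it; the eye run remains
            y += 1
    return out
```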


X. EMOTION DETECTION

Once all the feature detection is done, all that remains is comparison. We convert all the Bezier curves (left eye, right eye, lips) into larger areas with a width of 100 units and a height proportional to the width (h = 1.5w). These static curves are then compared with the existing values in the database. Depending on the comparison, the nearest matching pattern is picked and the corresponding emotion is given as output.

If the input doesn't match the entries in the database, the average height for each emotion in the database is calculated and a decision is generated on the basis of these average heights; the closest matching emotion is then given as output.

The output after comparison with the database is of graphical form as well as an animated figure depicting the emotion.

Figure 4. Facial Feature Detection and Emotion Recognition
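
A sketch of the matching step: normalize each curve to the fixed 100-unit width described above, then pick the database entry at the smallest distance. The distance metric and the feature-vector layout are assumptions; the paper specifies only "nearest matching pattern".

```python
import numpy as np

def match_emotion(features: np.ndarray, database: dict) -> str:
    """features: concatenated curve samples for left eye, right eye, lips.
    database: emotion name -> stored feature vector of the same length."""
    best_emotion, best_dist = None, float("inf")
    for emotion, stored in database.items():
        dist = float(np.linalg.norm(features - np.asarray(stored, float)))
        if dist < best_dist:
            best_emotion, best_dist = emotion, dist
    return best_emotion
```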

XI. CONCLUSION AND FUTURE WORK

In this paper we have proposed a system that automatically detects human emotions on the basis of facial expressions. The techniques followed here build on the best of other approaches to face recognition, thus limiting the latency of the response. Moreover, the system takes a further step by improving the emotion detection technique through a cubic Bezier curve implementation, which is more adaptive and of value in various other fields such as robotics, computer graphics, automation, and animation. The system works well for faces with different shapes, complexions, and skin tones, and senses the basic six emotional expressions.

The system also handles face rotations about the x-axis and fetches accurate results for horizontal rotations; however, this flexibility does not extend to vertical rotations of the image. It is also unable to discover compound or mixed emotions. The system accurately detects the emotions for a single face, but this correctness reduces prominently for multiple faces; the system can be further upgraded to include multiple face detection.

This would indeed be a big step in the areas of Human-Computer Interaction, image processing, animation, automation, and robotics.

About the authors:

1) Prof. P. M. Chavan, Professor (Department of Computer Technology and Information Technology), Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019.

2) Mr. Manan C. Jadhav, Student, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019.

3) Ms. Aditi K. Nehete, Student, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019.

4) Ms. Jinal B. Mashruwala, Student, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019.

5) Ms. Pooja A. Panjari, Student, Veermata Jijabai Technological Institute, Matunga, Mumbai-400 019.
