HAND GESTURE RECOGNITION
SYSTEM
FINAL YEAR PROJECT REPORT
AFNAN UR REHMAN (P11-6053)
HASEEB ANSER IQBAL (p11-6106)
ANWAAR UL HAQ (p11-6001)
SESSION 2011-2015
SUPERVISED BY
Dr. NAVEED ISLAM
DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF COMPUTER & EMERGING SCIENCES,
PESHAWAR CAMPUS
(MAY 2015)
STUDENT’S DECLARATION
We declare that this project entitled “HAND GESTURE RECOGNITION SYSTEM”,
submitted as a requirement for the award of the BS (CS) degree, does not contain any
material previously submitted for a degree in any university; and that to the best of our
knowledge it does not contain any materials previously published or written by another
person except where due reference is made in the text.
AFNAN UR REHMAN ________________________
HASEEB ANSER IQBAL ________________________
ANWAAR UL HAQ ________________________
HAND GESTURE RECOGNITION SYSTEM
THE DEPARTMENT OF COMPUTER SCIENCE, NATIONAL UNIVERSITY OF
COMPUTER & EMERGING SCIENCES, ACCEPTS THIS THESIS SUBMITTED BY
AFNAN UR REHMAN, HASEEB ANSER IQBAL AND ANWAAR UL HAQ IN ITS
PRESENT FORM AS SATISFYING THE DISSERTATION REQUIREMENTS
FOR THE AWARD OF A BACHELOR'S DEGREE IN COMPUTER SCIENCE.
SUPERVISOR
Dr. NAVEED ISLAM
ASSISTANT PROFESSOR ________________________
FYP COORDINATOR
Mr. SHAKIR ULLAH
ASSISTANT PROFESSOR ________________________
HEAD OF DEPARTMENT
FAZL-E-BASIT
ASSISTANT PROFESSOR ________________________
DATED:
DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF COMPUTER & EMERGING SCIENCES,
PESHAWAR CAMPUS
ACKNOWLEDGEMENT
Through this acknowledgment, we express our sincere gratitude to all those people
who have been associated with this project, helped us with it, and made it a
worthwhile experience.
Firstly, we extend our thanks to the Final Year Project coordinator, who arranged
and managed all the presentations and handled all of our problems effectively and
with understanding. Without his management skills we might have faced a lot of
problems.
Secondly, we would like to thank the Final Year Project committee, who attended each
and every presentation, listened to our project-related problems, and offered solutions
and opinions. They raised pointed questions about the limitations of our system in
different phases and advised us to use better and more effective techniques where we
could. It was due to their judgment that we improved our project to overcome those
limitations, so they were crucial to this project.
Lastly, we would like to take this opportunity to express a deep sense of gratitude
to our Final Year Project supervisor for his cordial support, exemplary guidance,
monitoring and constant encouragement. Whenever we needed his help, he was there
to help us.
We are obliged to our batch fellows and parents for their valuable guidance and
cooperation during the period of this task. Their blessings, help and guidance were
a deep inspiration to us.
ABSTRACT
We propose a method for real-time hand gesture recognition and feature extraction
using a web camera. In this approach, the image is captured through a webcam attached
to the system. First the input image is preprocessed: thresholding is used to remove
noise and smooth the image. After this, region filling is applied to fill holes in the
gesture (the object of interest), which helps in the classification and recognition
step. Then the biggest blob (biggest binary linked object) in the image is selected and
all small objects are removed; this eliminates extra unwanted objects and noise from the
image. When preprocessing is complete, the image is passed to the feature extraction
phase. For feature extraction, Hu moments are used because of their distinct properties
of rotation, scale and translation invariance. The extracted features are normalized and
matched against the training dataset features using the KNN (k-nearest neighbor)
algorithm. Euclidean distance is used in KNN to calculate the distance and find the
nearest neighbor. The test image is classified into its nearest neighbor's class in the
training set. The classification results are displayed to the user and, through the
Windows text-to-speech API, the gesture is translated into speech as well. The training
dataset contains 5 gestures, each with 50 variations captured under different lighting
conditions. The purpose of this is to improve the accuracy of classification.
Keywords
Hand gestures, gesture recognition, contours, Hu invariant moments, sign language
recognition, Matlab, k-nearest neighbor classifier, human-computer interface, text-to-speech
conversion and machine learning.
Disclaimer
This report is submitted as a partial requirement for the Bachelor's degree in Computer
Science at FAST NU Peshawar. It is substantially the result of Afnan Ur Rehman, Anwaar
Ul Haq and Haseeb Anser Iqbal's own work except where explicitly indicated in the text.
The report will be distributed to the FYP supervisor and FYP coordinator for
examination, but thereafter may not be copied or distributed.
Table of Contents
1 Introduction
2 Background
   Literature
   Image sensing
3 Method
   Proposed Method
   Steps chart
   Flow chart
4 Image Acquisition
5 Preprocessing
   Flow chart of steps
   RGB to Grayscale
   Binarize
   Grayscale filtering using value
   Noise removal and smoothing
   Remove small objects other than hand
   Region filling
   Canny edge detection (Additional step)
6 Hand Detection
7 Hand cropping
8 Feature extraction
9 Hand Gesture Training (Machine learning)
   Machine Learning
   Training Dataset
   Feature Extraction
   Normalization
   Inter class difference
10 Classification
11 Text to speech
12 UML Diagrams
   Use Case Diagram
   Sequence Diagram
   Flow Diagram
13 Conclusion
   Future work
   Potential applications
14 Project poster
15 References
16 Turnitin Originality Report
1 Introduction
Hands are the human organs used to manipulate physical objects, and for this very
reason they are also what human beings use most frequently to communicate and interact
with machines. The mouse and keyboard are the basic input devices for computers, and
using both of them requires the hands. The most immediate information exchange between
man and machine is through visual and aural channels, but this communication is largely
one-sided: computers of this age can present humans with 1024 × 768 pixels at a rate of
15 frames per second, while a good typist can type only about 60 words per minute, with
each word containing on average 6 letters. The mouse remedies this imbalance somewhat,
but it has its limitations as well. Although hands are most commonly used for day-to-day
physical manipulation tasks, in some cases they are also used for communication. Hand
gestures support our daily communications and help convey our messages clearly. Hands
are most important for mute and deaf people, who depend on their hands and gestures to
communicate, so hand gestures are vital for communication in sign language.
If computers had the ability to understand and translate hand gestures, it would be a
leap forward in the field of human-computer interaction. The difficulty is that images
are information-rich, so achieving this task requires extensive processing. Every
gesture has some distinct features which differentiate it from other gestures; Hu
invariant moments are used to extract these features, and the gestures are then
classified using the KNN algorithm. Real-life applications of gesture-based
human-computer interaction include interacting with virtual objects, controlling robots,
translating body and sign language, and controlling machines using gestures.
2 Background
Literature
Several methods have been proposed for both dynamic and static hand gestures. [1] Pujan
Ziaie proposed a technique of first computing the similarity of different gestures and
then assigning probabilities to them using the Bayesian inference rule. Invariant
classes were estimated using a modification of KNN (k-nearest neighbor). These classes
consist of Hu moments, whose geometrical invariance to rotation, translation and scale
makes them suitable features for classification. This technique performed very well,
giving 95% accurate results. [2] Pujan Ziaie also proposed a similar technique which
uses Hu moments along with a modified KNN algorithm for classification, called a
Locally Weighted Naive Bayes Classifier. Classification results of this technique were
93% accurate under different lighting conditions with different users. [3] Rajat
Shrivastava proposed a method in which he used Hu moments and hand orientation for
feature extraction. The Baum-Welch algorithm was used for recognition. The method has an
accuracy of 90%. [4] The technique proposed by Neha S. Chourasia, Kanchan Dhote and
Supratim Saha used a hybrid feature descriptor combining Hu invariant moments and SURF.
They used KNN (k-nearest neighbors) and SVM for classification and achieved 96%
accuracy. [5] Joyeeta Singha proposed a hand gesture recognition system based on the
K-L transform. This system consisted of five steps: skin filtering (image acquisition,
converting RGB to HSV, filtering the image, smoothing, binarizing, finding the biggest
BLOB), palm cropping, hand edge detection using the Canny edge detector, feature
extraction using the K-L transform, and classification. [6] Hunter proposed a system
that uses Zernike moments to extract image features and a Hidden Markov Model for
recognition. [7] Raheja proposed a technique that scanned the image in all directions to
find the edges of the fingertips. [8] Segan proposed a technique that used edges for
feature extraction, which reduces time complexity and also helps remove noise.
Image sensing
An image is a two-dimensional function f(x, y), where x and y are spatial coordinates,
and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray
level of the image at that point.
Image creation is based on two main factors: the illumination source, and the reflection
or absorption of energy by the object being imaged. The illumination source can be
electromagnetic energy such as infrared or X-rays, or sources like ultrasound, sunlight
or a computer-generated illumination pattern. In some cases the transmitted or reflected
energy is focused onto a photo converter, which converts the energy into visible light.
An arrangement of sensors is then used to convert the energy into digital images. The
incoming energy is converted into a voltage by means of input electrical power and a
sensor material responsive to the particular type of energy being detected. In response,
each sensor produces an output waveform, from which a digital quantity is obtained. The
result is only an approximation of the real scene.
Cameras in computers usually include a lens and an image sensor, and may also include a
microphone to capture sound. Computer image sensors are of one of two available types:
CCD (charge-coupled device) or CMOS (complementary metal oxide semiconductor). Most
consumer web cameras provide VGA resolution at a rate of 30 frames per second, while
more modern devices are capable of multi-megapixel resolutions. In this project an
ordinary web camera is used to capture the scene.
3 Method
Proposed Method
In order to extract features and recognize a gesture, the following method is proposed:
1. A GUI allows the user to capture the scene. This phase is called image
acquisition.
2. After capturing the image, the next step is to detect the hand and separate it
from the scene, because only the hand gesture is needed for accurate classification. If
the hand is not separated from the scene, it will affect the accuracy of the system
while extracting and matching the features.
3. Crop the hand out of the scene.
4. Preprocessing steps, which are:
a. Convert RGB to grayscale.
b. Gray filtering using value.
c. Noise removal and smoothing.
d. Remove small objects other than the hand.
5. Feature extraction using Hu invariant moments.
6. Classification using the KNN algorithm, with the Euclidean distance formula for
calculating distances and a threshold for better results.
7. Translation (conversion) into speech.
The proposed method is given in Figure 3.1.
Steps chart:
Figure 3.1 Proposed steps
(Image acquisition → Hand detection → Crop hand → Preprocessing → Feature extraction → Classification → Gesture to speech)
Flow chart:
Figure 3.2 Proposed flow chart
(Detection: capture scene (image), preprocessing, hand detection, contour detection, feature extraction for gesture. Learning: training set of hand gestures, feature extraction. Recognition: feature matching, gesture recognition, conversion to speech.)
4 Image Acquisition
In this step a GUI is made which shows the video stream of the scene. When the capture
button is clicked, an image of the scene is taken. The problem is that this scene
includes the whole body and other unwanted objects as well. The figure below shows the
GUI-based front end of the system through which the user can capture the image:
Figure 4.1 System GUI
5 Preprocessing
Flow chart of steps:
Figure 5.1 Steps of preprocessing
RGB to Grayscale:
RGB stands for red, green and blue. It is a color system in which these three colors are
added in different quantities to produce other colors. Human vision can distinguish
between many different colors, intensities and shades, but when it comes to shades of
gray it can only distinguish approximately 100. It is evident from this fact that
colored images contain more information; converting to grayscale (typically a weighted
sum of the channels, about 0.299R + 0.587G + 0.114B) discards color information that the
later steps do not need.
Figure 5.2 This is RGB Image
Figure 5.3 This is a grayscale Image
Binarize
Binarization is a process which converts a gray-level image to a binary image. A
gray-level image has 256 levels (0 to 255), whereas a binary image has only two values,
0 and 1 (black and white).
Grayscale filtering using value
There are many different types of filters in the field of digital image processing, and
the gray-level filter is one of them. This filter works on the gray-level image. The aim
is to reduce noise in order to increase accuracy and get better results out of the
system. Here a threshold is used to filter out noise in the grayscale image. The
threshold used in this project was 75, which gave better results.
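As a concrete illustration, these first preprocessing steps are a few lines in MATLAB (a minimal sketch assuming the Image Processing Toolbox; the input file name is hypothetical and the cutoff of 75 follows the text above):

    img  = imread('gesture.jpg');   % hypothetical captured image
    gray = rgb2gray(img);           % RGB to grayscale
    gray(gray < 75) = 0;            % gray-level filtering with the project's threshold of 75
    bw   = im2bw(gray, 75/255);     % binarize: gray levels above the cutoff become white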
Figure 5.4 Grayscale image
Figure 5.5 Image after Grayscale filtering
Noise removal and smoothing
What is noise? Noise is a variation in an image: unwanted and undesired changes in the
color or brightness of the image. Noise needs to be removed, because it will affect the
results. If features extracted from a noisy image are used for classification, they will
be misleading and will produce bad results, so to avoid this the image is preprocessed
by removing the noise. This increases the accuracy of the system.
In the field of digital image processing, smoothing is used as a preprocessing step. It
is a process which applies different types of filters to the image. Smoothing gives an
approximation: the important portion or pattern in the image is preserved while the
noise is reduced significantly, hence improving the results. In the figure below there
is a small unwanted dot, which is noise and needs to be removed; if left in, this dot
would participate in the feature extraction process and could deviate the classification
of the image into a labeled class and give wrong results.
Figure 5.6 Image with Noise
Figure 5.7 Filtered image with noise being removed.
To remove noise from this image a 3x3 median filter is used. It creates a small window
of dimensions 3x3 and moves this window over the image pixel by pixel. At each position
it calculates the median of all the covered pixels and replaces the middle (current)
pixel with the median of its neighborhood. It also keeps edges sharp. The result of this
filter is evident in the figures above.
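This step is a single call in MATLAB, assuming the Image Processing Toolbox (a sketch; the report does not name the exact routine it used):

    smoothed = medfilt2(gray, [3 3]);   % slide a 3x3 window, replacing each pixel with its neighborhood median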
Remove small objects other than hand
In figure 5.7 it can be seen that the biggest object in the image is the hand. The
object of interest is the hand, not the other small objects or the noise acting as small
objects in the image. This biggest object, in this case the hand, is called the biggest
BLOB. In this step a threshold of 50 was used: all connected components with fewer than
50 pixels were removed. As a result only the biggest object, the hand, remains. This
step uses 8-connected neighbors.
Figure 5.8 Image before Applying BLOB
Figure 5.9 Image after removing small objects other than hand
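A minimal MATLAB sketch of this removal, assuming the Image Processing Toolbox; bwareaopen performs exactly this drop-small-components operation and defaults to 8-connectivity, though the report does not name the function it used:

    bw = bwareaopen(bw, 50);   % remove 8-connected components with fewer than 50 pixels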
Region filling
To improve accuracy, region filling is applied. This completes the hand portion where,
due to bad lighting conditions, an erroneous or incomplete image of the gesture was
captured. It fills the holes left in the gesture and improved the accuracy of the
project considerably.
X_k = (X_{k-1} ⊕ B) ∩ A^c,   k = 1, 2, 3, ...
Here X_0 is a point inside the hole, B is the structuring element, and A^c is the
complement of the image A. The algorithm repeatedly applies the equation above, which
involves a dilation operation, until X_k stops changing; at that stage the result is the
whole inside area of the shape, and its union is then taken with the original image.
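In MATLAB this morphological hole filling is available directly, assuming the Image Processing Toolbox (a sketch, not necessarily the project's exact call):

    bw = imfill(bw, 'holes');   % fill holes not reachable from the image border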
Canny edge detection (Additional step)
One additional step that can be performed is to extract the contours (edges) of the
hand. Edge detection is a technique which extracts the boundaries of an object in an
image, in this case the hand. It finds edges using the discontinuities in brightness in
the image. There are many edge detection algorithms, such as Sobel, Prewitt, fuzzy-logic
methods, Canny, and even eroding the image and subtracting the result from the original.
Canny is an algorithm designed to detect edges in the best possible way. What sets Canny
apart from the others? Canny uses a double threshold, one for strong edges and one for
weak edges, which lets it detect edges more reliably. Another major plus point of Canny
over simpler operators is that it uses the first derivative in the horizontal and
vertical directions and even diagonally, while simple operators respond mainly in one
direction, either horizontally or vertically.
Canny takes an image as input and outputs an image with the edges of the object, found
on the basis of discontinuities in brightness. First it applies a Gaussian convolution
to smooth the image. Then it applies derivatives, which produce ridges (mountain-top-like
profiles in the gradient magnitude); it then thresholds these so that everything that is
not an edge becomes 0 (black) and only the edges remain. In figures 5.10 and 5.11 the
effect of Canny and another algorithm can be seen, and it is understandable why Canny is
better.
Figure 5.10 Image after applying canny edge detection.
Figure 5.11 Image after applying Sobel edge detection.
It is evident from figures 5.10 and 5.11 that Canny is the better technique for edge
detection.
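Both detectors are available through MATLAB's edge function, assuming the Image Processing Toolbox; the [low high] double threshold below is illustrative, not the project's values:

    cannyEdges = edge(gray, 'canny', [0.1 0.3]);   % double-threshold Canny edge detection
    sobelEdges = edge(gray, 'sobel');              % Sobel, for comparison as in figures 5.10 and 5.11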
6 Hand Detection
First, the colored image captured in the image acquisition step is read. Once the image
is obtained, its dimensions are calculated. The number of color bands should be one, so
if the image is not grayscale it is converted to grayscale by taking only the green
channel. Next the biggest blobs are found. This technique yields two biggest blobs; the
first (largest) one is ignored, and the second biggest blob is taken to be the hand.
Boxes are drawn around the blobs and the second biggest blob is separated from the
image. The limitation of this technique is that the color of clothes and other objects
in the scene might affect it. It is demonstrated in the following figure.
Figure 6.1 hand detection
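A minimal MATLAB sketch of this blob ranking, assuming the Image Processing Toolbox; the file name is hypothetical and the Otsu threshold stands in for whatever binarization the project used:

    rgb   = imread('scene.jpg');            % hypothetical captured scene
    gray  = rgb(:, :, 2);                   % take only the green channel
    bw    = im2bw(gray, graythresh(gray));  % binarize (Otsu threshold as a stand-in)
    stats = regionprops(bw, 'Area', 'BoundingBox');
    [~, order] = sort([stats.Area], 'descend');
    handBox = stats(order(2)).BoundingBox;  % the second biggest blob is taken to be the hand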
7 Hand cropping
Once the portion of the hand is separated from the image, the hand is cropped out; a
certain threshold is used for this. In binarizing the image, a threshold value is chosen
which gives out only the portion of the image containing the hand, and the hand can then
be cropped out. This image of the hand is stored and passed to the next phase.
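Given the bounding box from the detection step, cropping and storing the hand is two calls in MATLAB (a sketch reusing the hypothetical handBox from the previous snippet):

    hand = imcrop(bw, handBox);   % crop the detected hand region
    imwrite(hand, 'hand.png');    % store it for the feature extraction phase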
8 Feature extraction
What are features? To understand this, consider a scenario. An image is acquired and the
user wants to classify it. Without features, the user would have to store a large number
of images, which takes a lot of space, and compare images pixel by pixel, which is
computationally expensive and also has a large space complexity. This is not a realistic
approach; both factors need to be reduced. Moreover, any rotation, translation or change
of position of the object (in this case the hand) would result in a bad classification
if that variation is not already present in the images being compared against. To avoid
this dilemma, feature extraction is used. Now, back to what features are: feature is a
term from the field of computer vision. A feature is a small piece of information, a
prominent and important detail. These details can be edges (contours) or objects.
There are various algorithms used for feature extraction, such as Zernike moments and
Fourier descriptors. In general, descriptors are a set of numbers produced to describe a
given shape. A few simple descriptors are:
 Area: The number of pixels in the shape.
 Perimeter: The number of pixels in the boundary of the shape.
 Elongation: Rotate a rectangle so that it is the smallest rectangle in which the
shape fits, then compare its height to its width.
 Rectangularity: How rectangular a shape is, i.e. how much of its minimal bounding
box the shape fills.
 Orientation: The overall direction of the shape.
Moments are common in statistics and physics. The basic statistical moments are:
1) Mean
2) Variance
3) Skew
4) Kurtosis
A moment of an image is a weighted average of the image's pixel intensities, usually
chosen to have some attractive property. Moments are useful for describing shapes in a
(binary) image after segmentation. Using image moments one can find simple properties of
an image such as area (intensity), centroid and the orientation of an object inside the
image.
Raw Moments
For an image with pixel intensities I(x, y), the raw moments are
M_pq = Σ_x Σ_y x^p y^q I(x, y).
Raw moments of a simple image include:
1) Sum of gray levels, or area in the case of a binary image: M00
2) Centroid: x̄ = M10/M00, ȳ = M01/M00
Central Moments
Central moments are taken about the centroid of the input digital image f(x, y):
μ_pq = Σ_x Σ_y (x − x̄)^p (y − ȳ)^q f(x, y),
where x̄ = M10/M00 and ȳ = M01/M00.
Scale Invariant Moments
Moments η_ij with i + j >= 2 can be constructed to be invariant to both translation and
changes in scale by dividing the corresponding central moment by a power of the 00th
moment:
η_ij = μ_ij / μ_00^(1 + (i + j)/2).
Rotation Invariant Moments (Hu set of invariant moments)
The Hu set of invariant moments is the most frequently used; these moments are invariant
under translation, rotation and scale.
The seven values I1 to I7 form the feature set stored as the descriptor for each image.
The usefulness of these moments in this application is that they make the extracted
features invariant to scale, translation and rotation. Hu moments are used in this
project. They are also called invariant statistical moments because they are not
affected by rotation, scaling and translation.
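The report does not list its implementation, so the following is a self-contained MATLAB sketch that computes the seven Hu invariants from normalized central moments, following the standard definitions given above:

    function phi = huMoments(bw)
    % Seven Hu invariant moments of a binary image.
    bw = double(bw);
    [rows, cols] = size(bw);
    [x, y] = meshgrid(1:cols, 1:rows);
    m00  = sum(bw(:));                  % area (M00)
    xbar = sum(x(:) .* bw(:)) / m00;    % centroid M10/M00
    ybar = sum(y(:) .* bw(:)) / m00;    % centroid M01/M00
    % Normalized central moment: eta_pq = mu_pq / mu_00^(1+(p+q)/2)
    eta = @(p, q) sum(((x(:) - xbar).^p) .* ((y(:) - ybar).^q) .* bw(:)) / m00^(1 + (p + q)/2);
    n20 = eta(2,0); n02 = eta(0,2); n11 = eta(1,1);
    n30 = eta(3,0); n03 = eta(0,3); n21 = eta(2,1); n12 = eta(1,2);
    phi = zeros(1, 7);
    phi(1) = n20 + n02;
    phi(2) = (n20 - n02)^2 + 4*n11^2;
    phi(3) = (n30 - 3*n12)^2 + (3*n21 - n03)^2;
    phi(4) = (n30 + n12)^2 + (n21 + n03)^2;
    phi(5) = (n30 - 3*n12)*(n30 + n12)*((n30 + n12)^2 - 3*(n21 + n03)^2) + ...
             (3*n21 - n03)*(n21 + n03)*(3*(n30 + n12)^2 - (n21 + n03)^2);
    phi(6) = (n20 - n02)*((n30 + n12)^2 - (n21 + n03)^2) + ...
             4*n11*(n30 + n12)*(n21 + n03);
    phi(7) = (3*n21 - n03)*(n30 + n12)*((n30 + n12)^2 - 3*(n21 + n03)^2) - ...
             (n30 - 3*n12)*(n21 + n03)*(3*(n30 + n12)^2 - (n21 + n03)^2);
    end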
9 Hand Gesture Training (Machine learning)
Machine Learning
Machine Learning involves two basic Steps:
 Collecting Training Set.
 Feature Extraction.
Figure 9.1 Machine learning
Training Dataset
A dataset with variations was captured for the training step. The training dataset
consists of 5 gestures, with 50 variations of each gesture, so that the system is
trained for greater accuracy across variations of the same gesture. This helps to
recognize a gesture under different conditions.
A few samples from the proposed dataset are:
Gesture 1:
First Variation Second Variation Third Variation
Figure 9.2 Punch gesture
Gesture 2:
First Variation Second Variation Third Variation
Figure 9.3 Left gesture
Gesture 3:
First Variation Second Variation Third Variation
Figure 9.4 Well done gesture
Gesture 4:
First Variation Second Variation Third Variation
Figure 9.5 Drop gesture
Gesture 5:
First Variation Second Variation Third Variation
Figure 9.6 Catch gesture
The following 5 gestures are included.
Figure 9.7 Gestures included in the system
Feature Extraction:
Feature extraction in the training step is the same as explained in chapter 8.
In the training/learning step, the features of each image are extracted using the Hu set
of invariant moments, and the result for each image of the training set is stored in a
file so that this work need not be done again during the classification step. The file
contains a matrix holding the descriptor values of each image from the training dataset
together with its class label. This saves time and makes classification robust, because
training is the most time-consuming operation.
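A sketch of this step in MATLAB, reusing the huMoments function from chapter 8; the folder layout and the classOf helper that maps a file name to its class label are hypothetical:

    files  = dir('train/*.png');                 % hypothetical training images
    F      = zeros(numel(files), 7);             % one 7-value Hu descriptor per image
    labels = zeros(numel(files), 1);
    for i = 1:numel(files)
        img       = imread(fullfile('train', files(i).name));
        bw        = im2bw(rgb2gray(img));        % preprocessing abbreviated for brevity
        F(i, :)   = huMoments(bw);
        labels(i) = classOf(files(i).name);      % hypothetical helper: file name -> class label
    end
    save('features.mat', 'F', 'labels');         % stored so training need not be redone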
Normalization:
In the stored feature matrix, each row represents one image and each column represents a
specific feature (attribute); one attribute does not depend on another. Therefore the
values of each column are normalized independently of the other columns. The maximum of
each column is stored in a file, to be used later in the classification step.
Each value in a particular attribute (feature) column is divided by the maximum value of
that attribute over the whole matrix; this is repeated for all records. This normalizes
the values so that the resulting values lie in the range 0 to 1. It vastly improves the
classification results and reduces bias, since each attribute then carries the same
weight in classification.
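In MATLAB this column-wise max normalization is a few lines (a sketch over the hypothetical F matrix from the previous snippet):

    colMax = max(F, [], 1);                  % per-column maxima, saved for the classification step
    Fnorm  = bsxfun(@rdivide, F, colMax);    % scale every column into the range [0, 1]
    save('colmax.mat', 'colMax');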
Inter class difference:
The average descriptor of each class is calculated from the matrix of descriptors. One
class is chosen and the distance between it and each of the other classes is calculated;
the same is done for all the classes and the results are stored. From these results
three values are found: maximum, minimum and median. These values can be used as a
threshold, depending on the desired strictness of classification. This is an adaptive
threshold whose purpose is to prevent under-fitting; the level of strictness controls
the level of under-fitting.
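A sketch of this inter-class computation in MATLAB, reusing the hypothetical Fnorm and labels from the previous snippets:

    classes = unique(labels);
    means   = zeros(numel(classes), size(Fnorm, 2));
    for c = 1:numel(classes)
        means(c, :) = mean(Fnorm(labels == classes(c), :), 1);   % average descriptor of each class
    end
    D = zeros(numel(classes));
    for i = 1:numel(classes)
        for j = 1:numel(classes)
            D(i, j) = norm(means(i, :) - means(j, :));           % Euclidean distance between class means
        end
    end
    d = D(D > 0);                                                % off-diagonal (inter-class) distances
    candidates = [min(d) median(d) max(d)];                      % candidate thresholds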
10 Classification
Classification involves two basic steps:
 Machine Learning
 Recognition
Machine Learning:
Recognition:
Figure 10.1 Classification steps
(Machine learning: training set images → features from invariant Hu moments → feature set. Recognition: test image → features → classification against the feature set → classified result.)
Recognition:
Recognition involves the following steps:
 First, the features of the test image are calculated using Hu moments.
 These features are compared with the training feature set.
 The algorithm used for classification is KNN (k-nearest neighbor).
 This algorithm uses distances to neighbors, and on the basis of these distances it
classifies the current record into one of the predefined classes.
 Euclidean distance is used for the comparison:
Euclidean distance((X, Y), (A, B)) = [(X − A)^2 + (Y − B)^2]^(1/2)
 The gesture is classified into the class with which it has the minimum distance.
 A value of K is selected, which is the number of neighbors taken into account for
every calculation.
 The value of K must be chosen carefully: if K is too small the classifier is
sensitive to noise, and if K is too large the neighbors may include points from other
classes. So a moderate value of K is selected.
 One limitation of this method is that it will always classify the input gesture into
at least one of the training classes with minimum distance, which can result in
incorrect classification. So a threshold is applied.
 After the distance is calculated, its value is compared with the threshold. If it
passes the threshold the gesture is classified; otherwise it is identified as a new
gesture. A sketch of this thresholded KNN follows.
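The following MATLAB sketch puts the bullet points together; Fnorm, labels, colMax and huMoments are the hypothetical pieces introduced earlier, and testBw, K and threshold are illustrative:

    f = huMoments(testBw) ./ colMax;                    % extract and normalize the test features
    dists = sqrt(sum(bsxfun(@minus, Fnorm, f).^2, 2));  % Euclidean distance to every training row
    [sorted, idx] = sort(dists);
    K = 5;                                              % illustrative neighbor count
    if sorted(1) > threshold                            % threshold chosen from the inter-class distances
        disp('New gesture: no training class within threshold');
    else
        predicted = mode(labels(idx(1:K)));             % majority vote among the K nearest neighbors
    end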
Test results:
Figure 10.2 Punch gesture test
Figure 10.3 Drop gesture test
Figure 10.4 Catch gesture test
Figure 10.5 Left gesture test
Figure 10.6 Well done gesture test
11 Text to speech
Once the gesture is recognized, the class of the gesture given at run time is obtained.
In the speech function, the available voice types are first enumerated and the first
available voice is picked by default. The user passes the text and the voice type as
parameters. The function then sets the speed of the speech; the speed (pace) of the
voice can range from -10 to 10, with a default of 0. After that it sets the sampling
rate of the speech. The function is based on the Microsoft Windows (Win32) Speech API.
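A minimal sketch of driving the Windows speech engine from MATLAB through the .NET wrapper around SAPI (Windows only; the report does not show its exact call):

    NET.addAssembly('System.Speech');                    % load the .NET speech assembly
    synth = System.Speech.Synthesis.SpeechSynthesizer;   % picks the first available voice by default
    synth.Rate = 0;                                      % speaking rate, valid range -10 to 10
    Speak(synth, 'Punch');                               % speak the recognized gesture's label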
12 UML Diagrams
Use Case Diagram
Figure 12.1 Use case diagram
Sequence Diagram
Figure 12.2 Sequence diagram
Flow Diagram
Figure 12.3 Flow diagram
13 Conclusion
Future work
There are some aspects of the project which can be improved in the future.
 Instead of a webcam, a better and more accurate acquisition device can be used, one
that even uses infrared for accuracy, e.g. Kinect.
 The mechanism for hand detection is not accurate.
 The Hu set of invariant moments is a very basic image descriptor and limits the
achievable accuracy. A better descriptor could give better results, but the
classification mechanism might have to change.
Potential applications
Image recognition concepts have vital applications in various fields, such as:
 Robotics.
 Artificial Intelligence.
 Controlling the Computer through hand gestures.
14 Project poster
The poster for this project was created using Adobe InDesign, a desktop publishing
application by Adobe that is well suited to poster design. The poster is of standard
size and uses vector graphics, so it can be zoomed to any level without pixelation.
Figure 14.1 Project poster in Adobe InDesign.
Figure 14.2 Project poster.
15 References
[1] Pujan Ziaie, Thomas Müller and Alois Knoll. A Novel Approach to Hand-Gesture
Recognition in a Human-Robot Dialog System. Robotics and Embedded Systems Group,
Department of Informatics, Technische Universität München.
[2] Pujan Ziaie and Alois Knoll. An Invariant-Based Approach to Static Hand-Gesture
Recognition. Technical University of Munich.
[3] Rajat Shrivastava. A Hidden Markov Model based Dynamic Hand Gesture Recognition
System using OpenCV. Dept. of Electronics and Communication Engineering, Maulana Azad
National Institute of Technology, Bhopal-462001, India.
[4] Neha S. Chourasia, Kanchan Dhote, Supratim Saha. Analysis on Hand Gesture Spotting
using Sign Language through Computer Interfacing. International Journal of Engineering
Science and Innovative Technology (IJESIT), Volume 3, Issue 3, May 2014.
[5] Joyeeta Singha, Karen Das. Hand Gesture Recognition Based on Karhunen-Loeve
Transform. Department of Electronics and Communication Engineering, Assam Don Bosco
University, Guwahati, Assam, India.
[6] Hunter, E. Posture estimation in reduced model gesture input systems. Proceedings of
the International Workshop on Automated Face and Gesture Recognition, June 1995.
[7] Chaudhary, A., Raheja, J. L., Das, K., Raheja, S. A Vision based Geometrical Method
to find Fingers Positions in Real Time Hand Gesture Recognition. Journal of Software,
Academy Publisher, Vol. 7, 2012.
[8] Segan, J. Controlling computers with gloveless gestures. In Virtual Reality Systems,
1993.
[9] Gastaldi, G. et al. A man-machine communication system based on the visual analysis
of dynamic gestures. International Conference on Image Processing, Genoa, Italy,
September 2005, pp. 397-400.
16 Turnitin Originality Report
HAND GESTURE RECOGNITION SYSTEM by Afnan Ur Rehman, Haseeb Ansar Iqbal,
Anwaar ul Haq
From HAND GESTURE RECOGNITION SYSTEM (Research)
 Processed on 30-Jun-2015 08:29 PKT
 ID: 553340821
 Word Count: 3906
Similarity Index: 10%
Similarity by Source: Internet Sources 8%, Publications 6%, Student Papers 6%
Sources:
1. 2% match (Internet from 11-Dec-2007): http://www.forbes.com/lists/2007/10/07billionaires_The-Worlds-Billionaires_NameHTML_36.html
2. 1% match (Internet from 12-Jul-2013): http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/MORSE/region-props-and-moments.pdf
3. 1% match (Internet from 12-Oct-2014): http://www.ijsret.org/pdf/120374.pdf
4. 1% match (publications): A. Musso. "Structural dynamic monitoring on Vega platform: an example of Industry and University collaboration", Proceedings of European Petroleum Conference EUROPEC, 10/1996
5. 1% match (student papers from 16-Dec-2014): Submitted to iGroup on 2014-12-16
6. 1% match (student papers from 03-Aug-2010): Submitted to Universiti Teknikal Malaysia Melaka on 2010-08-03
7. < 1% match (Internet from 01-Jul-2003): http://www.discovery.mala.bc.ca/web/bandalia/digital/work.htm
8. < 1% match (publications): Yeo, Hangu, Vadim Sheinin, Yuri Sheinin, and Benoit M. Dawant. Medical Imaging 2009 Image Processing, 2009.
9. < 1% match (student papers from 16-Dec-2013): Submitted to Universiti Malaysia Perlis on 2013-12-16
10. < 1% match (Internet from 05-Jun-2012): http://www.csjournals.com/IJCSC/PDF1-1/16.pdf
11. < 1% match (student papers from 27-Oct-2012): Submitted to VIT University on 2012-10-27
12. < 1% match (Internet from 30-Apr-2003): http://www.goodstaff.com/jobseekers/articles/sat/Sat14.html
13. < 1% match (Internet from 29-Jul-2010): http://ethesis.nitrkl.ac.in/1459/1/Removal_of_RVIN.pdf
14. < 1% match (Internet from 08-Oct-2013): http://www.lifesciencesite.com/lsj/life1009s/041_20339life1009s_289_296.pdf
15. < 1% match (publications): Henke, Daniel, Padhraic Smyth, Colene Haffke, and Gudrun Magnusdottir. "Automated analysis of the temporal behavior of the double Intertropical Convergence Zone over the east Pacific", Remote Sensing of Environment, 2012.
16. < 1% match (Internet from 07-Mar-2015): http://en.wikipedia.org/wiki/Image_moment
17. < 1% match (Internet from 05-Dec-2013): http://eventos.spc.org.pe/inns-iesnn/papers/Jimenez-Oliden-Huapaya-Cardenas-Neurocopter.pdf
18. < 1% match (Internet from 25-Dec-2014): http://ijcsn.org/IJCSN-2014/3-4/A-Fast-and-Robust-Hybridized-Filter-for-Image-De-Noising.pdf
19. < 1% match (Internet from 26-Nov-2002): http://aips2.nrao.edu/released/docs/user/Utility/node248.html
20. < 1% match (publications): Sungsik Huh. "A Vision-Based Automatic Landing Method for Fixed-Wing UAVs", Selected papers from the 2nd International Symposium on UAVs, Reno, Nevada, USA, June 8-10 2009, 2009
21. < 1% match (publications): Kong, Fan zhi, Xing zhou Zhang, Yi zhong Wang, Da wei Zhang, Jun lan Li, Shanhong Xia, Chih-Ming Ho, and Helmut Seidel. 2008 International Conference on Optical Instruments and Technology: MEMS/NEMS Technology and Applications, 2008.

More Related Content

What's hot

eye phone technology
eye phone technologyeye phone technology
eye phone technology
Naga Dinesh
 
Haptic technology ppt
Haptic technology pptHaptic technology ppt
Haptic technology ppt
Mohammad Sabouri
 
Human Activity Recognition in Android
Human Activity Recognition in AndroidHuman Activity Recognition in Android
Human Activity Recognition in Android
Surbhi Jain
 
Virtual keyboard seminar ppt
Virtual keyboard seminar pptVirtual keyboard seminar ppt
Virtual keyboard seminar ppt
Shruti Maheshwari
 
gesture-recognition
gesture-recognitiongesture-recognition
gesture-recognition
Venkat RAGHAVENDRA REDDY
 
SRS for online examination system
SRS for online examination systemSRS for online examination system
SRS for online examination system
lunarrain
 
Online doctor appointment
Online doctor appointmentOnline doctor appointment
Online doctor appointment
Amna Nawazish
 
Screenless Display PPT
Screenless Display PPTScreenless Display PPT
Screenless Display PPT
Vikas Kumar
 
20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging Technologies20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging Technologies
Seminar Links
 
Human activity recognition
Human activity recognitionHuman activity recognition
Human activity recognition
Randhir Gupta
 
Final Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image ProcessingFinal Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image Processing
Sabnam Pandey, MBA
 
Virtual Mouse
Virtual MouseVirtual Mouse
Virtual Mouse
Vivek Khutale
 
Gesture Recognition Technology-Seminar PPT
Gesture Recognition Technology-Seminar PPTGesture Recognition Technology-Seminar PPT
Gesture Recognition Technology-Seminar PPT
Suraj Rai
 
Haptic Technology ppt
Haptic Technology pptHaptic Technology ppt
Haptic Technology ppt
Arun Sivaraj
 
Face recognition attendance system
Face recognition attendance systemFace recognition attendance system
Face recognition attendance system
Naomi Kulkarni
 
FAKE NEWS DETECTION PPT
FAKE NEWS DETECTION PPT FAKE NEWS DETECTION PPT
FAKE NEWS DETECTION PPT
VaishaliSrigadhi
 
CSE Final Year Project Presentation on Android Application
CSE Final Year Project Presentation on Android ApplicationCSE Final Year Project Presentation on Android Application
CSE Final Year Project Presentation on Android Application
Ahammad Karim
 
Hand Gesture Recognition Using OpenCV Python
Hand Gesture Recognition Using OpenCV Python Hand Gesture Recognition Using OpenCV Python
Hand Gesture Recognition Using OpenCV Python
Arijit Mukherjee
 
Object and pose detection
Object and pose detectionObject and pose detection
Object and pose detection
AshwinBicholiya
 
Sensor Cloud
Sensor CloudSensor Cloud
Sensor Cloud
Debjyoti Ghosh
 

What's hot (20)

eye phone technology
eye phone technologyeye phone technology
eye phone technology
 
Haptic technology ppt
Haptic technology pptHaptic technology ppt
Haptic technology ppt
 
Human Activity Recognition in Android
Human Activity Recognition in AndroidHuman Activity Recognition in Android
Human Activity Recognition in Android
 
Virtual keyboard seminar ppt
Virtual keyboard seminar pptVirtual keyboard seminar ppt
Virtual keyboard seminar ppt
 
gesture-recognition
gesture-recognitiongesture-recognition
gesture-recognition
 
SRS for online examination system
SRS for online examination systemSRS for online examination system
SRS for online examination system
 
Online doctor appointment
Online doctor appointmentOnline doctor appointment
Online doctor appointment
 
Screenless Display PPT
Screenless Display PPTScreenless Display PPT
Screenless Display PPT
 
20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging Technologies20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging Technologies
 
Human activity recognition
Human activity recognitionHuman activity recognition
Human activity recognition
 
Final Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image ProcessingFinal Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image Processing
 
Virtual Mouse
Virtual MouseVirtual Mouse
Virtual Mouse
 
Gesture Recognition Technology-Seminar PPT
Gesture Recognition Technology-Seminar PPTGesture Recognition Technology-Seminar PPT
Gesture Recognition Technology-Seminar PPT
 
Haptic Technology ppt
Haptic Technology pptHaptic Technology ppt
Haptic Technology ppt
 
Face recognition attendance system
Face recognition attendance systemFace recognition attendance system
Face recognition attendance system
 
FAKE NEWS DETECTION PPT
FAKE NEWS DETECTION PPT FAKE NEWS DETECTION PPT
FAKE NEWS DETECTION PPT
 
CSE Final Year Project Presentation on Android Application
CSE Final Year Project Presentation on Android ApplicationCSE Final Year Project Presentation on Android Application
CSE Final Year Project Presentation on Android Application
 
Hand Gesture Recognition Using OpenCV Python
Hand Gesture Recognition Using OpenCV Python Hand Gesture Recognition Using OpenCV Python
Hand Gesture Recognition Using OpenCV Python
 
Object and pose detection
Object and pose detectionObject and pose detection
Object and pose detection
 
Sensor Cloud
Sensor CloudSensor Cloud
Sensor Cloud
 

Viewers also liked

Hand gesture recognition
Hand gesture recognitionHand gesture recognition
Hand gesture recognition
Muhammed M. Mekki
 
Gesture recognition
Gesture recognitionGesture recognition
Gesture recognition
PrachiWadekar
 
Gesture recognition adi
Gesture recognition adiGesture recognition adi
Gesture recognition adi
aditya verma
 
Gesture Recognition Technology
Gesture Recognition TechnologyGesture Recognition Technology
Gesture Recognition Technology
Nikith Kumar Reddy
 
Gesture recognition technology
Gesture recognition technology Gesture recognition technology
Gesture recognition technology
Nagamani Gurram
 
Hand Gesture Recognition Based on Shape Parameters
Hand Gesture Recognition Based on Shape ParametersHand Gesture Recognition Based on Shape Parameters
Hand Gesture Recognition Based on Shape Parameters
Nithinkumar P
 
GESTURE RECOGNITION TECHNOLOGY
GESTURE RECOGNITION TECHNOLOGYGESTURE RECOGNITION TECHNOLOGY
GESTURE RECOGNITION TECHNOLOGY
jinal thakrar
 
My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! 
My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! 
My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! 
Chris Gledhill
 
Speech recognition project report
Speech recognition project reportSpeech recognition project report
Speech recognition project report
Sarang Afle
 
Deaf Culture and Sign Language Writing System – a Database for a New Approac...
Deaf Culture and Sign Language Writing System – a Database for a New  Approac...Deaf Culture and Sign Language Writing System – a Database for a New  Approac...
Deaf Culture and Sign Language Writing System – a Database for a New Approac...
Jeferson Fernando Guardezi
 
Final year project on Remote Infrastructure Management
Final year project on Remote Infrastructure ManagementFinal year project on Remote Infrastructure Management
Final year project on Remote Infrastructure Management
jairaman
 
Real time gesture recognition
Real time gesture recognitionReal time gesture recognition
Real time gesture recognition
Jaison2636
 
Human machine interaction using Hand gesture recognition
Human machine interaction using Hand gesture recognitionHuman machine interaction using Hand gesture recognition
Human machine interaction using Hand gesture recognition
Manoj Harsule
 
Real time gesture recognition of human hand
Real time gesture recognition of human handReal time gesture recognition of human hand
Real time gesture recognition of human hand
Vishnu Kudumula
 
Hand Gesture recognition
Hand Gesture recognitionHand Gesture recognition
Hand Gesture recognition
Nimishan Sivaraj
 
Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks
Chiranjeevi Adi
 
Deaf and dumb
Deaf and dumbDeaf and dumb
Deaf and dumb
Mariam Khalid
 
Speech Recognition , Noise Filtering and Content Search Engine , Research Do...
Speech Recognition , Noise Filtering and  Content Search Engine , Research Do...Speech Recognition , Noise Filtering and  Content Search Engine , Research Do...
Speech Recognition , Noise Filtering and Content Search Engine , Research Do...
Gayan Kalanamith Mannapperuma
 
Deaf and Dump Gesture Recognition System
Deaf and Dump Gesture Recognition SystemDeaf and Dump Gesture Recognition System
Deaf and Dump Gesture Recognition System
Praveena T
 
Voice Recognition Service (VRS)
Voice Recognition Service (VRS)Voice Recognition Service (VRS)
Voice Recognition Service (VRS)
Shady A. Alefrangy
 

Viewers also liked (20)

Hand gesture recognition
Hand gesture recognitionHand gesture recognition
Hand gesture recognition
 
Gesture recognition
Gesture recognitionGesture recognition
Gesture recognition
 
Gesture recognition adi
Gesture recognition adiGesture recognition adi
Gesture recognition adi
 
Gesture Recognition Technology
Gesture Recognition TechnologyGesture Recognition Technology
Gesture Recognition Technology
 
Gesture recognition technology
Gesture recognition technology Gesture recognition technology
Gesture recognition technology
 
Hand Gesture Recognition Based on Shape Parameters
Hand Gesture Recognition Based on Shape ParametersHand Gesture Recognition Based on Shape Parameters
Hand Gesture Recognition Based on Shape Parameters
 
GESTURE RECOGNITION TECHNOLOGY
GESTURE RECOGNITION TECHNOLOGYGESTURE RECOGNITION TECHNOLOGY
GESTURE RECOGNITION TECHNOLOGY
 
My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! 
My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! 
My old 2002 Thesis on Hand Gesture Recognition using a Web Cam! 
 
Speech recognition project report
Speech recognition project reportSpeech recognition project report
Speech recognition project report
 
Deaf Culture and Sign Language Writing System – a Database for a New Approac...
Deaf Culture and Sign Language Writing System – a Database for a New  Approac...Deaf Culture and Sign Language Writing System – a Database for a New  Approac...
Deaf Culture and Sign Language Writing System – a Database for a New Approac...
 
Final year project on Remote Infrastructure Management
Final year project on Remote Infrastructure ManagementFinal year project on Remote Infrastructure Management
Final year project on Remote Infrastructure Management
 
Real time gesture recognition
Real time gesture recognitionReal time gesture recognition
Real time gesture recognition
 
Human machine interaction using Hand gesture recognition
Human machine interaction using Hand gesture recognitionHuman machine interaction using Hand gesture recognition
Human machine interaction using Hand gesture recognition
 
Real time gesture recognition of human hand
Real time gesture recognition of human handReal time gesture recognition of human hand
Real time gesture recognition of human hand
 
Hand Gesture recognition
Hand Gesture recognitionHand Gesture recognition
Hand Gesture recognition
 
Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks
 
Deaf and dumb
Deaf and dumbDeaf and dumb
Deaf and dumb
 
Speech Recognition , Noise Filtering and Content Search Engine , Research Do...
Speech Recognition , Noise Filtering and  Content Search Engine , Research Do...Speech Recognition , Noise Filtering and  Content Search Engine , Research Do...
Speech Recognition , Noise Filtering and Content Search Engine , Research Do...
 
Deaf and Dump Gesture Recognition System
Deaf and Dump Gesture Recognition SystemDeaf and Dump Gesture Recognition System
Deaf and Dump Gesture Recognition System
 
Voice Recognition Service (VRS)
Voice Recognition Service (VRS)Voice Recognition Service (VRS)
Voice Recognition Service (VRS)
 

Similar to Hand gesture recognition system(FYP REPORT)

Apeksha Resume -1-
Apeksha Resume -1-Apeksha Resume -1-
Apeksha Resume -1-
Apeksha Lokare
 
Synopsis of Facial Emotion Recognition to Emoji Conversion
Synopsis of Facial Emotion Recognition to Emoji ConversionSynopsis of Facial Emotion Recognition to Emoji Conversion
Synopsis of Facial Emotion Recognition to Emoji Conversion
IRJET Journal
 
VTU final year project report
VTU final year project reportVTU final year project report
VTU final year project report
athiathi3
 
Major File On web Development
Major File On web Development Major File On web Development
Major File On web Development
Love Kothari
 
Obj report
Obj reportObj report
Obj report
Manish Raghav
 
Preliminry report
 Preliminry report Preliminry report
Preliminry report
Jiten Ahuja
 
project sentiment analysis
project sentiment analysisproject sentiment analysis
project sentiment analysis
sneha penmetsa
 
A Facial Expression Recognition System A Project Report
all small objects; this removes extra unwanted objects and residual noise from the image. Once preprocessing is complete, the image is passed to the feature extraction phase. Hu invariant moments are used for feature extraction because of their distinct properties: invariance to rotation, scale and translation. The extracted features are normalized and matched against the training dataset features using the KNN (k-nearest neighbor) algorithm, with the Euclidean distance used to find the nearest neighbors. The test image is classified into its nearest neighbor's class in the training set. The classification result is displayed to the user and, through the Windows text-to-speech API, the gesture is translated into speech as well. The training dataset contains 5 gestures, each with 50 variations captured under different lighting conditions, which improves the accuracy of classification.

Keywords: hand gestures, gesture recognition, contours, Hu invariant moments, sign language recognition, MATLAB, k-nearest neighbor classifier, human-computer interface, text-to-speech conversion, machine learning.
Disclaimer

This report is submitted in part requirement for the Bachelor's degree in Computer Science at FAST-NU Peshawar. It is substantially the result of Afnan Ur Rehman, Anwaar Ul Haq and Haseeb Anser Iqbal's own work, except where explicitly indicated in the text. The report will be distributed to the FYP supervisor and FYP coordinator for examination, but thereafter may not be copied or distributed.
Table of Contents

1 Introduction
2 Background
   Literature
   Image sensing
3 Method
   Proposed Method
   Steps chart
   Flow chart
4 Image Acquisition
5 Preprocessing
   Flow chart of steps
   RGB to Grayscale
   Binarize
   Grayscale filtering using value
   Noise removal and smoothing
   Remove small objects other than hand
   Region filling
   Canny edge detection (additional step)
6 Hand Detection
7 Hand Cropping
8 Feature Extraction
9 Hand Gesture Training (Machine Learning)
   Machine Learning
   Training Dataset
   Feature Extraction
   Normalization
   Inter-class difference
10 Classification
11 Text to Speech
12 UML Diagrams
   Use Case Diagram
   Sequence Diagram
   Flow Diagram
13 Conclusion
   Future work
   Potential applications
14 Project Poster
15 References
16 Turnitin Originality Report
1 Introduction

Hands are the human organs used to manipulate physical objects, and for this very reason they are the organs human beings use most frequently to communicate and interact with machines. The mouse and keyboard are the basic input devices of a computer, and both require the use of hands. The most important and immediate information exchange between man and machine is through visual and aural channels, but this communication is largely one-sided: a computer of this age can present 1024 x 768 pixels at 15 frames per second, while a good typist writes about 60 words per minute, each word containing 6 letters on average. The mouse remedies this imbalance somewhat, but it too has limitations.

Although hands are most commonly used for day-to-day physical manipulation, in some cases they are also used for communication. Hand gestures support our daily communication and help convey our messages clearly. Hands are most important for mute and deaf people, who depend on their hands and gestures to communicate, so hand gestures are vital in sign language. If the computer had the ability to understand and translate hand gestures, it would be a leap forward in the field of human-computer interaction. The difficulty is that today's images are information-rich, and extensive processing is required to achieve this task. Every gesture has distinct features that differentiate it from other gestures; Hu invariant moments are used to extract these features, and the gestures are then classified using the KNN algorithm. Real-life applications of gesture-based human-computer interaction include interacting with virtual objects, controlling robots, translating body and sign language, and controlling machines with gestures.
2 Background

Literature

Several methods have been proposed for both dynamic and static hand gestures. [1] Pujan Ziaie proposed a technique of first computing the similarity of different gestures and then assigning probabilities to them using the Bayesian inference rule; invariant classes were estimated using a modification of KNN (k-nearest neighbor). These classes consist of Hu moments, which have geometrical attributes such as invariance to rotation, translation and scale, used as features for classification. The technique performed very well, giving 95% accurate results. [2] Pujan Ziaie also proposed a similar technique using Hu moments along with a modified KNN algorithm called the Locally Weighted Naive Bayes Classifier; classification results with this technique were 93% accurate under different lighting conditions with different users. [3] Rajat Shrivastava proposed a method using Hu moments and hand orientation for feature extraction, with the Baum-Welch algorithm for recognition; the method has an accuracy of 90%. [4] The technique proposed by Neha S. Chourasia, Kanchan Dhote and Supratim Saha used a hybrid feature descriptor combining Hu invariant moments and SURF, with KNN and SVM for classification, achieving 96% accuracy. [5] Joyeeta Singha proposed a hand gesture recognition system based on the K-L transform. The system consists of five steps: skin filtering (image acquisition, RGB-to-HSV conversion, filtering, smoothing, binarization, finding the biggest BLOB), palm cropping, hand edge detection using the Canny edge detector, feature extraction using the K-L transform, and classification. [6] Hunter proposed a system that uses Zernike moments to extract image features and a Hidden Markov Model for recognition. [7] Raheja proposed a technique that scans the image in all directions to find the edges of the fingertips. [8] Segan proposed a technique that uses edges for feature extraction, which reduces time complexity and also helps remove noise.

Image sensing

An image is a two-dimensional function f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. Image creation depends on two main factors: the illumination source, and the reflection or absorption of energy by the object being imaged. The illumination source can be electromagnetic energy such as infrared or X-rays, or sources such as ultrasound, sunlight or a computer-generated illumination pattern. In some cases the transmitted or reflected energy is focused onto a photo converter, which converts the energy into visible light. An arrangement of sensors is used to convert energy into digital images: the incoming energy is converted into a voltage using input electrical power and a sensor material responsive to the particular type of energy being detected. Each sensor produces an output waveform, which is digitized; the result is only an approximation of the real scene.

Computer cameras usually include a lens and an image sensor, and may also include a microphone to capture sound. The image sensor is one of two available types: CCD (charge-coupled device) or CMOS (complementary metal-oxide semiconductor). Most consumer web cameras provide VGA resolution at a rate of 30 frames per second; the newest devices are capable of multi-megapixel resolutions. In this project an ordinary web camera is used to capture the scene.
3 Method

Proposed Method

In order to extract features and recognize a gesture, the following method is proposed:

1. A GUI which allows the user to capture the scene. This phase is called image acquisition.
2. After capturing the image, the next step is to detect the hand and separate it from the scene, because only the hand gesture is needed for accurate classification. If the hand is not separated from the scene, it will affect the accuracy of the system when extracting and matching features.
3. Crop the hand out of the scene.
4. Preprocessing steps, which are:
   a. Convert RGB to grayscale.
   b. Grayscale filtering using a threshold value.
   c. Noise removal and smoothing.
   d. Remove small objects other than the hand.
5. Feature extraction using Hu invariant moments.
6. Classification using the KNN algorithm, with the Euclidean distance formula for calculating distances and a threshold for better results.
7. Translation (conversion) into speech.

The proposed method is given in figure 3.1, and an end-to-end sketch follows the flow chart below.
Steps chart:

Figure 3.1 Proposed steps: image acquisition, hand detection, hand cropping, preprocessing, feature extraction, classification, gesture to speech.
Flow chart:

Figure 3.2 Proposed flow chart. Detection: capture the scene (image), preprocessing, hand detection, contour detection, feature extraction for the gesture. Learning: training set (hand gestures), feature extraction. Recognition: feature matching, gesture recognition, conversion to speech.
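Under the assumption that the system is implemented in MATLAB (as the keywords indicate), the overall pipeline can be summarized by the following minimal sketch. All helper names here (capture_scene, detect_and_crop_hand, preprocess_hand, hu_moments, knn_classify, speak_gesture) are hypothetical placeholders for the steps detailed in the following chapters, not the project's actual source code:

    % Hypothetical end-to-end sketch of the proposed pipeline.
    img   = capture_scene();            % image acquisition (chapter 4)
    hand  = detect_and_crop_hand(img);  % hand detection and cropping (chapters 6-7)
    bw    = preprocess_hand(hand);      % preprocessing (chapter 5)
    feats = hu_moments(bw);             % feature extraction (chapter 8)
    label = knn_classify(feats, trainF, trainLabels, 3, thresh);  % k = 3 is
                                        % an arbitrary choice here (chapter 10)
    speak_gesture(label);               % text to speech (chapter 11)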
4 Image Acquisition

In this step a GUI is built which shows the video stream of the scene. When the capture button on the GUI is clicked, it takes an image of the scene. The problem is that this scene includes the whole body and other unwanted objects as well; these are dealt with in the following steps. The figure below shows the GUI-based front end of the system through which the user captures the image:

Figure 4.1 System GUI
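As an illustration only, a scene could be grabbed from an ordinary webcam in MATLAB roughly as follows; this assumes the Image Acquisition Toolbox and a standard 'winvideo' adaptor, and omits the actual GUI code. The file name is a hypothetical placeholder:

    vid = videoinput('winvideo', 1);   % open the first attached webcam
    preview(vid);                      % live video stream, as in the GUI
    img = getsnapshot(vid);            % grab one frame when "capture" is clicked
    imwrite(img, 'scene.png');         % keep the captured scene for later steps
    delete(vid);                       % release the device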
5 Preprocessing

Flow chart of steps:

Figure 5.1 Steps of preprocessing: RGB to grayscale, grayscale filtering using a value, binarization, noise removal and smoothing, removal of small objects other than the hand, region filling.

RGB to Grayscale:

RGB stands for red, green and blue. It is a color system in which these three colors are added in different quantities to produce other colors. Human vision can distinguish between many different colors, intensities and shades, but when it comes to shades of gray it can only distinguish approximately 100 of them. It is evident from this fact that colored images contain more information; converting to grayscale discards color information that the later steps do not need.

Figure 5.2 An RGB image
Figure 5.3 A grayscale image
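A minimal conversion sketch, assuming MATLAB's Image Processing Toolbox; the explicit weights shown are the standard luminance coefficients that rgb2gray uses:

    rgb  = imread('scene.png');   % captured scene (hypothetical file name)
    gray = rgb2gray(rgb);         % toolbox one-liner
    % Equivalent explicit weighted sum of the R, G and B channels:
    r = double(rgb(:,:,1)); g = double(rgb(:,:,2)); b = double(rgb(:,:,3));
    gray2 = uint8(0.2989*r + 0.5870*g + 0.1140*b);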
Binarize

Binarization is a process which converts a gray-level image to a binary image. A gray-level image has 256 levels (0 to 255), whereas a binary image has only two values, 0 and 1 (black and white).

Grayscale filtering using value

There are many different types of filters in the field of digital image processing, and the gray-level filter is one of them. This filter works on the gray-level image; the aim is to reduce noise in order to increase the accuracy of the system and get better results out of it. A threshold is used to filter out noise in the grayscale image; the threshold used in this project was 75, which gave better results.

Figure 5.4 Grayscale image
Figure 5.5 Image after grayscale filtering
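A sketch of these two steps, using the threshold of 75 reported above (a minimal illustration, not the project's exact code):

    gray(gray < 75) = 0;        % gray-level filtering: suppress dim noise pixels
    bw = im2bw(gray, 75/255);   % binarize: pixels above the level become 1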
Noise removal and smoothing

What is noise? Noise is a variation in an image: unwanted and undesired changes in its color or brightness. Noise in the image needs to be removed because it affects the results: if features extracted from a noisy image are used for classification, they will be misleading and will result in bad classification. To avoid this, the image is preprocessed by removing the noise, which increases the accuracy of the system.

In the field of digital image processing, smoothing is used as a preprocessing step. It applies different types of filters to the image to produce an approximation: the important portion or pattern in the image is kept while the noise is reduced significantly, improving the results considerably. In the figure below there is a small dot, which is unwanted noise that needs to be removed; if left in, this dot would participate in the feature extraction process and could cause the image to be classified into the wrong class.

Figure 5.6 Image with noise
Figure 5.7 Filtered image with the noise removed
To remove noise from the image, a 3x3 median filter is used. It creates a small 3x3 window that moves over the image pixel by pixel, computes the median of all covered pixels, and replaces the current (center) pixel with the median of its neighborhood. It also makes the edges clearer. The result of this filter is evident in the example figure above.

Remove small objects other than hand

In figure 5.7 it can be seen that the biggest object in the image is the hand. The object of interest is the hand, not other small objects or noise acting as small objects. The biggest object, in this case the hand, is called the biggest BLOB. In this step a threshold of 50 was used: all connected components with fewer than 50 pixels were removed, using 8-connected neighbors. As a result only the biggest object, the hand, remains.

Figure 5.8 Image before applying BLOB selection
Figure 5.9 Image after removing small objects other than the hand
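Continuing the sketch on the binary image bw (assuming the Image Processing Toolbox):

    bw = medfilt2(bw, [3 3]);     % 3x3 median filter; on a binary image this
                                  % acts as a majority vote, removing specks
    bw = bwareaopen(bw, 50, 8);   % drop 8-connected components smaller than
                                  % 50 pixels, leaving the hand (biggest blob)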
Region filling

To improve accuracy, region filling is applied. This completes the hand portion where, due to bad lighting conditions, an erroneous or degraded image of the gesture was captured; it fills the holes left in the gesture and improved the accuracy of the project considerably. The filling iteration is

    X_k = (X_{k-1} ⊕ B) ∩ A^c,   k = 1, 2, 3, ...

where X_0 is a starting point inside the hole, B is the structuring element, ⊕ denotes dilation, and A^c is the complement of the image A. The algorithm moves through all the pixels inside the hole, applying the dilation above step by step; at the final X_k the result is the whole interior of the shape, and its union with the original image is then taken.

Canny edge detection (additional step)

One additional step that can be performed is to extract the contours (edges) of the hand. Edge detection is a technique which extracts the boundaries of an object in an image, in this case the hand, by finding discontinuities in brightness. There are many edge detection algorithms, such as Sobel, Prewitt, fuzzy-logic methods, Canny, and even erosion followed by subtraction from the original image. Canny is an algorithm designed to detect edges in the best possible way. What sets Canny apart from the others? It uses a double threshold, one for sharp edges and one for weak edges, which means it detects edges better. Its major advantage over other algorithms is that it takes the first derivative in the horizontal, vertical and even diagonal directions, while the others work in only one direction, either horizontal or vertical.

Canny takes an image as input and outputs an image with the object edges found from the brightness discontinuities. It first applies Gaussian convolution to smooth the image, then applies the derivatives, which produces ridges (mountain-top or hill-top shaped responses); finally it uses a threshold to set everything else to 0, making everything except the edges black. In figures 5.10 and 5.11 the effects of Canny and of another algorithm can be seen, and it is understandable why Canny is better.

Figure 5.10 Image after applying Canny edge detection
Figure 5.11 Image after applying Sobel edge detection

It is evident from figures 5.10 and 5.11 that Canny is the better technique for edge detection.
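Both of these steps map onto standard toolbox calls; a minimal sketch:

    bw = imfill(bw, 'holes');    % hole filling via the iterative dilation
                                 % X_k = (X_{k-1} ⊕ B) ∩ A^c described above
    edges = edge(bw, 'canny');   % optional contour extraction with Canny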
6 Hand Detection

First the colored image captured in the image acquisition step is read. Once the image is obtained, its dimensions are calculated; the number of color bands should be one, so if the image is not grayscale it is converted to grayscale by taking only the green channel. Then the biggest blobs are found. This technique yields the two biggest blobs; the first (largest) blob is ignored, and the second biggest blob is taken as the hand. A box is drawn around the blobs, and the second biggest blob is separated from the image. The limitation of this technique is that the color of clothes and other objects in the scene may affect it. It is demonstrated in the following figure.

Figure 6.1 Hand detection
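A sketch of the detection-plus-cropping logic described above (the second-largest-blob heuristic), assuming the Image Processing Toolbox and a binary image bw from the previous steps:

    [L, n]  = bwlabel(bw, 8);                        % label connected components
    stats   = regionprops(L, 'Area', 'BoundingBox');
    [~, ix] = sort([stats.Area], 'descend');         % rank blobs by area
    handIdx = ix(min(2, n));                         % second-biggest blob, if any
    handBW  = (L == handIdx);                        % keep only the hand blob
    hand    = imcrop(handBW, stats(handIdx).BoundingBox);  % crop (chapter 7)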
7 Hand cropping

Once the portion of the hand is separated from the image, the hand is cropped out; for this a certain threshold is used. In binarizing the image, a threshold value is used which yields only the portion of the image containing the hand, and the hand can then be cropped out, as in the sketch above. This image of the hand is stored and passed to the next phase.
8 Feature extraction

What are features? To understand this, consider a scenario. An image is acquired and the user wants to classify it. Without features, the user would have to store a large number of images, which takes a lot of space, and compare images pixel by pixel, which is computationally expensive; both the time and space complexity would be large. This is not a realistic approach, and both factors need to be reduced. Moreover, if the object, in this case the hand, is rotated, translated or repositioned, the result will be a bad classification unless that variation is already present in the stored images. Feature extraction avoids this dilemma.

So what are features? The term comes from the field of computer vision. A feature is a small piece of information, the prominent and important detail of an image; such details can be edges (contours) or objects. Various algorithms are used for feature extraction, such as Zernike moments and Fourier descriptors. In general, descriptors are a set of numbers produced to describe a given shape. A few simple descriptors are:

 - Area: the number of pixels in the shape.
 - Perimeter: the number of pixels in the boundary of the shape.
 - Elongation: rotate a rectangle so that it is the smallest rectangle in which the shape fits, then compare its height to its width.
 - Rectangularity: how rectangular a shape is, i.e. how much of its minimal bounding box the shape's area fills.
 - Orientation: the overall direction of the shape.

Moments are common in statistics and physics; the statistical moments are 1) mean, 2) variance, 3) skew and 4) kurtosis. A moment of an image is a weighted average of the image's pixel intensities, usually chosen to have some attractive property. Moments are useful for describing shapes in a (binary) image after segmentation: using image moments one can find simple properties of an image such as area (intensity), centroid and orientation of the object inside the image.

Raw Moments

For an image with pixel intensities I(x, y), the raw moments are

    M_ij = Σ_x Σ_y x^i y^j I(x, y).

Raw moments of a simple image include:

1) Sum of gray levels, or area in the case of a binary image: M_00.
2) Centroid: (M_10 / M_00, M_01 / M_00).
Central Moments

The central moments of an input digital image f(x, y) are

    μ_pq = Σ_x Σ_y (x − x̄)^p (y − ȳ)^q f(x, y),

where x̄ = M_10 / M_00 and ȳ = M_01 / M_00 are the centroid coordinates.

Scale Invariant Moments

Moments η_ij with i + j >= 2 can be constructed to be invariant to both translation and changes in scale by dividing the corresponding central moment by a power of the 00th moment:

    η_ij = μ_ij / μ_00^(1 + (i + j)/2).

Rotation Invariant Moments (Hu set of invariant moments)

The Hu set of invariant moments is the most frequently used; these moments are invariant under translation, rotation and scale. They are seven combinations of the normalized central moments, the first two being

    I1 = η_20 + η_02,
    I2 = (η_20 − η_02)^2 + 4 η_11^2,

and so on up to I7. These seven values, I1 to I7, form the feature set stored as the descriptor of each image. The usefulness of these moments in this application is that they make the image features invariant to scale, translation and rotation. Hu moments are used in this project; they are also called invariant statistical moments because they are not affected by rotation, scaling and translation.
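The seven invariants can be computed directly from the definitions above; a minimal sketch (the standard Hu formulas, not the project's exact implementation):

    function I = hu_moments(bw)
    % Seven Hu invariant moments of a binary image, built from the
    % normalized central moments eta(p, q) defined in this chapter.
    bw = double(bw);
    [h, w] = size(bw);
    [X, Y] = meshgrid(1:w, 1:h);
    m00  = sum(bw(:));                      % area (sum of gray levels)
    xbar = sum(sum(X .* bw)) / m00;         % centroid x
    ybar = sum(sum(Y .* bw)) / m00;         % centroid y
    eta = @(p, q) sum(sum((X - xbar).^p .* (Y - ybar).^q .* bw)) ...
                  / m00^(1 + (p + q)/2);    % scale-invariant central moment
    n20 = eta(2,0); n02 = eta(0,2); n11 = eta(1,1);
    n30 = eta(3,0); n03 = eta(0,3); n21 = eta(2,1); n12 = eta(1,2);
    I = zeros(1, 7);
    I(1) = n20 + n02;
    I(2) = (n20 - n02)^2 + 4*n11^2;
    I(3) = (n30 - 3*n12)^2 + (3*n21 - n03)^2;
    I(4) = (n30 + n12)^2 + (n21 + n03)^2;
    I(5) = (n30 - 3*n12)*(n30 + n12)*((n30 + n12)^2 - 3*(n21 + n03)^2) ...
         + (3*n21 - n03)*(n21 + n03)*(3*(n30 + n12)^2 - (n21 + n03)^2);
    I(6) = (n20 - n02)*((n30 + n12)^2 - (n21 + n03)^2) ...
         + 4*n11*(n30 + n12)*(n21 + n03);
    I(7) = (3*n21 - n03)*(n30 + n12)*((n30 + n12)^2 - 3*(n21 + n03)^2) ...
         - (n30 - 3*n12)*(n21 + n03)*(3*(n30 + n12)^2 - (n21 + n03)^2);
    end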
9 Hand Gesture Training (Machine learning)

Machine Learning

Machine learning here involves two basic steps:

 - Collecting the training set.
 - Feature extraction.

Figure 9.1 Machine learning: training set images are passed through the invariant Hu moment extractor to produce the feature set.

Training Dataset

A dataset with variations is captured for the training step. The training dataset consists of 5 gestures, with 50 variations for each gesture, so that the system is trained for higher accuracy across variations of the same gesture. This helps to recognize a gesture under different conditions. A few samples from the proposed dataset are:

Gesture 1: first variation, second variation, third variation
Figure 9.2 Punch gesture

Gesture 2: first variation, second variation, third variation
Figure 9.3 Left gesture

Gesture 3: first variation, second variation, third variation
Figure 9.4 Well done gesture

Gesture 4: first variation, second variation, third variation
Figure 9.5 Drop gesture

Gesture 5: first variation, second variation, third variation
Figure 9.6 Catch gesture

The following 5 gestures are included in the system.

Figure 9.7 Gestures included in the system
Feature Extraction:

Feature extraction in the training step is the same as explained in chapter 8. In the training/learning step the features of each image are extracted using the Hu set of invariant moments, and the result for each training image is stored in a file so that it does not need to be recomputed during classification. The file contains a matrix holding the descriptor values of each image from the training dataset together with its class label. This saves time and makes classification robust, because training is the most time-consuming operation.

Normalization:

In the stored feature matrix, each row represents one image and each column represents a specific feature (attribute); one attribute does not depend on another, so the values of each column are normalized independently. The maximum of each column is stored in a file for later use in the classification step. The value in each row of a particular attribute is divided by the maximum value of that attribute over the whole matrix, and this is repeated for all records. This normalizes the values into the range 0 to 1, which vastly improves the classification results: it decreases bias, giving each attribute the same weight in classification. A sketch is given at the end of this chapter.

Inter-class difference:

The average of each class is calculated from the matrix of descriptors. One class is chosen and the distance between it and each of the other classes is calculated; the same step is repeated for all classes and the results are stored. From these results three values are found: maximum, minimum and median. These values can be used as a threshold, depending on the desired level of strictness for classification. This is an adaptive threshold whose purpose is to prevent under-fitting; the stricter the threshold, the less under-fitting is allowed.
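A minimal sketch of the normalization described above, where F is the stored training matrix (rows = images, columns = Hu features) and the file name is a hypothetical placeholder:

    colMax = max(abs(F), [], 1);          % per-column maxima
    Fn = bsxfun(@rdivide, F, colMax);     % scale each column into [0, 1]
    save('feature_max.mat', 'colMax');    % reuse the same scaling on test images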
10 Classification

Classification involves two basic steps:

 - Machine learning
 - Recognition

Figure 10.1 Classification steps. Machine learning: training set images, features from invariant Hu moments, feature set. Recognition: test image, features, classification, classified result.
Recognition:

Recognition involves the following steps (a sketch follows the list):

 - First the features of the test image are calculated using Hu moments.
 - These features are compared with the training feature set.
 - The algorithm used for classification is KNN (k-nearest neighbor).
 - This algorithm uses the distances to neighbors to classify the current record into one of the predefined classes.
 - The Euclidean distance is used for the comparison:

       EuclideanDistance((X, Y), (A, B)) = [(X − A)^2 + (Y − B)^2]^(1/2)

 - The gesture is classified into the class with which it has the minimum distance.
 - A value of K is selected, which is the number of neighbors taken into account for every decision.
 - The value of K must be chosen carefully: if K is too small the method is sensitive to noise, and if K is too large the neighbors may include points from other classes. So a moderate value of K is selected.
 - One limitation of this method is that it will always assign the input gesture to the training class with minimum distance, even when the gesture is new, which results in incorrect classification. So a threshold is applied.
 - After calculating the distance, the value is compared with the threshold.
 - If it passes the threshold the gesture is classified; otherwise it is identified as a new gesture.
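A minimal sketch of this procedure, assuming numeric class labels and feature vectors already normalized as in chapter 9 (function and variable names are illustrative):

    function label = knn_classify(f, trainF, trainLabels, k, thresh)
    % k-nearest-neighbor vote with Euclidean distance and a rejection
    % threshold for gestures that match no training class.
    d = sqrt(sum(bsxfun(@minus, trainF, f).^2, 2));  % distance to each row
    [ds, order] = sort(d);                           % nearest first
    if ds(1) > thresh
        label = -1;                                  % new, untrained gesture
    else
        label = mode(trainLabels(order(1:k)));       % majority vote of k nearest
    end
    end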
Test results:

Figure 10.2 Punch gesture test
Figure 10.3 Drop gesture test
Figure 10.4 Catch gesture test
Figure 10.5 Left gesture test
Figure 10.6 Well done gesture test
11 Text to speech

Once the gesture has been recognized, the class of the gesture determined at run time is obtained. In the speech function, first the available voice types are enumerated and the first available voice is picked by default. The caller passes the text and the voice type as parameters. The function then sets the speed of the speech; the speed (pace) of the voice can range from -10 to 10, with a default of 0. It then sets the sampling rate of the speech. The function is based on the Speech API (SAPI) of 32-bit MS Windows.
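A minimal Windows-only sketch of this step from MATLAB, using the SAPI COM automation object (the spoken string is illustrative):

    sp = actxserver('SAPI.SpVoice');   % Microsoft Speech API voice object
    sp.Rate = 0;                       % speaking pace, valid range -10..10
    sp.Speak('punch');                 % speak the recognized gesture class
    delete(sp);                        % release the COM object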
12 UML Diagrams

Use Case Diagram

Figure 12.1 Use case diagram

Sequence Diagram

Figure 12.2 Sequence diagram

Flow Diagram

Figure 12.3 Flow diagram
13 Conclusion

Future work

Some aspects of the project can be improved in the future:

 - Instead of a webcam, a better and more accurate acquisition device can be used, one that even uses infrared for accuracy, e.g. Kinect.
 - The mechanism for hand detection is not accurate and should be improved.
 - The Hu set of invariant moments is a very basic image feature descriptor and limits accuracy. A better descriptor could give better results, though the classification mechanism may need to change with it.

Potential applications

Image recognition concepts have vital applications in various fields, such as:

 - Robotics.
 - Artificial intelligence.
 - Controlling the computer through hand gestures.
14 Project poster

The poster for this project was created using Adobe InDesign, desktop publishing software by Adobe that is well suited to poster design. The poster is of standard size and uses vector graphics, so no matter how far it is zoomed, it will not pixelate.

Figure 14.1 Project poster in Adobe InDesign
Figure 14.2 Project poster
15 References

[1] Pujan Ziaie, Thomas Müller and Alois Knoll. A Novel Approach to Hand-Gesture Recognition in a Human-Robot Dialog System. Robotics and Embedded Systems Group, Department of Informatics, Technische Universität München.
[2] Pujan Ziaie and Alois Knoll. An Invariant-Based Approach to Static Hand-Gesture Recognition. Technical University of Munich.
[3] Rajat Shrivastava. A Hidden Markov Model based Dynamic Hand Gesture Recognition System using OpenCV. Dept. of Electronics and Communication Engineering, Maulana Azad National Institute of Technology, Bhopal-462001, India.
[4] Neha S. Chourasia, Kanchan Dhote and Supratim Saha. Analysis on Hand Gesture Spotting using Sign Language through Computer Interfacing. International Journal of Engineering Science and Innovative Technology (IJESIT), Volume 3, Issue 3, May 2014.
[5] Joyeeta Singha and Karen Das. Hand Gesture Recognition Based on Karhunen-Loeve Transform. Department of Electronics and Communication Engineering, Assam Don Bosco University, Guwahati, Assam, India.
[6] Hunter, E. Posture estimation in reduced model gesture input systems. Proceedings of the International Workshop on Automated Face and Gesture Recognition, June 1995.
[7] Chaudhary, A., Raheja, J. L., Das, K. and Raheja, S. A Vision based Geometrical Method to find Fingers Positions in Real Time Hand Gesture Recognition. Journal of Software, Academy Publisher, Vol. 7, 2012.
[8] Segan, J. Controlling computers with gloveless gestures. In Virtual Reality Systems, 1993.
[9] Gastaldi, G. et al. A man-machine communication system based on the visual analysis of dynamic gestures. International Conference on Image Processing, Genoa, Italy, September 2005, pp. 397-400.
16 Turnitin Originality Report

HAND GESTURE RECOGNITION SYSTEM by Afnan Ur Rehman, Haseeb Ansar Iqbal, Anwaar ul Haq
From HAND GESTURE RECOGNITION SYSTEM (Research)

Processed on: 30-Jun-2015 08:29 PKT
ID: 553340821
Word Count: 3906

Similarity Index: 10%
Similarity by Source: Internet Sources 8%, Publications 6%, Student Papers 6%

Sources:

1. 2% match (Internet from 11-Dec-2007) http://www.forbes.com/lists/2007/10/07billionaires_The-Worlds-Billionaires_NameHTML_36.html
2. 1% match (Internet from 12-Jul-2013) http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/MORSE/region-props-and-moments.pdf
3. 1% match (Internet from 12-Oct-2014) http://www.ijsret.org/pdf/120374.pdf
4. 1% match (publications) A. Musso. "Structural dynamic monitoring on Vega platform: an example of Industry and University collaboration", Proceedings of European Petroleum Conference EUROPEC, 10/1996
5. 1% match (student papers from 16-Dec-2014) Submitted to iGroup on 2014-12-16
6. 1% match (student papers from 03-Aug-2010) Submitted to Universiti Teknikal Malaysia Melaka on 2010-08-03
7. <1% match (Internet from 01-Jul-2003) http://www.discovery.mala.bc.ca/web/bandalia/digital/work.htm
8. <1% match (publications) Yeo, Hangu, Vadim Sheinin, Yuri Sheinin, and Benoit M. Dawant. Medical Imaging 2009: Image Processing, 2009
9. <1% match (student papers from 16-Dec-2013) Submitted to Universiti Malaysia Perlis on 2013-12-16
10. <1% match (Internet from 05-Jun-2012) http://www.csjournals.com/IJCSC/PDF1-1/16.pdf
11. <1% match (student papers from 27-Oct-2012) Submitted to VIT University on 2012-10-27
12. <1% match (Internet from 30-Apr-2003) http://www.goodstaff.com/jobseekers/articles/sat/Sat14.html
13. <1% match (Internet from 29-Jul-2010) http://ethesis.nitrkl.ac.in/1459/1/Removal_of_RVIN.pdf
14. <1% match (Internet from 08-Oct-2013) http://www.lifesciencesite.com/lsj/life1009s/041_20339life1009s_289_296.pdf
15. <1% match (publications) Henke, Daniel, Padhraic Smyth, Colene Haffke, and Gudrun Magnusdottir. "Automated analysis of the temporal behavior of the double Intertropical Convergence Zone over the east Pacific", Remote Sensing of Environment, 2012
16. <1% match (Internet from 07-Mar-2015) http://en.wikipedia.org/wiki/Image_moment
17. <1% match (Internet from 05-Dec-2013) http://eventos.spc.org.pe/inns-iesnn/papers/Jimenez-Oliden-Huapaya-Cardenas-Neurocopter.pdf
18. <1% match (Internet from 25-Dec-2014) http://ijcsn.org/IJCSN-2014/3-4/A-Fast-and-Robust-Hybridized-Filter-for-Image-De-Noising.pdf
19. <1% match (Internet from 26-Nov-2002) http://aips2.nrao.edu/released/docs/user/Utility/node248.html
20. <1% match (publications) Sungsik Huh. "A Vision-Based Automatic Landing Method for Fixed-Wing UAVs", Selected papers from the 2nd International Symposium on UAVs, Reno, Nevada, USA, June 8-10, 2009
21. <1% match (publications) Kong, Fan zhi, Xing zhou Zhang, Yi zhong Wang, Da wei Zhang, Jun lan Li, Shanhong Xia, Chih-Ming Ho, and Helmut Seidel. 2008 International Conference on Optical Instruments and Technology: MEMS/NEMS Technology and Applications, 2008