Gesture Recognition
Project report submitted in partial fulfilment of the requirement for the degree
of Bachelor of Technology in
Computer Science and Engineering/Information Technology
By
VAIBHAV SAWHNEY (151327)
Under the supervision of Dr. Hemraj Saini
I hereby declare that the work presented in this report entitled “Gesture Recognition” in
partial fulfilment of the requirements for the award of the degree of Bachelor of Technology
in Computer Science and Engineering/ Information Technology submitted in the department
of Computer Science & Engineering and Information Technology, Jaypee University of
Information Technology, Waknaghat, is an authentic record of my own work carried out
over a period from August 2017 to May 2019 under the supervision of Dr. Hemraj Saini
(Associate Professor, Computer Science and Engineering). The matter embodied in the
report has not been submitted for the award of any other degree or diploma.
This is to certify that the above statement made by the candidate is true to the best of my
knowledge.
Associate Professor
ACKNOWLEDGEMENT
We would like to express our special thanks and gratitude to our project guide, Dr. Hemraj Saini, who helped us in conceptualizing the project and in building the procedures used to develop it. We would also like to thank our Head of Department for giving us the opportunity to work on a project like this, which led us to do a great deal of research and taught us many new things.
Secondly, we would like to thank our family and friends, who guided us throughout the project.
Thanking you,
LIST OF ABBREVIATIONS
TABLE OF FIGURES
Figure 1.1: Gestures Classification.
Figure 1.2: Body Parts Used for Gesturing.
Figure 1.3: Steps for Hand Gesture Recognition.
Figure 1.4: Flowchart for Gesture Recognition.
Figure 2.1: CMOS Camera.
Figure 2.2: Flex Sensors.
Figure 2.3: Leaf Switches.
Figure 3.1: Stages for Proposed Algorithm.
Figure 3.2: Data Set.
Figure 3.3: Selected Hand Gestures for Recognition.
Figure 3.4: Detailed Flowchart for Proposed Algorithm.
Figure 3.5: Example Picture.
Figure 3.6: Grayscale Graph Results.
Figure 3.7: Different Grayscale Results.
Figure 3.8: Color Graph Results.
Figure 3.9: Different Results for Color Histogram.
Figure 4.1: Back End Architecture.
Figure 4.2: Motion-based detection of different gestures in different light.
Figure 4.3: Motion-based detection of a gesture.
Figure 4.4: Skin segmentation hand detection of a gesture in the same light.
Figure 4.5: Skin segmentation hand detection of a gesture in various light.
Figure 4.6: Proposed Method for Our Gesture Recognition.
S. No.  Title
1. Introduction
1.1 Introduction
1.3 Objectives
1.4 Methodology
1.5 Organization
2. Literature Survey
2.1 Context
3. System Description
3.3 OpenCV
6. Conclusion
6.1 Reflection
7. References
ABSTRACT
Gesture recognition is one of the most favoured and practical ways to improve human interaction with computers. It has gained wide acceptance in recent years thanks to its use in gaming devices such as the Xbox and PS4, as well as in other devices such as laptops and smartphones. The recognition of gestures, and of hand gestures in particular, is used in various applications such as accessibility support, crisis management and medicine. This report describes our fourth-year project, "Gesture Recognition", and the various approaches and techniques used for hand gesture recognition. It also describes the methods used for development, presents the output obtained, and describes the tests carried out on the resulting software artefact. Since hand gesture recognition is connected to two major fields, machine learning and image processing, the report further describes different APIs and tools that can be used to implement various approaches and techniques in these areas.
CHAPTER I
INTRODUCTION
1.1.1 GESTURE
A gesture is a movement of a body part, such as a hand, the head or the neck, that expresses a symbolic representation of data. It is a way of interacting non-verbally, using body parts to communicate a desired message. Gestures include head movements, hand movements and movements of other body parts. Hand gestures in particular can be used for non-verbal communication with a computer interface.
Gesture recognition is the process of identifying the motions made by the user and fed to a machine, which can be a PC, a tablet or any other device. Gesture recognition can be studied in two forms: static gesture recognition and dynamic gesture recognition. Static gesture recognition identifies predefined gestures stored in a database, whereas dynamic gesture recognition is closer to practical situations. That greater practicality comes with greater difficulty.
1.1.3 THE GESTURES CAN BE CLASSIFIED INTO TWO SUB-CLASSES
o Static Gestures
o Dynamic Gestures
With the growth and development of computing, user interaction through the keyboard, mouse and other input devices is no longer sufficient. These devices have certain limitations, and hence the commands that can be directed to the machine are also limited. Moreover, it remains difficult for blind and deaf people to communicate with others. Gesture recognition helps to address these problems by relying on a predefined data set.
The gesture fed to the device is used as the input that invokes the command stored in the database, and the corresponding output is displayed on the screen.
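The lookup described above can be pictured as a simple table from gesture labels to commands. The following is a minimal sketch only: the gesture labels, the command names and the `command_for` helper are hypothetical, since the report does not list its actual mapping.

```python
# Hypothetical gesture labels and commands; the report does not list its own mapping.
GESTURE_COMMANDS = {
    "open_palm": "pause",
    "fist": "play",
    "victory": "next_track",
}


def command_for(gesture_label, default="no_action"):
    """Return the command stored for a recognised gesture label.

    Unknown labels fall back to `default` instead of raising an error,
    so an unrecognised gesture simply triggers nothing.
    """
    return GESTURE_COMMANDS.get(gesture_label, default)


if __name__ == "__main__":
    for label in ("fist", "thumbs_up"):
        print(label, "->", command_for(label))
```

Keeping the mapping in a plain dictionary means new gestures can be added without touching the recognition code.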
The primary step is to detect and track the hands: the required image or video is acquired and pre-processed so that the hand can be recognised. Different procedures are then applied to extract the attributes that determine the gesture made by the hand. Image processing, information retrieval and image reconstruction can all be used to obtain the hand gesture.
1.3 GESTURE RECOGNITION OBJECTIVES
The various objectives are shown in Figure 1.3.
• Tracking the hand signal
• Extracting the desired features
• Identification of gesture
• Applying the methodology
Figure 1.3: Steps for Hand Gesture Recognition (Hand Detection & Tracking, Feature Extraction, Recognition).
1.3.1 TRACKING THE HAND SIGNAL
This phase finds and tracks the hand by analysing the video frames, so that hands of various skin colours can be detected in different environments and lighting conditions.
The second phase separates the critical features from the unwanted ones and identifies the features needed to describe the gesture with its required properties.
The third phase, identification of the gesture, has two steps. The first step is to compare the filtered signal with the predefined dataset. The second step is to obtain the recognised gesture with the highest possible precision, for better comparisons and error avoidance.
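One simple way to compare an extracted feature vector with a predefined dataset is nearest-template matching with a rejection threshold. The sketch below assumes this technique for illustration; it is not stated to be the report's classifier. The five-element feature vectors (one value per finger, thumb to little finger, 1.0 meaning fully extended), the template values and the `max_distance` threshold are all hypothetical.

```python
import math

# Hypothetical templates: one feature vector per gesture class,
# ordered thumb, index, middle, ring, little (1.0 = finger extended).
TEMPLATES = {
    "fist": [0.0, 0.0, 0.0, 0.0, 0.0],
    "open_palm": [1.0, 1.0, 1.0, 1.0, 1.0],
    "victory": [0.0, 1.0, 1.0, 0.0, 0.0],
}


def classify(features, templates=TEMPLATES, max_distance=0.8):
    """Return (label, distance) for the closest template.

    If even the best match is farther away than `max_distance`, the label
    is None, which avoids reporting a gesture when nothing matches well.
    """
    best_label, best_dist = None, math.inf
    for label, template in templates.items():
        dist = math.dist(features, template)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist > max_distance:
        return None, best_dist
    return best_label, best_dist


if __name__ == "__main__":
    print(classify([0.1, 0.9, 1.0, 0.0, 0.1]))
    print(classify([0.5, 0.5, 0.5, 0.5, 0.5]))
```

The rejection threshold is what serves the "error avoidance" goal: an ambiguous input is reported as unrecognised rather than forced onto the nearest class.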
Figure 1.4: Flowchart for gesture recognition (Camera Input, Feature Extraction, Classification & Recognition, Training Database, Gesture Recognized).
1.4 METHODOLOGY
There are various methods for detecting a static gesture, but the main challenge is dealing with dynamic gestures, which change over time. The choice of methodology depends on the user and on the scenarios that are prioritised when determining the gesture.
1.5 ORGANIZATION
The report consists of six sections. The first section introduces gesture recognition and gives an overview of it. The second section covers the technologies currently used for gesture recognition. The third section discusses the data sets and inputs. The fourth section discusses the implemented algorithms. The fifth section presents the results obtained and their interpretation, and the sixth reflects on the work and its future scope.
CHAPTER II
LITERATURE SURVEY
2.1 CONTEXT
The literature survey provides an insight into the different methods that can be adopted and implemented to achieve hand gesture recognition. It also helps in understanding the advantages and disadvantages of the various techniques. The commonly observed methods of capturing input are data gloves, hand belts and cameras.
Here we have to design a circuit, or an alternative method, to generate a digital pattern corresponding to the hand gesture, as shown in the methodology. Initially we would work with different methods of designing digital gloves, which are described below:
First, the image of the hand gesture is captured by a CMOS camera, as shown in Figure 2.1; the boundary of the hand gesture is then obtained using an edge detection technique.
Figure 2.1: CMOS Camera [26]
Drawbacks:
A sensor is a transducer that converts physical energy into electrical energy. "Flex" means "bend" or "curve", as in Figure 2.2. A flex sensor acts as a resistive sensor whose resistance changes with its bend or curvature; this change is converted into an analog voltage. The resistance changes from 45 kΩ to 75 kΩ as the curvature increases from 0 degrees to 90 degrees.
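As a worked illustration of these figures, the sketch below maps bend angle to resistance and then to the voltage a simple divider would produce. Only the 45k and 75k end points come from the text. The linear response between them, the 5 V supply, the 47 kΩ fixed resistor and the divider wiring are assumptions made for the example.

```python
R_FLAT_OHMS = 45_000.0  # resistance at 0 degrees (from the text)
R_BENT_OHMS = 75_000.0  # resistance at 90 degrees (from the text)


def resistance_at(angle_deg):
    """Flex sensor resistance, ASSUMING a linear response from 0 to 90 degrees."""
    angle = min(max(angle_deg, 0.0), 90.0)  # clamp to the stated range
    return R_FLAT_OHMS + (R_BENT_OHMS - R_FLAT_OHMS) * angle / 90.0


def angle_from_resistance(r_ohms):
    """Inverse of resistance_at under the same linear assumption."""
    return 90.0 * (r_ohms - R_FLAT_OHMS) / (R_BENT_OHMS - R_FLAT_OHMS)


def divider_voltage(angle_deg, v_supply=5.0, r_fixed=47_000.0):
    """Voltage across the fixed resistor of a divider wired as
    supply -> flex sensor -> fixed resistor -> ground.

    v_supply and r_fixed are assumed values, not taken from the report.
    """
    r_flex = resistance_at(angle_deg)
    return v_supply * r_fixed / (r_fixed + r_flex)


if __name__ == "__main__":
    for angle in (0, 45, 90):
        print(angle, resistance_at(angle), round(divider_voltage(angle), 3))
```

With these assumed component values the divider output changes by well under 1 V across the full bend.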
Figure 2.2: Flex Sensors [26]
Drawbacks:
• Pull-down resistors are needed to obtain strong logic levels; without them, strong logic levels are not obtained.
• Low range of analog output from the flex sensor.
• Less accurate analog output from flex sensors.
• Extra circuits.
• Expensive.
Leaf switches are like ordinary switches, but they are designed so that when pressure is applied to the switch, its two ends come into contact and the switch closes, as in Figure 2.3. These switches are placed on the fingers of the glove in such a way that the two terminals of a switch come into contact when the finger is bent.
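Because each leaf switch is either open or closed, the state of the glove can be packed into a small digital pattern with one bit per finger. The sketch below illustrates that idea; the finger order, the bit assignment and the function names are assumptions chosen for the example.

```python
# Assumed bit order: bit 0 is the thumb, bit 4 is the little finger.
FINGERS = ("thumb", "index", "middle", "ring", "little")


def encode_switches(closed):
    """Pack per-finger switch states into an integer bit pattern.

    `closed` maps finger name -> True when the switch is closed (finger bent).
    Fingers missing from the mapping are treated as open (straight).
    """
    pattern = 0
    for bit, finger in enumerate(FINGERS):
        if closed.get(finger, False):
            pattern |= 1 << bit
    return pattern


def decode_pattern(pattern):
    """Return the list of fingers whose switch is closed in `pattern`."""
    return [finger for bit, finger in enumerate(FINGERS) if pattern & (1 << bit)]


if __name__ == "__main__":
    bent = {"thumb": True, "ring": True, "little": True}
    code = encode_switches(bent)
    print(format(code, "05b"), decode_pattern(code))
```

A pattern of this kind can then be used directly as the key into a gesture table.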
Figure 2.3: Leaf Switches [26]
Drawback:
The disadvantage of leaf switches is that, after prolonged use, the switch stays closed instead of opening when the finger is straight, which results in incorrect transmission of the signal.
CHAPTER III
SYSTEM DESCRIPTION
The primary function of this system is to recognise hand gestures. A gesture image is captured using the camera and processed using contour-based methods; once the gesture has been identified, an output is shown on the display screen, as shown in Figure 3.1.
• Input Stage
• Processing Stage
• Output Stage
▪ The image captured must be as clear as possible to lower the occurrence of errors.
▪ The input image is transferred to the Raspberry Pi module for further processing.
3. The contour of the image is recorded.
The most important part of the entire report revolves around the predefined dataset, which contains the gestures to be recognised from the input signal. The data set has been taken from the Cambridge Hand Gesture Data Set, as shown in Figure 3.2.
Figure 3.2: Data Set.
In this project, we use four classes of hand gestures for recognition, which are shown in Figure 3.3. We use the first, the fourth, the sixth and the ninth gesture shown in Figure 3.2.
Figure 3.4: Detailed Flowchart for Proposed Algorithm.
3.3 OPENCV
OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available:
• imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), colour space conversion, histograms, etc.
• calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
3.4 IMAGE FILTERING USING HISTOGRAM
A histogram is a graph or plot that represents the distribution of the pixel intensities in an image. Here we concentrate on the RGB colour space, so the intensity of a pixel is in the range [0, 255].
A histogram can be computed both for a grayscale image and for a colour image. In the first case we have a single channel, and hence a single histogram.
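To make the definition concrete, the sketch below counts how often each intensity occurs in a tiny grayscale image. It is written in plain Python so that it runs without any libraries; the `gray_histogram` name and the sample image are illustrative. For real images, OpenCV's `cv2.calcHist` or NumPy's `numpy.histogram` would normally be used instead.

```python
def gray_histogram(pixels):
    """Count how many pixels take each intensity value in 0..255.

    `pixels` is a 2D list: a list of rows, each row a list of integers.
    The result is a list of 256 counts, indexed by intensity.
    """
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1
    return hist


# A tiny 2x4 illustrative image: mostly dark, with two saturated pixels.
image = [
    [0, 0, 10, 10],
    [10, 255, 255, 0],
]
hist = gray_histogram(image)

if __name__ == "__main__":
    print({v: n for v, n in enumerate(hist) if n})
```

The counts always sum to the number of pixels, which is a quick sanity check on any histogram routine.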
3.4.1 TYPES OF IMAGE FILTERING:
• Grayscale histogram
Computing the grayscale histogram of each example image gives the histograms shown in Figure 3.6.
Let us now examine these plots and see what information we can extract from them, as shown in Figure 3.7.
From the first plot, we can infer that all the pixels of the corresponding image have low intensity, as almost all of them lie roughly in the range [0, 60]. From the second, we can see that the distribution of pixel intensities is still skewed towards the darker side, as the median value is around 80, but the variance is much larger.
Finally, from the last plot we can infer that the corresponding image is much lighter overall, but also has some dark regions.
Figure 3.7: Different Grayscale Results.
• Color histogram
Let us now move on to the histograms of the colour example images in Figure 3.5. In this case too, a helper function can be written that uses matplotlib to display the histogram of an image.
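The report's own helper function is not reproduced in this copy. As a stand-in, the following sketch computes one histogram per colour channel in plain Python and reports where each channel peaks; the plotting step with matplotlib is left out so that the example has no dependencies. The function names and the sample image are hypothetical. Note that images loaded with OpenCV's `cv2.imread` are in BGR order, so the channels would need reordering before being passed in as (r, g, b).

```python
def color_histograms(pixels):
    """Return one 256-bin histogram per channel.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples with
    8-bit values. The result maps "r", "g" and "b" to lists of 256 counts.
    """
    hists = {"r": [0] * 256, "g": [0] * 256, "b": [0] * 256}
    for row in pixels:
        for r, g, b in row:
            hists["r"][r] += 1
            hists["g"][g] += 1
            hists["b"][b] += 1
    return hists


def peak_intensity(hist):
    """Intensity value with the highest count (the lowest value wins ties)."""
    return max(range(256), key=lambda value: hist[value])


# Illustrative 2x2 image: three black pixels and one reddish pixel.
sample = [
    [(0, 0, 0), (0, 0, 0)],
    [(0, 0, 0), (200, 30, 30)],
]
hists = color_histograms(sample)

if __name__ == "__main__":
    for channel in ("r", "g", "b"):
        print(channel, "peaks at", peak_intensity(hists[channel]))
```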
Computing the colour histograms of the example images gives the histograms shown in Figure 3.8.
Figure 3.8: Color Graph Results.
The plots are in the same order as the example images. As we could have expected from the first plot, all the channels have low intensities, corresponding to very dark red, green and blue. We also need to consider that the colour black, which is given by (0, 0, 0) in RGB, is abundant in the corresponding image; this may explain why all the channels have peaks in the lower part of the X axis, as shown in Figure 3.9.
Figure 3.9: Different Results for Color Histogram.
CHAPTER IV
PROPOSED SYSTEM DESIGN WITH DETAILED ALGORITHM
The system consists of two main parts: the front end and the back end. The back end consists of three modules, as shown in Figure 4.1:
• Camera.
• Detection.
• Interface.
4.1.1 Camera Module
The camera module captures the input received from the image sensors and transmits the data to the detection module for further pre-processing. Several methods of capturing input data are available on the market, such as data gloves and cameras. In our project we used a built-in webcam, which is cost-effective and can detect static gestures easily. USB cameras are also available, at a higher cost.
The input received from the camera module is processed through various stages, such as colour conversion, removal of unwanted noise, changing frequencies and extraction of the individual RGB frames. This may result in two scenarios: an image with defects or an image without defects. If the gesture is dynamic in nature, then five consecutive frames of movement come into play.
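Two of the stages named above, colour conversion and noise removal, can be sketched in a few lines. The example below converts RGB pixels to grayscale using the common BT.601 luminance weights and then smooths the result with a 3x3 mean filter. Both choices are assumptions made for illustration, as the report does not say which conversion or filter it uses. In an OpenCV program the same steps would typically be done with `cv2.cvtColor` and a blur such as `cv2.GaussianBlur`.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to integer luminance values.

    Uses the ITU-R BT.601 weights (0.299, 0.587, 0.114), an assumed choice.
    """
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]


def box_blur(gray):
    """Smooth a grayscale image with a 3x3 mean filter.

    Border pixels are averaged over the neighbours that actually exist,
    so the output has the same size as the input.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2))]
            out[y][x] = sum(window) // len(window)
    return out


if __name__ == "__main__":
    print(to_grayscale([[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]))
    print(box_blur([[0, 0, 0], [0, 9, 0], [0, 0, 0]]))
```

A single bright pixel on a dark background is spread out and flattened by the filter, which is the effect wanted when suppressing isolated noise.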
Using this motion-based method, we obtained the results shown in Figure 4.2 and Figure 4.3. The detection is not very accurate: we tried different values for the threshold, but the results were still not accurate. This may be due to the behaviour of the tracker, which changes the background considerably. In addition, the whole hand is not in motion, which makes it difficult to recognise. We therefore decided not to use this method in the product.
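The weakness described above is easy to see in a minimal frame-differencing sketch, which is one common form of motion-based detection and is assumed here for illustration. A pixel is marked as moving only if its intensity changes by more than a threshold between two frames, so any part of the hand that stays still is missed. The threshold value and function names are hypothetical.

```python
def motion_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose grayscale intensity changed by more than `threshold`.

    Both frames are 2D lists of equal size. Returns a 2D list of 0/1 values.
    The default threshold is an assumed value, not one from the report.
    """
    return [[1 if abs(curr - prev) > threshold else 0
             for prev, curr in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]


def moving_fraction(mask):
    """Fraction of pixels marked as moving."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


if __name__ == "__main__":
    previous = [[10, 10, 10], [10, 10, 10]]
    current = [[10, 200, 10], [12, 10, 90]]
    mask = motion_mask(previous, current)
    print(mask, moving_fraction(mask))
```

The small change from 10 to 12 falls below the threshold and is ignored, which is how lighting flicker is suppressed, but the same rule also discards a stationary palm.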
Figure 4.2: Motion-based detection of different gestures in different light.
The other approach we used was skin segmentation. As indicated in section 3.1, we used the Lab colour space rather than the RGB colour space for skin detection.
We start by converting the colour space of the image to Lab using OpenCV, and then use the values of channels a and b, which are 8-bit channels and as such take values from 0 to 255, as the candidate values for thresholding the image. This produces a segmented image in which the hand is shown. We determined these threshold values from the values commonly used to represent skin colour in the Lab colour space and from experimentation. Figure 4.4 and Figure 4.5 show good results, so we use the skin colour segmentation technique in the product.
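The thresholding step can be sketched as a range test on the a and b channels. The example below works on pixels that are already in 8-bit Lab form; the threshold ranges are placeholder values chosen for illustration, not the values used in the project. In an OpenCV program the conversion would typically be done with `cv2.cvtColor(image, cv2.COLOR_BGR2LAB)` and the range test with `cv2.inRange`.

```python
# Assumed threshold ranges on the 8-bit a and b channels (illustrative only).
A_RANGE = (135, 175)
B_RANGE = (130, 180)


def skin_mask(lab_image, a_range=A_RANGE, b_range=B_RANGE):
    """Return a binary mask: 255 for skin candidates, 0 for everything else.

    `lab_image` is a list of rows of (L, a, b) tuples with 8-bit values.
    The L (lightness) channel is ignored, which makes the test less
    sensitive to changes in brightness.
    """
    a_lo, a_hi = a_range
    b_lo, b_hi = b_range
    return [[255 if a_lo <= a <= a_hi and b_lo <= b <= b_hi else 0
             for _lightness, a, b in row]
            for row in lab_image]


if __name__ == "__main__":
    pixels = [[(120, 150, 150), (200, 128, 128)],
              [(60, 150, 150), (120, 150, 200)]]
    print(skin_mask(pixels))
```

Ignoring the lightness channel is the usual motivation for choosing Lab over RGB in this kind of segmentation: a bright and a dim pixel with the same a and b values are treated alike.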
Figure 4.5: Skin segmentation hand detection of a gesture in various light.
The recognised gestures must be passed to the appropriate application. The interface module is responsible for mapping the detected hand gestures to their associated actions. The front end consists of three windows.
The first window shows the video input captured from the camera, together with the name of the gesture detected. The second shows the contours found within the input images. The third window shows the smoothed, thresholded version of the image. The threshold and contour windows are included in the graphical user interface because they make the user aware of background inconsistencies that would affect the input to the system, so that the user can adjust their laptop or desktop web camera to avoid them. This results in better performance.
Figure 4.6: Proposed Method for Our Gesture Recognition (Step 1: input is captured with the camera and the image is converted to grayscale; Step 2: noise removal and smoothening of the image is done).
CHAPTER V
RESULTS AND PERFORMANCE
The program first detects the background, so that objects kept at rest are not detected. Objects around the hand histogram are thereby ignored.
Before the histogram is applied, the user has to remove his or her hand, or any other body part, from the capture box and then press the required key on the keyboard so that any unwanted objects are detected. To apply the histogram, the user then places the hand in the histogram box and presses the required key.
5.2 STORE IMAGE INPUT
The camera feed is captured and stored in a NumPy array named 'frame'. The background is then removed from the frame by applying the background model we just created.
• Convert the frame to HSV.
• Apply a threshold to produce a binary image from the back projection. This thresholded image is used as a mask to separate the hand from the rest of the frame.
• Separate out the part contained in the rectangular capture region and discard the rest.
• First find all contours of the image, and then validate the largest contour to check whether or not it matches the profile of a hand.
• Discard all convex hull points that are too far from or too close to the hand centre obtained above.
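The last two steps in the list above can be sketched in plain Python. The example picks the largest contour by polygon area (using the shoelace formula) and then keeps only those hull points whose distance from the hand centre lies within a chosen band. The distance limits, function names and sample coordinates are hypothetical. In an OpenCV program, `cv2.findContours`, `cv2.contourArea` and `cv2.convexHull` would normally supply the contours, areas and hull points.

```python
import math


def polygon_area(points):
    """Area of a closed polygon given as (x, y) vertices (shoelace formula)."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def largest_contour(contours):
    """Return the contour enclosing the largest area."""
    return max(contours, key=polygon_area)


def filter_hull_points(hull_points, center, min_dist, max_dist):
    """Keep hull points whose distance from `center` is in [min_dist, max_dist].

    Points that are too close are likely knuckles or palm noise, and points
    that are too far are likely not part of the hand at all.
    """
    cx, cy = center
    return [(x, y) for x, y in hull_points
            if min_dist <= math.hypot(x - cx, y - cy) <= max_dist]


if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    triangle = [(0, 0), (2, 0), (0, 2)]
    print(largest_contour([triangle, square]))
    hull = [(100, 40), (105, 98), (100, 400), (160, 100)]
    print(filter_hull_points(hull, center=(100, 100), min_dist=30, max_dist=120))
```

The surviving points are fingertip candidates; counting them is one simple way to distinguish gestures by the number of extended fingers.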
RESULTS
SOME CAPTURED GESTURES USING OPENCV.
CHAPTER VI
CONCLUSIONS
6. CONCLUSION
Sign language is the only medium of communication for people with hearing and speech impairments. However, they face difficulty in communicating with those who do not understand sign language. This project involves the development of an electronic device that can translate sign language and display it on the screen, in order to make communication between the mute community and the general public possible.
Gesture-based HCI is an advanced technique for interfacing directly with computers, compared with the keyboard and mouse. Hand gestures are communicated through dynamic movements, such as hand waving, or through static poses, such as the victory sign.
6.1 REFLECTION
As indicated in the Background section, the gesture recognition approaches discussed in this report are each distinctive in their own way, and each has its advantages and disadvantages. The vision-based approach is more convenient and user-friendly, whereas the sensor-based approach is more demanding in terms of hardware and places restrictions on natural hand movement. The vision-based approach is further divided into appearance-based and model-based approaches. As far as image processing is concerned, we learned about the basic techniques for extracting essential information from images, such as adaptive thresholding, contours and moments. We also learned about different colour spaces and their suitability for particular tasks. In working with these and other techniques, we learned about OpenCV and the different functions it provides.
6.2 PROJECT CONCLUSION
Sign language can be used to communicate only if the target person understands sign language, which is not always possible. Sign language is one of the useful tools for easing communication between the deaf and mute communities and the rest of society. Our project therefore lowers such barriers. This project was intended to be a prototype to check the feasibility of recognising sign language. With this project, deaf and mute people can use the gloves to form gestures according to sign language, and the gestures will be converted to speech.
6.2.1 LIMITATION
As noted above, a few assumptions were made for the project. Beyond those, the system has a few limitations, such as its inability to detect and track the hand when the background is essentially similar to the skin colour, and its inability to detect and track the hand in extreme lighting conditions. Also, the recognition stage requires user intervention, as described earlier.
The project could further be combined with a Python-based GUI with a speech facility. The system could be developed in future to improve detection and tracking and so overcome these limitations; for example, instead of using skin colour for detection, other techniques could be used for tracking. In particular, we could work on making the recognition more autonomous and on recognising gestures continuously.
LIST OF REFERENCES
1) C. L. NEHANIV, K. J. DAUTENHAHN, M. KUBACKI, M. HAEGELE, C.
3) H. A. JALAB, "Static hand gesture recognition for human computer interaction", 1-7, 2012.
networks", 2014.
8) M. BHUIYAN, R. PICKING, "Gesture Control User Interface What Have We Done and What's Next?", 2009.
Education. Inc.
10) S. NAJI, R. ZAINUDDIN, H. A. JALAB, "Skin segmentation based on multi pixel color clustering models", 2012.
recognition for Brazilian sign language: a study using distance-based neural networks. Neural Networks, 2009.
12) G. GOMEZ, "On selecting colour components for skin detection", Pattern Recognition, 2002.
13) S. SINGH, D. CHAUHAN, M. VATSA, R. SINGH, "A robust skin color based face
14) S. UMBAUGH, Computer Vision and Image Processing: A Practical Approach Using
differential operator", 2013.
18) Meenakshi Panwar and Pawan Singh Mehra, "Hand Gesture Recognition for Human
19) Amornched Jinda-apiraksa, Warong Pongstiensak and Toshiaki Kondo, "A Simple Shape-
Algorithm for Human Computer Interaction Using Skin color & Motion Cues", 2013.
26) VAJJARAPU, LAVANYA, AKULAPRAVIN, M.S., MADHAN MOHAN, "Hand Gesture Recognition and Voice Conversation System Using Sign Language Transcription System", 2014.