Sign Language Recognition System Using TensorFlow
1 Introduction
Communication can be defined as the act of transferring information from one place,
person, or group to another. It consists of three components: the speaker, the message
that is being communicated, and the listener. It can be considered successful only when
whatever message the speaker is trying to convey is received and understood by the
listener. It can be divided into different categories as follows [1]: formal and informal
communication, oral (face-to-face and distance) and written communication, non-verbal, grapevine, feedback, and visual communication, and active listening. The
------------------------------------------------------
This is a pre-print of an accepted article for publication in proceedings of
International Conference on Advanced Network Technologies and Intelligent
Computing (ANTIC-2021), part of the book series ‘Communications in Computer
and Information Science (CCIS)’, Springer. The final authenticated version is
available online at: https://doi.org/10.1007/978-3-030-96040-7_48
2 Related work
Sign languages are defined as organized collections of hand gestures with specific meanings, employed by hearing-impaired people to communicate in everyday life [3]. Being visual languages, they use the movements of the hands, face, and body as communication mediums. There are over 300 different sign languages in use around the world [5]. Despite this variety, the percentage of the population that knows any of them is low, which makes it difficult for specially-abled people to communicate freely with everyone. Sign language recognition (SLR) provides a means to communicate in sign language without knowing it: it recognizes a gesture and translates it into a commonly spoken language like English.
SLR is a vast research topic in which a lot of work has been done, but various issues still need to be addressed. Machine learning techniques allow electronic systems to take decisions based on experience, i.e., data. Classification algorithms need two datasets – a training dataset and a testing dataset. The training set provides experience to the classifier and the model is tested using the testing set [6]. Many authors have developed efficient data acquisition and classification methods [3][7].
Based on the data acquisition method, previous work can be categorized into two approaches: the direct measurement methods and the vision-based approaches [3]. The direct measurement methods are based on motion data gloves, motion capturing systems, or sensors. The extracted motion data can supply accurate tracking of fingers, hands, and other body parts, which leads to the development of robust SLR methodologies.
The vision-based SLR approaches rely on the extraction of discriminative spatial and temporal features from RGB images. Most of the vision-based methods initially try to track and extract the hand regions before classifying them into gestures [3]. Hand detection is often achieved by semantic segmentation and skin colour detection, as skin colour is usually easy to distinguish [8][9]. However, because other body parts such as the face and arms can be mistakenly recognized as hands, recent hand detection methods also use face detection and subtraction, and background subtraction, to recognize only the moving parts in a scene [10][11]. To attain accurate and robust hand tracking, particularly in cases of obstruction, authors have employed filtering techniques, for example, Kalman and particle filters [10][12].
For data acquisition by either the direct measurement or the vision-based approaches, different devices need to be used. The primary input device employed in an SLR system is the camera [13]. Other devices are also used for input, such as the Microsoft Kinect, which provides a colour video stream and a depth video stream together; the depth data helps in background segmentation. Apart from these devices, other means of acquiring data are accelerometers and sensory gloves. Another system used for data acquisition is the Leap Motion Controller (LMC) [14][15] – a touchless controller developed by the technology company “Leap Motion”, now called “Ultraleap”, based in San Francisco. It operates at approximately 200 frames per second and can detect and track hands, fingers, and finger-like objects. Most researchers collect their training dataset by recording it from their own signers, as finding a sign language dataset is a problem [2].
Different processing methods have been used for creating an SLR system
[16][17][18]. Hidden Markov Model (HMM) has been widely used in SLR [12]. The
HMM variants that have been used include the Multi-Stream HMM (MSHMM), which is based on two standard single-stream HMMs, the Light-HMM, and the Tied-Mixture Density HMM [2]. Other processing models that have been used are neural networks [19][20][21][22][23], ANN [24], the Naïve Bayes Classifier (NBC) and Multilayer Perceptron (MLP) [14], the unsupervised neural network Self-Organizing Map (SOM) [25], the Self-Organizing Feature Map (SOFM), the Simple Recurrent Network (SRN) [26], the Support Vector Machine (SVM) [27], and a 3D convolutional residual network [28]. Researchers have also used self-designed methods like the wavelet-based method [29] and Eigen Value weighted Euclidean Distance [30].
The use of different processing methods and application systems has given different accuracy results: the Light-HMM gave 83.6% accuracy, the MSHMM 86.7%, SVM 97.5%, the Eigen Value method 97%, and the wavelet family method 100% [2][31][22][32]. Although different models have given high accuracy results, accuracy does not depend only on the processing model used; it also depends on factors such as the size of the dataset, the clarity of its images (which in turn depends on the data acquisition methods), the devices used, etc.
There are two types of SLR systems – isolated SLR and continuous SLR. In isolated SLR, the system is trained to recognize a single gesture at a time; each image is labelled to represent an alphabet, a digit, or some special gesture. Continuous SLR differs from isolated gesture classification: in continuous SLR, the system is able to recognize and translate whole sentences instead of a single gesture [33][34].
Even with all the research that has been done in SLR, many inadequacies need to be
dealt with by further research. Some of the issues and challenges that need to be worked
on are as follows [33][2][4][6].
• Isolated SLR methods require strenuous labeling for each word.
• Continuous SLR methods use isolated SLR systems as building blocks, with temporal segmentation as pre-processing, which is non-trivial and unavoidably propagates errors into subsequent steps, and sentence synthesis as post-processing.
• Devices needed for data acquisition are costly; a cheap method is needed for SLR systems to be commercialized.
• A web camera is an alternative to a higher-specification camera, but its images are blurred, so quality is compromised.
• Data acquisition by sensors also has some issues, e.g., noise, bad human manipulation, bad ground connection, etc.
• Vision-based methodologies introduce inaccuracies due to overlapping of hands and fingers.
• Large datasets are not available.
• There are misconceptions about sign languages, e.g., that sign language is the same around the world, whereas sign language is based upon the spoken language of its region.
• Indian Sign Language is communicated using hand gestures made by a single hand as well as by both hands, due to which there are two types of gestures representing the same thing.
In this paper, the dataset that will be used is created using Python and OpenCV with
the help of a webcam. The SLR system that is being developed is a real-time detection
system.
3 Data acquisition
A real-time sign language detection system is being developed for Indian Sign Language. For data acquisition, images are captured by webcam using Python and OpenCV. OpenCV provides functions which are primarily aimed at real-time computer vision. It accelerates the use of machine perception in commercial products and provides a common infrastructure for computer vision-based applications. The
OpenCV library has more than 2500 efficient computer vision and machine learning
algorithms which can be used for face detection and recognition, object identification,
classification of human actions, tracking camera and object movements, extracting 3D
object models, and many more [35].
The created dataset is made up of signs representing alphabets in Indian Sign Language [36] as shown in Fig. 1. For every alphabet, 25 images are captured to make up the dataset. The images are captured every 2 seconds, providing time to record the gesture with a slight difference each time, and a break of five seconds is given between two individual signs, i.e., a five-second interval is provided to change from the sign of one alphabet to the sign of a different alphabet. The captured images are stored in their respective folders.
For data acquisition, dependencies like cv2, i.e., OpenCV, os, time, and uuid have been imported. The os dependency is used to work with file paths; it is one of Python's standard utility modules and provides functions for interacting with the operating system. With the help of the time module in Python, time can be represented in multiple ways in code, such as objects, numbers, and strings. Apart from representing time, it can be used to measure code efficiency or to wait during code execution; here, it is used to add breaks between image captures in order to provide time for hand movements. The uuid library is used in naming the image files; it generates random 128-bit ids whose uniqueness comes from being derived from the time and the computer hardware.
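To make the capture step concrete, a minimal sketch of such a script is shown below. It follows the procedure described above (25 images per sign, a 2-second pause between captures, and a 5-second break between signs); the folder path and window name are assumptions for illustration, not the exact ones used in this work.

```python
# Sketch of the image-capture step (folder path and window name are assumptions).
import os
import time
import uuid

import cv2  # OpenCV

IMAGES_PATH = os.path.join('Tensorflow', 'workspace', 'images', 'collectedimages')  # hypothetical path
labels = [chr(c) for c in range(ord('A'), ord('Z') + 1)]  # the 26 ISL alphabet signs
number_imgs = 25  # images captured per sign

cap = cv2.VideoCapture(0)  # default webcam
for label in labels:
    os.makedirs(os.path.join(IMAGES_PATH, label), exist_ok=True)
    print(f'Collecting images for {label}')
    time.sleep(5)  # five-second break to change to the next sign
    for img_num in range(number_imgs):
        ret, frame = cap.read()
        if not ret:
            continue
        # uuid gives each file a unique name derived from time and hardware
        img_name = os.path.join(IMAGES_PATH, label, f'{label}.{uuid.uuid1()}.jpg')
        cv2.imwrite(img_name, frame)
        cv2.imshow('frame', frame)
        time.sleep(2)  # two seconds to vary the gesture slightly
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
```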
Once all the images have been captured, they are then one by one labelled using the
LabelImg package. LabelImg is a free open-source tool for graphically labelling im-
ages. The hand gesture portion of the image is labelled by what the gesture in the box
or the sign represents as shown in Fig. 2 and Fig. 3. On saving the labelled image, its
XML file is created. The XML files have all the details of the images including the
detail of the labelled portion. After labelling all the images, their XML files are available; these are later used for creating the TF (TensorFlow) records. All the images along with
their XML files are then divided into training data and validation data in the ratio of
80:20. From 25 images of an alphabet, 20 (80%) of them were taken and stored as a
training dataset and the remaining 5 (20%) were taken and stored as validation dataset.
This task was performed for all the images of all 26 alphabets.
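As an illustration, the 80:20 split can be scripted in a few lines. The sketch below assumes each image sits next to a LabelImg XML file with the same base name, and the folder names are assumptions rather than the exact layout used here.

```python
# Sketch of the 80:20 train/validation split (folder names are assumptions).
import os
import random
import shutil

SOURCE = os.path.join('Tensorflow', 'workspace', 'images', 'collectedimages')
TRAIN = os.path.join('Tensorflow', 'workspace', 'images', 'train')
TEST = os.path.join('Tensorflow', 'workspace', 'images', 'test')

for split_dir in (TRAIN, TEST):
    os.makedirs(split_dir, exist_ok=True)

for label in sorted(os.listdir(SOURCE)):
    images = [f for f in os.listdir(os.path.join(SOURCE, label)) if f.endswith('.jpg')]
    random.shuffle(images)
    cut = int(0.8 * len(images))            # 20 of 25 images go to training
    for i, img in enumerate(images):
        dest = TRAIN if i < cut else TEST   # remaining 5 go to validation
        for f in (img, img.replace('.jpg', '.xml')):  # copy the image and its LabelImg XML
            shutil.copy(os.path.join(SOURCE, label, f), os.path.join(dest, f))
```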
4 Methodology
The proposed system is designed to develop a real-time sign language detector using the TensorFlow object detection API and to train it through transfer learning on the created dataset [37]. For data acquisition, images are captured by a webcam using Python and
OpenCV following the procedure described under Section 3.
Following the data acquisition, a label map is created which is a representation of all the objects within the model, i.e., it contains the label of each sign (alphabet) along with its id. The label map contains 26 labels, each one representing an alphabet, and each label has been assigned a unique id ranging from 1 to 26. This is used as a reference to look up the class names. TF records of the training data and the testing data are then created using generate_tfrecord; these are used to train the TensorFlow object detection API. TFRecord is the binary storage format of TensorFlow. Using binary files for data storage significantly improves the performance of the import pipeline and, consequently, the training time of the model: the data takes less space on disk, copies fast, and can be read from disk efficiently.
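A minimal sketch of the label map creation is given below; the file path is an assumption, and the generate_tfrecord invocation is shown only as a comment since its exact flags depend on which variant of that script is used.

```python
# Sketch of label map creation for the 26 alphabet signs (file path is an assumption).
import string

LABEL_MAP_PATH = 'Tensorflow/workspace/annotations/label_map.pbtxt'

with open(LABEL_MAP_PATH, 'w') as f:
    for idx, letter in enumerate(string.ascii_uppercase, start=1):  # ids 1..26
        f.write('item {\n')
        f.write(f"    name: '{letter}'\n")
        f.write(f'    id: {idx}\n')
        f.write('}\n')

# TF records are then generated from the labelled images with a generate_tfrecord
# script, for example (flags vary between versions of the script):
#   python generate_tfrecord.py -x images/train -l label_map.pbtxt -o annotations/train.record
#   python generate_tfrecord.py -x images/test  -l label_map.pbtxt -o annotations/test.record
```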
The TensorFlow object detection API is an open-source framework that makes it easy to develop, train, and deploy an object detection model. It provides the TensorFlow detection model zoo, a collection of detection models pre-trained on the COCO 2017 dataset. The pre-trained model used here is SSD MobileNet v2 320x320. The SSD MobileNet v2 object detection model is combined with the FPN-lite feature extractor, shared box predictor, and focal loss, with training images scaled to 320x320. The pipeline configuration, i.e., the configuration of the pre-trained model, is set up and then updated for transfer learning so that the model can be trained on the created dataset. For configuration, dependencies like TensorFlow, config_util, pipeline_pb2, and text_format have been imported. The major update is to change the number of classes, which is initially 90, to 26, the number of signs (alphabets) that the model will be trained on.
After setting up and updating the configuration, the model was trained for 10,000 steps; the number of training steps was the main hyper-parameter set for the run. During training, the model incurs classification loss, regularization loss, and localization loss. The localization loss is the mismatch between the predicted bounding box corrections and the true values; it is a smooth L1 loss between the predicted box and the encoded ground truth box [38], given in Eq. (1) – (5):

$$L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k}\, \mathrm{smooth}_{L1}\left(l_i^{m} - \hat{g}_j^{m}\right) \tag{1}$$

$$\hat{g}_j^{cx} = (g_j^{cx} - d_i^{cx}) / d_i^{w} \tag{2}$$

$$\hat{g}_j^{cy} = (g_j^{cy} - d_i^{cy}) / d_i^{h} \tag{3}$$

$$\hat{g}_j^{w} = \log\left(g_j^{w} / d_i^{w}\right) \tag{4}$$

$$\hat{g}_j^{h} = \log\left(g_j^{h} / d_i^{h}\right) \tag{5}$$

where $N$ is the number of matched default boxes, $l$ is the predicted bounding box, $g$ is the ground truth bounding box, $\hat{g}$ is the ground truth bounding box encoded with respect to the default box $d$ as in Eq. (2) – (5), and $x_{ij}^{k}$ is the matching indicator between default box $i$ and ground truth box $j$ of category $k$.
The classification loss is defined as the softmax loss over multiple classes. The formula of the classification loss [38] is given in Eq. (6):

$$L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_i^{p}\right) - \sum_{i \in Neg} \log\left(\hat{c}_i^{0}\right) \tag{6}$$

where $\hat{c}_i^{p} = \exp(c_i^{p}) / \sum_{p} \exp(c_i^{p})$ is the softmax-activated class score for default box $i$ with category $p$, and $x_{ij}^{p}$ is the matching indicator between default box $i$ and ground truth box $j$ of category $p$.
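As a toy illustration of these two terms, the short snippet below evaluates the softmax-activated scores of Eq. (6) and a smooth L1 localization term of Eq. (1) for a single matched default box; all numbers are invented purely for illustration.

```python
# Toy computation of the SSD loss terms for one matched default box (all numbers invented).
import numpy as np

# Classification term (Eq. 6): raw class scores c_i, truncated to 4 classes for brevity.
c_i = np.array([0.2, 2.5, 0.3, 0.1])
c_hat = np.exp(c_i) / np.exp(c_i).sum()   # softmax-activated class scores
conf_loss = -np.log(c_hat[1])             # the box is matched to category 1

# Localization term (Eq. 1): smooth L1 between predicted offsets l and encoded ground truth g_hat.
l = np.array([0.10, -0.05, 0.20, 0.00])       # predicted (cx, cy, w, h) corrections
g_hat = np.array([0.12, -0.02, 0.25, 0.03])   # encoded ground truth (Eq. 2 - 5)
diff = np.abs(l - g_hat)
loc_loss = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5).sum()

print(f'confidence loss = {conf_loss:.3f}, localization loss = {loc_loss:.4f}')
```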
The different losses incurred during the experimentation are reported in the subsequent section. After training, the model is restored from the latest checkpoint created during training, which completes the model and makes it ready for real-time sign language detection.
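A compressed sketch of the configuration update and checkpoint restore described above is given below. It follows the usual TensorFlow Object Detection API workflow; the directory paths and model folder name are assumptions rather than the exact ones used in this work.

```python
# Sketch of the pipeline-config update and checkpoint restore (paths are assumptions).
import tensorflow as tf
from google.protobuf import text_format
from object_detection.protos import pipeline_pb2
from object_detection.utils import config_util
from object_detection.builders import model_builder

CONFIG_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config'
CHECKPOINT_DIR = 'Tensorflow/workspace/models/my_ssd_mobnet'

# 1. Read the SSD MobileNet v2 pipeline config and adapt it to the 26 alphabet classes.
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with tf.io.gfile.GFile(CONFIG_PATH, 'r') as f:
    text_format.Merge(f.read(), pipeline_config)
pipeline_config.model.ssd.num_classes = 26          # originally 90 (COCO classes)
with tf.io.gfile.GFile(CONFIG_PATH, 'w') as f:
    f.write(text_format.MessageToString(pipeline_config))

# 2. Training itself is run with the API's training script for 10,000 steps, e.g.
#    python model_main_tf2.py --pipeline_config_path=... --model_dir=... --num_train_steps=10000

# 3. After training, rebuild the model and restore the latest checkpoint for inference.
configs = config_util.get_configs_from_pipeline_file(CONFIG_PATH)
detection_model = model_builder.build(model_config=configs['model'], is_training=False)
ckpt = tf.train.Checkpoint(model=detection_model)
ckpt.restore(tf.train.latest_checkpoint(CHECKPOINT_DIR)).expect_partial()

@tf.function
def detect_fn(image):
    """Run the restored SSD model on a batched image tensor."""
    image, shapes = detection_model.preprocess(image)
    prediction_dict = detection_model.predict(image, shapes)
    return detection_model.postprocess(prediction_dict, shapes)
```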
Real-time detection is again done with OpenCV and the webcam. For real-time detection, the cv2 and NumPy dependencies are used. The system detects signs in real time and translates what each gesture means into English, as shown in Fig. 5. The system is tested in real time by showing it different signs. The confidence rate of each sign (alphabet), i.e., how confident the system is in recognizing that sign, is checked, noted, and tabulated for the results.
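Assuming the detect_fn and label map from the previous sketches, the real-time loop can be sketched as follows; the window name and the 0.5 score threshold are illustrative choices rather than values stated in the text.

```python
# Sketch of the real-time detection loop (threshold and window name are illustrative).
import cv2
import numpy as np
import tensorflow as tf
from object_detection.utils import label_map_util, visualization_utils as viz_utils

category_index = label_map_util.create_category_index_from_labelmap(
    'Tensorflow/workspace/annotations/label_map.pbtxt')  # assumed path

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    image_np = np.array(frame)
    input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
    detections = detect_fn(input_tensor)  # detect_fn from the previous sketch

    num = int(detections.pop('num_detections'))
    detections = {k: v[0, :num].numpy() for k, v in detections.items()}
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

    image_with_boxes = image_np.copy()
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_with_boxes,
        detections['detection_boxes'],
        detections['detection_classes'] + 1,  # label map ids start at 1
        detections['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        min_score_thresh=0.5)  # show a sign only when confidence exceeds 50%

    cv2.imshow('Sign language detection', image_with_boxes)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```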
5 Experimental Evaluation
The dataset is created for Indian Sign Language where signs are alphabets of the Eng-
lish language. The dataset is created following the data acquisition method described
in Section 3.
The experimentation was carried out on a system with an Intel i5 7th generation 2.70
GHz processor, 8 GB memory and webcam (HP TrueVision HD camera with 0.31 MP
and 640x480 resolution), running Windows 10 operating system. The programming
environment includes Python (version 3.7.3), Jupyter Notebook, OpenCV (version
4.2.0), TensorFlow Object Detection API.
The developed system is able to detect Indian Sign Language alphabets in real-time.
The system has been created using TensorFlow object detection API. The pre-trained
model that has been taken from the TensorFlow model zoo is SSD MobileNet v2
320x320. It has been trained using transfer learning on the created dataset which con-
tains 650 images in total, 25 images for each alphabet.
The total loss incurred during the last part of the training, at 10,000 steps, was 0.25; the localization loss was 0.18, the classification loss was 0.13, and the regularization loss was 0.10, as shown in Fig. 4. Fig. 4 also shows that the lowest loss of 0.17 occurred at step 9,900.
The result of the system is based on the confidence rate; the average confidence rate of the system is 85.45%. For each alphabet, the confidence rate is recorded and tabulated in Table 1. The confidence rate of the system can be increased by enlarging the dataset, which would boost the recognition ability of the system and thus improve its results.
Table 1. Confidence rate recorded for each alphabet (A–Z).
The state-of-the-art method for Indian Sign Language recognition achieves 93–96% accuracy [4]. Though highly accurate, it is not a real-time SLR system; this issue is addressed in this paper. In spite of the dataset being small, our system has achieved an average confidence rate of 85.45%.
6 Conclusion
Sign languages are visual languages that employ movements of the hands and body and facial expressions as a means of communication. They are important for specially-abled people, giving them a means to communicate and to express and share their feelings with others. The drawback is that not everyone possesses knowledge of sign languages, which limits communication. This limitation can be overcome by automated Sign Language Recognition systems that can translate sign language gestures into a commonly spoken language. In this paper, this has been done using the TensorFlow object detection API. The system has been trained on an Indian Sign Language alphabet dataset and detects sign language in real time. For data acquisition, images have been captured by a webcam using Python and OpenCV, which keeps the cost low. The developed system shows an average confidence rate of 85.45%. Though the system has achieved a high average confidence rate, the dataset it has been trained on is small and limited.
In the future, the dataset can be enlarged so that the system can recognize more ges-
tures. The TensorFlow model that has been used can be interchanged with another
model as well. The system can be implemented for different sign languages by changing
the dataset.
References
(2019). https://doi.org/10.1109/BIGDATA.2018.8622141.
21. Hore, S., Chatterjee, S., Santhi, V., Dey, N., Ashour, A.S., Balas, V.E., Shi, F.: Indian Sign
Language Recognition Using Optimized Neural Networks. Adv. Intell. Syst. Comput. 455,
553–563 (2017). https://doi.org/10.1007/978-3-319-38771-0_54.
22. Kumar, P., Roy, P.P., Dogra, D.P.: Independent Bayesian classifier combination based sign
language recognition using facial expression. Inf. Sci. (Ny). 428, 30–48 (2018).
https://doi.org/10.1016/J.INS.2017.10.046.
23. Sharma, A., Sharma, N., Saxena, Y., Singh, A., Sadhya, D.: Benchmarking deep neural
network approaches for Indian Sign Language recognition. Neural Comput. Appl. 33, 6685–6696 (2020). https://doi.org/10.1007/S00521-020-05448-8.
24. Kishore, P.V.V., Prasad, M. V.D., Prasad, C.R., Rahul, R.: 4-Camera model for sign
language recognition using elliptical fourier descriptors and ANN. Int. Conf. Signal
Process. Commun. Eng. Syst. - Proc. SPACES 2015, Assoc. with IEEE. 34–38 (2015).
https://doi.org/10.1109/SPACES.2015.7058288.
25. Tewari, D., Srivastava, S.K.: A Visual Recognition of Static Hand Gestures in Indian Sign
Language based on Kohonen Self-Organizing Map Algorithm. Int. J. Eng. Adv. Technol.
165 (2012).
26. Gao, W., Fang, G., Zhao, D., Chen, Y.: A Chinese sign language recognition system based
on SOFM/SRN/HMM. Pattern Recognit. 37, 2389–2402 (2004).
https://doi.org/10.1016/J.PATCOG.2004.04.008.
27. Quocthang, P., Dung, N.D., Thuy, N.T.: A comparison of SimpSVM and RVM for sign
language recognition. ACM Int. Conf. Proceeding Ser. 98–104 (2017).
https://doi.org/10.1145/3036290.3036322.
28. Pu, J., Zhou, W., Li, H.: Iterative alignment network for continuous sign language
recognition. Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. 2019-June,
4160–4169 (2019). https://doi.org/10.1109/CVPR.2019.00429.
29. Kalsh, E.A., Garewal, N.S.: Sign Language Recognition System. Int. J. Comput. Eng. Res.
6.
30. Singha, J., Das, K.: Indian Sign Language Recognition Using Eigen Value Weighted
Euclidean Distance Based Classification Technique. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 4 (2013).
31. Liang, Z., Liao, S., Hu, B.: 3D Convolutional Neural Networks for Dynamic Sign
Language Recognition. Comput. J. 61, 1724–1736 (2018).
https://doi.org/10.1093/COMJNL/BXY049.
32. Pigou, L., Van Herreweghe, M., Dambre, J.: Gesture and Sign Language Recognition with
Temporal Residual Networks. Proc. - 2017 IEEE Int. Conf. Comput. Vis. Work. ICCVW
2017. 2018-Janua, 3086–3093 (2017). https://doi.org/10.1109/ICCVW.2017.365.
33. Huang, J., Zhou, W., Zhang, Q., Li, H., Li, W.: Video-based Sign Language Recognition
without Temporal Segmentation.
34. Cui, R., Liu, H., Zhang, C.: Recurrent convolutional neural networks for continuous sign
language recognition by staged optimization. Proc. - 30th IEEE Conf. Comput. Vis. Pattern
Recognition, CVPR 2017. 2017-Janua, 1610–1618 (2017).
https://doi.org/10.1109/CVPR.2017.175.
35. About - OpenCV.
36. Poster of the Manual Alphabet in ISL | Indian Sign Language Research and Training Center