Recognizing Sign Language Using Machine Learning and Deep Learning Models
Abstract:- Individuals with hearing impairments communicate mostly through sign language. Our goal was to create an American Sign Language recognition dataset and utilize it in a neural network-based machine learning model that can interpret hand gestures and positions into natural language. In our study, we incorporated the SVM, CNN, and ResNet-18 models to enhance predictability when interpreting ASL signs through this new dataset, which includes provisions such as lighting and distance limitations. Our research also features comparison results between the other models, implemented under invariant conditions, and our proposed CNN model. As demonstrated by its high precision of 95.10% despite changes encountered during testing, such as varying datasets or scene configurations, with minimal loss (0.545), there is great potential for future applications in image recognition systems requiring deep learning techniques. Furthermore, these advancements may lead to significant improvements in fields related to speech-language therapy, helping people overcome challenges associated with deafness while building bridges towards improved social integration.

Keywords:- Image Recognition, Image Classification, Feature Extraction, Deep Learning, Convolutional Neural Network (CNN), Sign Language Translation, American Sign Language (ASL), Real-Time Recognition.

I. INTRODUCTION

Sign language is a type of communication used by deaf and hard-of-hearing people that relies on visual cues. While it features its own unique grammar, vocabulary, and syntax, the majority of people in the world are not fluent in sign language, which can make interaction between members who use different forms difficult at times. The advancement of machine learning technology offers a potential solution to this challenge. Machine learning involves teaching computers how to learn from experience without explicit programming or instruction. This includes training computer algorithms on extensive datasets so they can recognize recurring patterns and make predictions based on past input. Therefore, sign language recognition systems utilizing these advances offer an opportunity for bridging gaps among community members with differing abilities to communicate effectively when face-to-face interactions occur. These programs aim to interpret any sign language gesture accurately and translate it into written or spoken language as required, thereby promoting smoother integration among those involved, irrespective of individual preferences in personal communication scenarios.

This study investigates the use of a CNN model to improve sign language recognition. The goal is to break down communication barriers between people with hearing impairments and the rest of society by developing an efficient, trustworthy system capable of quickly identifying and understanding sign language motions. By training on vast amounts of American Sign Language (ASL) data, our proposed CNN will learn the subtle variations in sign characteristics needed to distinguish among numerous signs with high precision.

II. RELATED WORKS

Sign language identification is an important field of research because it facilitates communication for people with hearing difficulties. Various deep learning-based approaches have been investigated over time to improve sign language recognition systems. This literature analysis seeks to consolidate present research findings and highlight prospective future research paths in the field of sign language recognition with Convolutional Neural Networks (CNNs).

Koller, Zargaran, Ney, and Bowden (2016) presented a hybrid CNN-HMM model for continuous sign language recognition [4], which combines the discriminative abilities of CNNs with the sequence modelling capabilities of Hidden Markov Models (HMMs). They showed that their end-to-end embedding improved performance on three challenging continuous sign language recognition benchmarks, achieving relative improvements of between 15% and 38%, and up to 13.3% absolute. This study sheds light on the potential of hybrid CNN-HMM models for enhancing the overall accuracy of sign language recognition systems.

Koller, Zargaran, Ney, and Bowden (2018) [5] conducted additional research into the usage of hybrid CNN-HMMs for robust statistical continuous sign language recognition. Although data from this work are not available, it is clear that the study expands on the investigation of CNN-HMM models in sign language identification,
IJISRT24MAY500 www.ijisrt.com 93
Volume 9, Issue 5, May – 2024 International Journal of Innovative Science and Research Technology
ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24MAY500
highlighting the necessity for strong and accurate recognition systems.

Furthermore, Wadhawan and Kumar (2020) discussed the application of 3D-CNNs to train spatio-temporal features from raw video data for sign language recognition [1]. The authors used spatial attention in the network to focus on areas of interest during feature extraction, and temporal attention to choose meaningful motions for categorization. This approach demonstrates how attention mechanisms can improve the overall accuracy of sign language recognition by allowing the network to focus on essential spatiotemporal information.

Huang, Zhou, Li, and Li (2019) also investigated attention-based 3D-CNNs for large-vocabulary sign language recognition [8]. Although no specific research outcomes were presented, the attention-based method emphasizes the necessity of focusing on significant features in sign language recognition, which contributes to the investigation of advanced CNN architectures for this task.

Barbhuiya, Karsh, and Jain (2020) proposed a CNN-based feature extraction and classification method for sign language recognition [7]. While specific results were not available, the study adds to our understanding of CNNs' potential for feature extraction and classification in sign language recognition systems.

Furthermore, Masood, Srivastava, Thuwal, and Ahmad (2018) studied real-time sign language gesture detection with CNNs and Recurrent Neural Networks (RNNs) [2]. Their research focuses on detecting gestures from video sequences, demonstrating the power of merging CNNs and RNNs for real-time sign language recognition.

Additionally, Katoch, Singh, and Tiwary (2022) investigated the use of CNNs for American Sign Language identification, combining Speeded Up Robust Features (SURF) with Support Vector Machines (SVM) and CNN [6]. Their work adds to our understanding of CNNs' applicability in recognizing various sign languages.

Finally, Rastgoo, Kiani, and Escalera (2020) underlined the use of Long Short-Term Memory (LSTM) models for isolated hand sign language recognition [3]. Their method entailed linking an LSTM to the fully connected layer of a CNN, revealing the potential of combining RNNs and CNNs for sequence learning tasks in sign language recognition.

In conclusion, this literature review explores various CNN-based techniques for sign language recognition and suggests areas for future research to improve overall accuracy and robustness. Despite significant progress in sign language recognition using CNN-based approaches, there are still research gaps and areas for improvement. One key challenge is real-time recognition, as many existing studies focus only on offline scenarios. However, seamless communication requires accurate real-time applications, especially when interacting dynamically with individuals who have hearing impairments. Moreover, recognizing variations in gestures that arise due to the diversity and complexity of different sign languages poses a considerable obstacle that warrants further investigation. Furthermore, integrating multimodal approaches, such as incorporating additional physiological cues, could significantly enhance the overall accuracy and robustness of current systems, particularly in large-vocabulary scenarios common among people with varying levels of deafness or cultural backgrounds. Investigating various techniques while considering contextual factors would offer insights into generalizing models across diverse user populations beyond their native sign-language habitat. In summary, future research must prioritize addressing these knowledge gaps to realize more effective, inclusive solutions for individuals living with hearing disabilities through advanced sign language recognition technology.

III. PROPOSED METHODOLOGY

A. Dataset Description
We used two widely available datasets and one custom dataset to train the model. The American Sign Language Dataset, published on Kaggle.com, consists of 2515 JPEG files totaling 32.46 MB. It contains images of the digits 0 through 9, as well as all alphabets from A to Z, making it a comprehensive resource for ASL recognition and interpretation. The Indian sign language dataset contains 42k JPEG files, totaling 80 MB. Similar to the ASL dataset, it contains images representing the digits 0 to 9 as well as all alphabets from A to Z. Both datasets are publicly available, making them useful for study, education, and the development of sign language-related applications and technology. The third, custom dataset, also covering American Sign Language, is meticulously crafted, containing images captured in the ".JPEG" format. Spanning a considerable 320 MB, the dataset is rich with detail, consisting of 3600 files carefully organized to encompass a wide array of sign language gestures. It covers numeric hand signs from 0 to 9, as well as the complete set of alphabets in American Sign Language, offering a comprehensive resource for sign language recognition and related studies.

B. Pre-Processing
Pre-processing involves three stages. The first is resizing, which adjusts the image size according to the input requirements of each architecture. Second, scaling changes the pixel range from 0-255 to 0-1, improving numerical conditioning and speeding up training. Lastly, normalization and augmentation are applied: transformations such as random resizing and cropping from PyTorch's 'transforms' module are applied to the training and validation datasets for optimal results.
Prewitt, Canny, and Sobel are edge detection algorithms that are widely utilized as feature extraction methods in image classification. These methods detect rapid changes in intensity or color in an image, which are commonly associated with borders or boundaries between different objects or regions.

Compared with its counterparts, this method is less vulnerable to interference; hence, it is widely regarded as one of the most reliable techniques currently available for detecting an object's boundary.
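As a sketch of how such gradient-based detectors work, the Sobel and Prewitt operators can be written directly as 3x3 convolutions over a grayscale image. This is a minimal NumPy illustration, not the paper's implementation; in practice library routines such as OpenCV's `cv2.Sobel` and `cv2.Canny` would be used.

```python
import numpy as np

# Horizontal-derivative kernels: Sobel weights the centre row, Prewitt does not.
SOBEL_X   = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=float)

def gradient_magnitude(img, kx):
    """Convolve img with kx and its transpose, return the edge magnitude map."""
    ky = kx.T
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx = np.sum(patch * kx)   # horizontal intensity change
            gy = np.sum(patch * ky)   # vertical intensity change
            out[i, j] = np.hypot(gx, gy)
    return out

# Toy image: left half dark, right half bright -> a single vertical edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0

edges = gradient_magnitude(img, SOBEL_X)
print(edges[3])  # strongest response at the dark/bright boundary columns
```

The response is zero in the flat regions and peaks only where the intensity jumps, which is exactly the boundary cue these feature extractors provide to the classifiers.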
industries counting upon state-of-the-art tools available today, representing future progress moving forward.

Table 2: Architecture of ResNet-18

E. Training
During the training process, iterating over the training dataset for a specified number of epochs is crucial for optimizing model performance. Stochastic Gradient Descent (SGD) serves as the optimizer, adjusting model parameters to minimize the Cross-Entropy Loss function. Backpropagation computes the gradients of the loss and updates the model's parameters accordingly. To align with the number of classes in the dataset, the last fully connected layer is replaced, and the evaluation mode is set for testing. Data splitting involves allocating 70 percent for training and 30 percent for testing across all three datasets, ensuring robust model evaluation and generalization.

Table 3 shows the overall accuracy evolution of the models, in which accuracy can be seen to increase noticeably. CNN exhibits a significantly higher initial accuracy of 37.5%, which rapidly escalates to 99.20% over the course of the epochs, indicating its remarkable efficiency in learning and capturing complex patterns within the training data. Overall, CNN outperforms ResNet and SVM, showcasing the highest training accuracy rates across all epochs.
This trend indicates that all models effectively learn from the training data over the epochs, with CNN consistently exhibiting the most substantial decrease in loss, implying its superior ability to optimize and generalize compared to ResNet and SVM.

IV. RESULTS AND ANALYSIS

The overall accuracy comparison table illustrates the performance of three models (CNN, ResNet, and SVM) across various feature extraction methods: Canny, contour, Harris, Prewitt, watershed, and Sobel. Across the different feature extraction techniques, CNN consistently achieves competitive accuracy rates, with the highest scores observed with the Sobel (96.03%), Prewitt (95.03%), and contour (95.03%) methods.

ResNet also demonstrates strong performance, particularly excelling with the Harris feature extraction method (96.45%). SVM generally lags behind CNN and ResNet but still maintains respectable overall accuracy rates, with its highest score achieved using the Harris method (97.45%). Overall, while ResNet and SVM show notable accuracy levels, CNN consistently performs well across most feature extraction techniques, indicating its robustness and effectiveness in handling diverse image features.

The study analyzes categorization using the following parameters: recall or sensitivity (R), precision (P), F1 score (F1), accuracy (A), and error.
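These metrics follow their standard definitions. As an illustration of how they are computed from per-class true/false positive and negative counts (the counts below are made up for the sketch, not the paper's results):

```python
# Toy confusion-matrix counts for one class; illustrative values only.
tp, fp, fn, tn = 90, 10, 5, 95

precision = tp / (tp + fp)                            # P = TP / (TP + FP)
recall    = tp / (tp + fn)                            # R (sensitivity) = TP / (TP + FN)
f1        = 2 * precision * recall / (precision + recall)
accuracy  = (tp + tn) / (tp + fp + fn + tn)           # A over all predictions
error     = 1 - accuracy

print(round(precision, 3), round(recall, 3), round(f1, 3), round(accuracy, 3))
# 0.9 0.947 0.923 0.925
```

For the multi-class setting of Table 7, these quantities are computed per class and then averaged across the 36 classes.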
In terms of precision, ResNet exhibits the highest value at 95.83%, closely followed by SVM at 95.16%, with CNN slightly lower at 95.14%. In terms of overall accuracy, ResNet stands out with a score of 94.3%, followed closely by CNN at 94.74%, and SVM slightly lower at 92.26%.

Overall, ResNet consistently demonstrates competitive performance across all metrics, while CNN closely follows, and SVM lags slightly behind in terms of precision, F1 score, overall accuracy, and recall.
From Table 7, we observe a comparative analysis of three distinct classifiers (SVM, CNN, and ResNet-18) across a diverse set of 36 classes, encompassing both numerical digits (0-9) and alphabetical letters (A-Z). The ResNet architecture demonstrates a commendable level of accuracy, particularly excelling with a flawless 100% in the majority of classes. It does, however, exhibit some challenges, most notably with the numerical class '0', where it achieves only 65% accuracy, and to a lesser extent with the classes '1', 'O', 'T', and 'V'. The SVM classifier, while achieving perfect scores in several instances, shows a more erratic performance profile with significant dips in accuracy for certain classes. It struggles considerably with the numerical class '0' at 64% and the alphabetical classes 'N' at 70%, 'M' at 79%, and 'K' at 81%, indicating potential weaknesses in its classification capabilities for these particular characters.

The CNN classifier maintains a robust performance across the dataset, with perfect scores in numerous classes, but it is not without its shortcomings, as evidenced by lower accuracies in classes such as 'W', 'U', and 'R'. Despite these individual variances, the overall performance of each classifier is impressive, showcasing their ability to effectively discern and classify a wide range of characters. In summary, ResNet appears to be the most consistent and accurate across the majority of classes, with only a few instances of reduced accuracy. SVM, while achieving high accuracy in certain classes, shows more pronounced dips in performance, particularly with specific numerical and alphabetical classes. CNN generally performs well, with a few exceptions where its accuracy falls below that of ResNet.