
Journal of Soft Computing and Computational Intelligence

Vol. 1, Issue 1 (January – April, 2024) pp: (7-14)

Real-Time Sign Language Translation using the KNN Algorithm


Sridevi G M¹, Manish Kumar²*, Gourav Mishra³, Ayush Kumar⁴, Aditi Patni⁵
¹Assistant Professor, Department of Information Science and Engineering, Dayananda Sagar Academy of Technology and Management, Bengaluru, Karnataka, India
²,³,⁴,⁵Undergraduate Student, Department of Information Science and Engineering, Dayananda Sagar Academy of Technology and Management, Bengaluru, Karnataka, India
*Corresponding Author: talk2manish.me@gmail.com

Received Date: March 25, 2024; Published Date: April 06, 2024

Abstract
Addressing the imperative of bridging communication barriers between the deaf and non-verbal communities, this project centres on advancing automated American Sign Language (ASL) recognition through key-point detection-based methodologies. A comprehensive analysis of the model's efficacy is conducted, employing rigorous testing methodologies and metrics such as F1 score, precision, and recall to ascertain optimal performance. By delving into the nuances of ASL
recognition, the project seeks to enhance the accuracy and reliability of machine learning models
in deciphering sign language gestures. Additionally, the implementation of a user-friendly
graphical user interface (GUI) facilitates seamless interaction, empowering users to effortlessly
engage with the system and generate predictions utilizing the most proficient machine learning
algorithms. Through this endeavour, the aim is not only to enhance accessibility for the deaf and
non-verbal communities but also to foster inclusivity by providing a platform for effective
communication between individuals utilizing ASL and those who rely on verbal communication.
This interdisciplinary approach merges technological innovation with social responsibility, paving
the way for a more inclusive and connected society.

Keywords- American Sign Language (ASL), Effective communication, Graphical User Interface
(GUI), Implementation, Key-point

INTRODUCTION

Globally, more than 70 million individuals are affected by hearing loss, with approximately 80% residing in developing nations, as reported by the World Federation of the Deaf. Sign language stands as the preferred mode of communication for many with hearing impairments. Recognizing the significance of sign language comprehension, various methods, including computer vision, have been explored for its recognition. Advocates assert that sign language comprises distinct movements, each conveying specific meanings, facilitating communication not only with the hearing world but also among deaf and hard-of-hearing individuals.

In pursuit of enhancing sign language comprehension, we propose the development of a system leveraging support vector machines (SVM), random forest algorithms (RF), and K-nearest neighbours (KNN) for classification challenges. Additionally, we aim to establish a training program tailored for individuals interested in learning sign language. This initiative aims to alleviate barriers faced by families unable to afford formal education for their children, enabling seamless communication with those experiencing hearing difficulties. By fostering widespread adoption and understanding of sign language, our endeavour seeks to promote inclusivity and accessibility. The primary objective is to devise an optimal training tool tailored for individuals with hearing impairments, utilizing the K-nearest neighbours (KNN) algorithm.

The remainder of the paper comprises the Literature Review, Methodology, Results and Discussion, Conclusion, and References.

LITERATURE REVIEW


Barbhuiya et al. introduced a method using deep learning-based convolutional neural networks (CNN) for static sign language recognition [1]. This work specifically uses CNNs for Hand Gesture Recognition (HGR), covering the letters and numbers of American Sign Language (ASL). The article carefully discusses the advantages and disadvantages of using CNNs in the context of HGR. The CNN architecture was developed by modifying the AlexNet and VGG16 models for classification. Feature extraction uses pre-trained modified AlexNet and VGG16 architectures, whose features are then fed into a multi-modal support vector machine (SVM) classifier. Performance is demonstrated using several evaluation protocols: validity evaluation of the HGR schemes involved leave-one-out and random 70-30 cross-validation. In addition, this study investigates the recognition of individual characters and explores similarities between near-identical movements. Notably, the experiments were performed on a simple CPU system, avoiding the use of a high-end GPU. More importantly, the proposed method achieved a recognition accuracy of 99.82%, outperforming some state-of-the-art methods.

Sandrine Tornay et al. proposed a technique that delves into the realm of sign language recognition, focusing on the challenge of resource scarcity in the field [2]. The primary obstacle lies in the diversity of sign languages, each with its own vocabulary and grammar, creating a limited user base for any one of them. The paper proposes a multilingual approach, drawing inspiration from recent advancements in hand-shape modelling. By leveraging resources from various sign languages and integrating hand movement information through Hidden Markov Models (HMMs), the study aims to develop a comprehensive sign language recognition system. The research builds upon prior work that demonstrated the language independence of discrete hand movement subunits. The validation of this approach is conducted on the Swiss German Sign Language (DSGS) corpus SMILE, a German Sign Language (DGS) corpus, and the Turkish Sign Language corpus HospiSign, paving the way for a more inclusive and versatile sign language recognition technology.

Shirbhate et al. proposed a technique in their research paper that addresses the vital role of sign language as a communication medium for the deaf community, emphasizing its significance for the 466 million people worldwide with hearing loss [3]. Focusing on Indian Sign Language (ISL), the study highlights the challenges faced in developing countries, such as limited educational resources and high unemployment rates among adults with hearing loss. The research aims to bridge the gap in sign language recognition technology, specifically for ISL, by utilizing computer vision and machine learning algorithms instead of high-end technologies like gloves or Kinect. The project's primary objective is to identify alphabets in Indian Sign Language through gesture recognition, contributing to the broader accessibility and understanding of sign languages in the context of Indian communication and education.

Mishra et al., in their paper, study the problem of vision-based Sign Language Translation (SLT), which bridges the communication gap between deaf-mute and hearing people [4]. It is related to several video understanding topics that aim to interpret video into understandable text and language. Sign language is a form of communication that uses visual gestures to convey meaning. It involves using hand shapes, movements, facial expressions, and lip patterns to communicate instead of relying on sound. There are many different sign languages around the world, each with its own set of gestures and vocabulary; American Sign Language (ASL), German Sign Language (GSL), and British Sign Language (BSL) are some examples.

Hurroo et al. proposed a system for sign language recognition using convolutional neural networks and computer vision [5]. They developed the algorithm using 2D CNN models from the TensorFlow library. The convolution operation is used to extract the main features from the input image: the image is convolved with a 3 x 3 filter, and dot products between the filter weights and the frame's pixels are calculated. Pooling layers are then used to reduce the activation maps of previous layers, consolidating the learned features into more compact feature maps. This helps reduce overfitting to the training data and helps generalize the features represented by the network. The input stage of the convolutional neural network consists of 32 feature maps produced by 3 x 3 filters, with a rectified linear unit (ReLU) activation function. The max-pooling layer size is 2 x 2 and the dropout rate is set to 50%. The output is then flattened, and the last layer of the network is a fully connected output layer consisting of ten units with a Softmax activation function. Finally, the model was compiled using categorical cross-entropy as the loss function and Adam as the optimizer.
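The architecture described in [5] maps naturally onto a few lines of Keras. The sketch below is our reading of that description, not code from the cited paper; the layer sizes follow the text, but the 64 x 64 grayscale input shape is an assumption, since the text does not state one.

```python
# Sketch of the 2D CNN summarized above from [5]; layer sizes follow
# the text, but the 64x64 grayscale input shape is an assumption.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(64, 64, 1)),               # assumed input size
    layers.Conv2D(32, (3, 3), activation="relu"),  # 32 maps, 3x3 filters, ReLU
    layers.MaxPooling2D((2, 2)),                   # 2x2 max pooling
    layers.Dropout(0.5),                           # the 50% rate reported in [5]
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),        # ten output units
])
model.compile(optimizer="adam",                    # Adam optimizer
              loss="categorical_crossentropy",     # categorical cross-entropy loss
              metrics=["accuracy"])
```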


Chengji Liu et al. proposed a generalized object detection network that was developed by applying complex degradation processes, such as noise, blurring, rotation, and cropping, to the training images [6]. The model trained with the degraded training sets showed better generalization ability and higher robustness. The experiment showed that a model trained with the standard sets does not generalize well to degraded images and has poor robustness, whereas training on degraded images improved the average precision. It was shown that the average precision on degraded images was better under the general degradation model than under the standard model.

Horn et al. proposed an optical-flow technique that detects moving objects even when the camera itself is in motion [7]. It works on the pattern of light intensities in the image for detection. It can deal with a sequence of images classified as a set rather than as unshaped regions in spatial arrangements, and it is insensitive to noise and brightness levels.

Singh and Khare proposed a redundant wavelet transform (RWT/R-DWT) for image fusion in multimodal medical images [8]. In their method, they found that the shift-invariance of the R-DWT produces quality image fusions. They experimented with several multimodal MRI, CT, and PET medical images, and the results were evaluated using mutual information and strength metrics. The method was compared with spatial- and wavelet-domain fusions such as principal component analysis (PCA), the discrete wavelet transform (DWT), the lifting wavelet transform (LWT), and the discrete cosine transform (DCT), which showed that the R-DWT method was far better than the other methods for medical image fusion.

Bhanusree et al. concentrated on the second-generation wavelet transform for image fusion and investigated the quality of coefficients in various frequency regions [9]. Low-frequency coefficients are generally used within a neighbourhood to select the estimation criteria, while high-frequency coefficients are used for the window property and for observing the qualities of nearby pixels in the picture. The rationale of this work is to align the images using a multi-focus image-fusion scheme. The framework, written in C, uses a pixel-level fusion algorithm to evaluate the result on colour images, implemented on a Xilinx Spartan 3 Embedded Development Kit (EDK) field-programmable gate array (FPGA).

METHODOLOGY

The project utilizes a comprehensive methodology encompassing Gesture Training, Gesture Recognition and Translation, Prediction Output, Retraining and Copying Sentences, and an intuitive User Interface to enable real-time translation of sign language gestures with high accuracy and user-friendliness, thereby enhancing communication accessibility for sign language users. Through personalized training, advanced gesture recognition algorithms, and continuous user feedback, the system ensures efficient and adaptable communication, representing an innovative solution to bridging the communication gap for individuals reliant on sign language.

• Gesture Training: This is the initial phase where users train the system to recognize their unique sign language gestures. Each user can add as many words as they want and associate each word with a specific gesture. This personalized training allows the system to accurately interpret the sign language of individual users, taking into account the variations in how different people make the same gestures.

• Gesture Recognition and Translation: Once the gestures are trained, the system is ready to translate them in real-time. This is achieved using Google TensorFlow's implementation of the K-Nearest Neighbors (KNN) algorithm. The KNN algorithm classifies a query based on the labels of the K points (gestures, in this case) that are closest to it in the feature space. The 'closeness' is determined by a distance metric, such as Euclidean distance. The predicted words corresponding to the gestures are then passed on to the prediction output class. (A minimal sketch of this classification step appears at the end of this section.)

• Prediction Output: In the prediction output class, the system reads the sentence formed by the sequence of predicted words and displays the confidence level for each predicted word. This confidence level is essentially a measure of how closely the user's gesture matches the trained gestures for the predicted word. This feedback mechanism allows users to adjust their gestures if necessary, thereby improving the accuracy of the translation over time.


• Retraining and Copying Sentences: The system gives users the flexibility to go back and retrain words. If a user notices that a particular gesture is consistently being misinterpreted, they can retrain that sign. Users can also copy sentences formed through their hand gestures, which is particularly useful when the user needs to convey similar sentences to several people.

• User Interface: The sign language translator is equipped with an intuitive user interface, making it easy to use even for people who are not tech-savvy. The interface includes additional features that enhance the user experience, such as easy navigation, clear instructions, and visual feedback.

This comprehensive methodology ensures that the sign language translator is not only accurate and efficient but also user-friendly and adaptable to the unique needs of each user. It represents a creative and innovative solution to the challenge of translating sign language in real-time. This project has the potential to significantly improve the quality of life for people who rely on sign language for communication, as shown in Fig. 1.
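The paper reports using TensorFlow's KNN implementation and does not include source code; the following plain-NumPy sketch is offered only to make the classification and confidence steps above concrete. All names, the value of K, and the feature-vector format are illustrative assumptions, not the system's actual code.

```python
# Illustrative KNN gesture classifier: majority vote over the K nearest
# trained gesture vectors under Euclidean distance. Not the paper's code.
import numpy as np

def knn_predict(query, train_vectors, train_labels, k=5):
    """Return (word, confidence) for one gesture feature vector."""
    dists = np.linalg.norm(train_vectors - query, axis=1)  # Euclidean distance
    nearest = np.argsort(dists)[:k]                        # indices of K closest
    votes = [train_labels[i] for i in nearest]
    word = max(set(votes), key=votes.count)                # majority vote
    confidence = votes.count(word) / k                     # shown to the user
    return word, confidence

# Hypothetical usage: vectors would come from the gesture-training phase.
train_vectors = np.random.rand(20, 63)       # 20 trained gesture vectors
train_labels = ["hello", "thanks"] * 10      # word associated with each
word, conf = knn_predict(np.random.rand(63), train_vectors, train_labels)
```

Because KNN stores the training examples directly, retraining a word (as described above) amounts to replacing that word's stored vectors, with no model refit required; this is one reason a KNN classifier suits per-user gesture training.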

Figure 1: Flow chart of the process.
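The flow in Fig. 1 begins with key-point extraction from the camera frame. The paper does not name its extractor, so the sketch below uses MediaPipe Hands purely as an assumed stand-in to show how a frame could become the fixed-length feature vector consumed by the KNN step.

```python
# Hypothetical key-point extraction stage: 21 hand landmarks -> 63-dim
# feature vector. MediaPipe Hands is an assumed stand-in, not the
# extractor named in the paper.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=1)

def frame_to_vector(bgr_frame):
    """Return a 63-dim list (21 landmarks x x/y/z) or None if no hand."""
    results = hands.process(cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    landmarks = results.multi_hand_landmarks[0].landmark
    return [coord for p in landmarks for coord in (p.x, p.y, p.z)]
```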


RESULTS AND DISCUSSION

The implementation of the k-Nearest Neighbors (kNN) algorithm for sign language translation has yielded exceptional results. Through rigorous experimentation, utilizing both training and testing datasets, the system consistently achieved an 88.8% accuracy rate. This accuracy signifies the effectiveness of the kNN algorithm in recognizing and translating sign language gestures. Notably, even when using the same dataset for both training and testing, the system maintained its high accuracy, yielding a recognition rate of 88.8-91% for users who have previously contributed to the dataset.
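The abstract lists precision, recall, and F1 score alongside accuracy as evaluation metrics. A minimal way to compute them over held-out predictions is sketched below with scikit-learn; the label arrays are placeholders, not the paper's data.

```python
# Placeholder evaluation with the metrics named in the abstract;
# y_true/y_pred stand in for real held-out gesture labels.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = ["hello", "yes", "no", "yes", "thanks"]   # ground truth
y_pred = ["hello", "yes", "no", "no", "thanks"]    # model output
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
print(f"accuracy={accuracy_score(y_true, y_pred):.3f}  "
      f"precision={prec:.3f}  recall={rec:.3f}  f1={f1:.3f}")
```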

Figure 2: Start gesture for training.

Furthermore, the system's real-time translation capabilities enable immediate communication between individuals using sign language and those who may not understand it. This functionality enhances communication accessibility and fosters inclusivity, empowering individuals with hearing impairments to engage in more effective and inclusive communication, as shown in Figs. 2-7.

Figure 3: Stop gesture for training.


Figure 4: Model training gesture: three.

Figure 5: Model translating gesture: one.

Figure 6: Model translating gesture: two.


Figure 7: Model translating gesture: three.

CONCLUSION

In conclusion, the implementation of the kNN algorithm for sign language translation heralds a profound leap forward in fostering communication accessibility and inclusivity for the deaf and hard-of-hearing community. This innovative system not only boasts remarkable accuracy but also offers real-time translation capabilities, thereby facilitating seamless and effective communication between sign language users and individuals unfamiliar with the language.

Furthermore, the system's unwavering focus on usability and user experience stands as a testament to its commitment to inclusivity. By prioritizing intuitive design and user-friendly interfaces, the system ensures that individuals with diverse technical backgrounds can effortlessly harness its translation capabilities. This emphasis on accessibility extends beyond mere functionality, as it fosters a sense of empowerment and independence among users.

Moreover, the incorporation of a well-designed interface plays a pivotal role in enhancing user engagement and satisfaction. By providing an intuitive and visually appealing platform, the system not only facilitates communication but also cultivates a positive user experience. This, in turn, reinforces the system's effectiveness and fosters greater adoption among both sign language users and non-users alike.

FUTURE SCOPE

While the implementation of the K-nearest neighbours (KNN) algorithm has shown promising results in real-time sign language translation, there are several avenues for future research and development in this field. Enhanced gesture recognition can be achieved through further advancements in computer vision and machine learning techniques, such as exploring deep learning architectures like convolutional neural networks (CNNs) or recurrent neural networks (RNNs) to improve the system's ability to recognize complex hand movements and gestures. Extending the system to support multiple sign languages beyond American Sign Language (ASL) would greatly enhance its utility and accessibility on a global scale, requiring research into adapting the existing framework to different sign languages and dialects. Addressing the challenge of variability in sign language gestures among different users remains crucial, with research needed to develop techniques that account for variations in hand shapes, movements, and speeds, thus improving the system's adaptability and accuracy across diverse user populations. Integrating real-time feedback mechanisms into the system can empower users to correct misinterpreted gestures and improve overall translation accuracy, fostering interactive interfaces that provide instant feedback on gesture recognition results. Exploring the integration of sign language translation technology into mobile applications or wearable devices would enhance its accessibility and usability in various settings, requiring the development of lightweight and portable solutions that users can carry with them.


Establishing platforms for collaborative learning and data sharing among users could enrich the system's training dataset and improve its performance over time, fostering crowdsourced data collection initiatives and online communities dedicated to sign language translation. Continued efforts to ensure accessibility and inclusivity in the design and implementation of sign language translation technology are essential, conducting user-centered design research and usability testing with diverse user groups to identify and address usability barriers and enhance the overall user experience. In summary, the future of sign language translation holds tremendous potential for innovation and advancement, with efforts focused on improving accuracy, accessibility, and inclusivity to empower individuals with hearing impairments to engage more effectively in society.

REFERENCES

1. Barbhuiya, A. A., Karsh, R. K., & Jain, R. (2021). CNN based feature extraction and classification for sign language. Multimedia Tools and Applications, 80(2), 3051-3069. https://doi.org/10.1007/s11042-020-09829-y
2. Tornay, S., Razavi, M., & Doss, M. M. (2020, May). Towards multilingual sign language recognition. In ICASSP 2020 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6309-6313). IEEE. https://doi.org/10.1109/ICASSP40776.2020.9054631
3. Shirbhate, R. S., Shinde, V. D., Metkari, S. A., Borkar, P. U., & Khandge, M. A. (2020). Sign language recognition using machine learning algorithm. International Research Journal of Engineering and Technology (IRJET), 7(03), 2122-2125.
4. Mishra, D., Tyagi, M., Gupta, A., & Dubey, G. (2020). Sign language translator. International Journal of Advanced Science and Technology, 29(5s), 246-253. http://sersc.org/journals/index.php/IJAST/article/view/7129
5. Hurroo, M., & Walizad, M. E. (2020). Sign language recognition system using convolutional neural network and computer vision. International Journal of Engineering Research & Technology, 9(12), 59-64. https://www.ijert.org/research/sign-language-recognition-system-using-convolutional-neural-network-and-computer-vision-IJERTV9IS120029.pdf
6. Liu, C., Tao, Y., Liang, J., Li, K., & Chen, Y. (2018, December). Object detection based on YOLO network. In 2018 IEEE 4th Information Technology and Mechatronics Engineering Conference (ITOEC) (pp. 799-803). IEEE. https://doi.org/10.1109/ITOEC.2018.8740604
7. Horn, B. K., & Schunck, B. G. (1981). Determining optical flow. Artificial Intelligence, 17(1-3), 185-203.
8. Singh, R., & Khare, A. (2013). Multiscale medical image fusion in wavelet domain. The Scientific World Journal, 2013. https://doi.org/10.1155/2013/521034
9. Bhanusree, C., & Ratna Chowdary, P. A. (2013). A novel approach of image fusion of MRI and CT images using wavelet family. International Journal of Application or Innovation in Engineering & Management (IJAIEM), 2(8), 1-4.

CITE THIS ARTICLE

Manish Kumar, et al. (2024). Real-Time Sign Language Translation using the KNN Algorithm, Journal of Soft Computing and Computational Intelligence, 1(1), 7-14.
