This document summarizes a project to develop a sign language recognition system using Python and OpenCV. The system is trained on a dataset of images of the numbers 1-10 in American Sign Language. Images of hand gestures are captured and preprocessed, features are extracted, and a convolutional neural network (CNN) model is trained on them. The trained model can then recognize the numbers 1-10 in new sign language images. The goal is to help deaf and speech-impaired people communicate by converting sign language to text. Computer vision and deep learning techniques such as CNNs are applied to image processing steps such as segmentation, feature extraction, and classification.
Sign Language Recognition
Using Python and OpenCV
Group members: Ketan Bhoir, Bhushan Galande, Prashant Jadhav, Sandip Suryavanshi

Introduction
There have been several advancements in technology, and a lot of research has been done to help people who are deaf and speech-impaired. Deep learning and computer vision can also be used to make an impact on this cause. This can be very helpful for deaf and speech-impaired people in communicating with others, since knowing sign language is not common to all; moreover, this can be extended to creating automatic editors, where a person can write using only hand gestures. In this sign language recognition project, we create a sign detector that detects the numbers 1 to 10 and can easily be extended to cover a vast multitude of other signs and hand gestures, including the alphabet. We developed this project using the OpenCV and Keras modules of Python.

Aim
Our aim is to produce a model that can recognize hand gestures and signs. We will train a simple gesture recognition model for sign language conversion; this will help people converse with those who are deaf or speech-impaired. This project can be implemented in several ways, such as KNN, logistic regression, Naïve Bayes classification, or support vector machines, and it can also be implemented with a CNN. The method we have chosen is CNN, as it gives better accuracy than the other methods. A computer program developed in the Python language is used to train the model based on the CNN algorithm. The program recognizes hand gestures by comparing the input with a pre-existing dataset built from American Sign Language, and converts sign language into text output so users can read the signs presented by the sign language speaker.

IMAGE PROCESSING
Image processing is a method of performing operations on an image in order to obtain an enhanced image or to extract useful information from it. It is a type of signal processing in which the input is an image and the output may be an image or characteristics/features associated with that image. Nowadays, image processing is among the most rapidly growing technologies, and it forms a core research area within the engineering and computer science disciplines. Image processing basically includes the following three steps:
• Importing the image via image acquisition tools.
• Analysing and manipulating the image.
• Output, in which the result can be an altered image or a report based on the image analysis.
There are two types of methods used for image processing: analogue and digital. Analogue image processing can be used for hard copies such as printouts and photographs; image analysts use various fundamentals of interpretation while applying these visual techniques. Digital image processing techniques help in manipulating digital images using computers. The three general phases that all types of data undergo with the digital technique are pre-processing, enhancement and display, and information extraction.

Digital Image Processing
Digital image processing means processing a digital image by means of a digital computer. We can also say that it is the use of computer algorithms to obtain an enhanced image or to extract useful information from it.
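As a minimal illustration of the three image processing steps listed above, the following Python/OpenCV sketch imports an image, manipulates it, and outputs an altered image; the file names here are placeholders, not paths from the project.

import cv2

# 1. Import the image via an image acquisition tool (here, reading from disk).
image = cv2.imread("gesture.png")

# 2. Analyse and manipulate the image (convert to grayscale, then blur).
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (7, 7), 0)

# 3. Output: the result here is an altered image written back to disk.
cv2.imwrite("gesture_processed.png", blurred)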
Pattern recognition
Pattern recognition is the automated recognition of patterns and regularities in data. It has applications in statistical data analysis, signal processing, image analysis, information retrieval, bioinformatics, data compression, computer graphics, and machine learning. This capability is important because in some conditions it is necessary to separate objects from an image, as the image may include several objects. It consists of three phases. The first phase detects and separates the objects from the background. The second phase is feature selection, in which the object is measured. The third phase is classification, which determines the category to which the object belongs.

[Figure: Block Diagram of a Pattern Recognition System]

SIGN LANGUAGE AND HAND GESTURE RECOGNITION
The process of converting the signs and gestures shown by a user into text is called sign language recognition. It bridges the communication gap between people who cannot speak and the general public. Image processing algorithms along with neural networks are used to map each gesture to the appropriate text in the training data, so that raw images/videos are converted into text that can be read and understood.

PROBLEM STATEMENT
Speech-impaired people use hand signs and gestures to communicate, and hearing people face difficulty in understanding their language. Hence there is a need for a system that recognizes the different signs and gestures and conveys the information to hearing people, bridging the gap between physically challenged people and others.

METHODOLOGY
First we need to collect all the required data. Our system uses a web camera to detect hand gestures. The captured image has many unwanted objects in the background, which are detected and removed as shown in the figure below.

Training Dataset:
First we have to collect, or feed in, the images that will be detected by the software. The images can be taken from the internet or from a live video camera, and are saved in the dataset directory of alphabets and numbers, which is further divided into two folders, one for training and the other for testing. Each time a capture command is given, the obtained image is saved in PNG format. We have to make sure the collected images are of high quality. A simplified capture loop is sketched below.
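The following is a simplified sketch of such a capture loop using OpenCV; the ROI coordinates, folder layout, key bindings, and frame count are illustrative assumptions rather than the project's exact create_gesture_data.py.

import cv2
import os

label = "1"                                   # gesture class being recorded
save_dir = os.path.join("gesture", "train", label)
os.makedirs(save_dir, exist_ok=True)

cam = cv2.VideoCapture(0)
count = 0
while count < 100:                            # capture 100 frames for this class
    ret, frame = cam.read()
    if not ret:
        break
    roi = frame[100:300, 350:550]             # fixed region of interest
    cv2.imshow("ROI", roi)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("c"):                       # press 'c' to capture a frame
        cv2.imwrite(os.path.join(save_dir, f"{count}.png"), roi)
        count += 1
    elif key == ord("q"):                     # press 'q' to quit early
        break

cam.release()
cv2.destroyAllWindows()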
Image Pre-processing & Segmentation:
The captured image contains mixed colours from the background environment, which makes it difficult to recognize the gesture. Therefore it is necessary to convert the image into the HSV colour space, which increases the accuracy of detection, and then transform it to grayscale. The segmentation process then takes place, which also enhances the robustness of the system to changes in lighting or illumination. Non-black pixels in the transformed image are binarised while the others remain unchanged; at the end of the process the image is resized, so that the white area is the hand gesture and the black area is the rest.
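Here is a minimal sketch of this stage in OpenCV; the HSV skin-tone bounds, input file name, and 64x64 target size are illustrative assumptions that would need tuning for real lighting conditions.

import cv2
import numpy as np

roi = cv2.imread("roi.png")                   # placeholder captured frame

# Convert to the HSV colour space and mask a skin-tone range
# (the bounds below are assumptions and depend on lighting and skin tone).
hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, np.array([0, 30, 60]), np.array([20, 150, 255]))

# The mask is already binary: hand pixels white, background black.
# Resize to a fixed size for the classifier (64x64 is an assumption).
segmented = cv2.resize(mask, (64, 64))
cv2.imwrite("segmented.png", segmented)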
Feature Extraction:
One of the most crucial parts of image processing is selecting and extracting important features from an image. Images, when captured and stored as a dataset, usually take up a great deal of space because they comprise a huge amount of data. Feature extraction helps solve this problem by reducing the data after the important features have been extracted automatically. It also helps maintain the accuracy of the classifier and simplifies its complexity. In our case, the features found to be crucial are the binary pixels of the images [2]. One way to realise this representation is sketched below.
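A short sketch of this binary-pixel representation, assuming a 64x64 segmented input (the file name and size are placeholders):

import cv2
import numpy as np

# Load the segmented gesture image produced in the previous stage.
img = cv2.imread("segmented.png", cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (64, 64))

# Reduce each pixel to a binary value; these binary pixels are the features.
features = (img > 127).astype(np.float32)

# Flatten for a classical classifier, or keep the 2-D shape when feeding a CNN.
feature_vector = features.flatten()           # shape: (4096,) for a 64x64 input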
SOFTWARE
• TensorFlow
• OpenCV
• Keras
• NumPy
• IDE (Jupyter)
• Python

Example: Steps to Develop the Sign Language Recognition Project
This is divided into 3 parts:
1. Creating the dataset
2. Training a CNN on the captured dataset
3. Predicting the data
All of these are created as three separate .py files; the file structure is shown below. [Figure: project file structure]

1. Creating the Dataset
It is quite possible to find the dataset we need on the internet, but in this project we create the dataset ourselves. We take a live feed from the video camera, and every frame that detects a hand in the ROI (region of interest) is saved in a directory (here, the gesture directory) that contains two folders, train and test, each containing 10 folders of images captured using create_gesture_data.py.

2. Training the CNN (Convolutional Neural Network)
Now we train a CNN on the created dataset. First, we load the data using Keras's ImageDataGenerator, whose flow_from_directory function loads the train and test set data, with the name of each number folder serving as the class name for the loaded images. The plotImages function is for plotting images of the loaded dataset. We then design the CNN as follows (other hyperparameters can be used, depending on some trial and error); a hedged sketch of steps 2 and 3 appears after the application list below.

3. Predicting the Gesture
Here we create a bounding box for detecting the ROI and calculate the accumulated average, as we did when creating the dataset; this is done to identify any foreground object. We then find the max contour; if a contour is detected, a hand is present, so the thresholded ROI is treated as a test image. We load the previously saved model using keras.models.load_model and feed the thresholded image of the ROI containing the hand to the model as input for prediction.

Application
1. Sign language recognition is a breakthrough for helping deaf-mute people and has been researched for many years.
2. In sign language recognition, image processing is used to better extract features from input images.
3. Visualization: observe objects that are not visible.
4. Image sharpening and restoration: create a better image.
5. Image retrieval: seek the image of interest.
6. Measurement of pattern: measure various objects in an image.
7. Image recognition: distinguish the objects in an image.
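To make steps 2 and 3 concrete, here is a hedged sketch using TensorFlow/Keras. The slide deck does not reproduce the exact architecture, so the layer sizes, image size, epochs, and file paths below are illustrative assumptions, not the project's definitive implementation.

from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import cv2

# Step 2: load the captured dataset; each number folder's name becomes a class.
datagen = ImageDataGenerator(rescale=1.0 / 255)
train = datagen.flow_from_directory("gesture/train", target_size=(64, 64),
                                    color_mode="grayscale", class_mode="categorical")
test = datagen.flow_from_directory("gesture/test", target_size=(64, 64),
                                   color_mode="grayscale", class_mode="categorical")

# One plausible CNN design (hyperparameters are assumptions, per the text).
model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    Dense(10, activation="softmax"),          # 10 classes: the numbers 1-10
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train, epochs=10, validation_data=test)
model.save("gesture_model.h5")

# Step 3: load the saved model and classify a thresholded ROI image.
model = load_model("gesture_model.h5")
roi = cv2.imread("segmented.png", cv2.IMREAD_GRAYSCALE) / 255.0
pred = model.predict(roi.reshape(1, 64, 64, 1))
print("Predicted class:", list(train.class_indices)[int(np.argmax(pred))])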
References
[1] Skeleton Aware Multi-modal Sign Language Recognition, arXiv:2103.08833v5, 2021.
[2] Mehreen Hurroo and Mohammad Elham Walizad, Sign Language Recognition System using Convolution Neural Network and Computer Vision, Vol. 9, Issue 12, December 2020.
[3] Yuecong Min, Aiming Hao, Xiujuan Chai, and Xilin Chen, Visual Alignment Constraint for Continuous Sign Language Recognition, arXiv:2104.02330v2, 18 Aug 2021.
[4] Ashok K Sahoo, Gouri Sankar Mishra, and Kiran Kumar Ravulakollu, Sign Language Recognition: State of the Art, ISSN 1819-6608, Vol. 9, No. 2, February 2014.
[5] www.wikipedia.org

Thank you