
11 V May 2023

https://doi.org/10.22214/ijraset.2023.53004
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue V May 2023- Available at www.ijraset.com

Text Reader for Visually Impaired Person using Image Processing/Open-CV
Aparna. V. Mote1, Rutuja Akhare2, Vaishnavi Barde3, Gitanjali Bhujbal4
1Assistant Professor, 2Student, Department of Computer Engineering, Zeal College of Engineering and Research, Narhe, Pune,
Maharashtra, India

Abstract: The main issue that visually impaired individuals confront these days is that they are unable to do text recognition on
their own, forcing them to rely on others for day-to-day tasks such as reading newspapers, writing letters, referring to books, and
so on. This issue may erode their confidence because they are unable to cope on their own. The project's ultimate goal is to assist
visually challenged persons with text recognition. This goal is accomplished by creating a module that converts text into speech
and speaks it through the provided headphone/speaker. The code is written in Python after importing pytesseract and gtts. For
character recognition, this project employs the concept of image processing and the OCR approach.
Keywords: Text to speech, Image to Text, Optical Character Recognition, gTTS and Speech output, Python Programming.

I. INTRODUCTION
The project's ultimate goal is to assist visually challenged persons in recognizing text. When printed text is held in front of the
webcam, the system must capture the image, extract the text from the image, and read the text out via the computer's audio or a
headphone. The code is written in Python and imports pytesseract and gtts. For character recognition, this project employs image
processing and the OCR approach. The main issue that visually impaired persons have these days is that they are unable to do text
recognition on their own, forcing them to rely on others for day-to-day tasks such as reading newspapers, writing letters, and
referring to websites and books. This problem may reduce their confidence, as they cannot manage independently. The goal is
accomplished by creating a module that converts text into speech and speaks it through the provided headphone/speaker. The image
is captured using the system's webcam, and the text is extracted using the built-in application. The extracted text is then split
into words and spoken aloud through headphones or the system's audio. Python offers PIL (Python Imaging Library), which is used
for simple image operations such as creating thumbnails, resizing, rotating, and converting between different file formats.
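
The simple image operations mentioned above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the Pillow package is installed, and the helper names fit_within and prepare_for_ocr, the 1600-pixel limit and the grayscale PNG output are hypothetical choices made for the example.

```python
def fit_within(size, max_side):
    """Scale (width, height) down to fit a max_side square, keeping the aspect ratio."""
    width, height = size
    scale = min(max_side / width, max_side / height, 1.0)  # never enlarge
    return max(1, round(width * scale)), max(1, round(height * scale))


def prepare_for_ocr(src_path, dst_path, max_side=1600, angle=0):
    """Open an image, report its properties, then resize, rotate and convert it.

    Hypothetical helper: the limits and the output format are assumptions.
    """
    from PIL import Image  # imported lazily; requires the Pillow package

    with Image.open(src_path) as img:
        # format, mode and size are the properties described in the text
        print(img.format, img.mode, img.size)
        resized = img.resize(fit_within(img.size, max_side))
        if angle:
            resized = resized.rotate(angle, expand=True)
        # convert to 8-bit grayscale and change the file format on save
        resized.convert("L").save(dst_path, format="PNG")
    return dst_path
```

A call such as prepare_for_ocr("page.jpg", "page.png") would shrink a large photograph before it is passed on for recognition; the file names here are placeholders.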

II. OBJECTIVE
This project is designed to overcome the limitations of Braille using IoT technology. It is built on a small, low-cost
single-board computer, the Raspberry Pi. The visual data is sent to the single-board computer over a WiFi connection. The image is
processed to perform image-to-text and text-to-voice conversion using converters available online. The book reader captures
pictures of book pages using a camera and then processes the images using OCR software. Once the text in the image is
recognized, the book reader reads it aloud. Thus blind people, or those with low vision, can hear the text without needing to touch
it with their fingertips, as Braille requires.

This system has the following modules:
1) Requirements Planning
2) Pre-processing
3) Character recognition
4) Development
5) Text to Speech Synthesis

III. LITERATURE SURVEY
"OCR based facilitator for the visually challenged": This paper encouraged us to take up this project. From it we learned that
many people are affected by blindness or visual impairment (BVI). It also gave us a brief idea of OCR technology and of
implementation details that were very useful [2]. We used it as a reference and have tried to approach the problem in an efficient way.

©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 5787

"Smart Reader for Visually Impaired People Using Raspberry Pi": This paper proposes how to convert an image into text and text
into audio. It also gives complete information about the hardware and software implementation of a reader for the blind.
The software implementation and programming details, along with the details of the OCR engine, were very useful. The paper
gave detailed information about which engines to use for image-to-text and text-to-speech conversion.
"OCR based automatic book reader for the visually impaired using Raspberry PI": This paper provided a case study. From it we
learned to build a system for the English language and realized that the same can be done for other languages, which we have listed
as a future advancement [3]. The system accepts a page of printed text with English numerals and scans it into a digital document,
which is then subjected to skew correction and segmentation before feature extraction and classification. Once classified, the
text is read out by a text-to-speech conversion unit.
An innovative, efficient, real-time and cost-effective technique that enables the user to hear the contents of text images instead
of reading through them has been introduced [6]. It combines Optical Character Recognition (OCR) and a Text to Speech
Synthesiser (TTS) on a Raspberry Pi.
"... Text Image using Raspberry Pi": Optical character recognition is used to digitize and reproduce texts that were produced with
non-computerized systems. Digitizing texts also helps reduce storage space [7].
The design and implementation of an automatic scene text detection and recognition system for visually impaired people has been
discussed. Combining different techniques for text detection and extraction yields a more accurate system than using a single
technique. Text recognition is performed successfully using a pattern matching technique. After successful recognition, the text
is converted into audio output. A prototype system that reads printed text on hand-held objects to assist blind people is
proposed [9]. To extract text regions from complex backgrounds, a novel text localization algorithm based on models of stroke
orientation and edge distributions is adopted. An image-to-speech conversion technique using Raspberry Pi was implemented. The
output has been tested using different samples. The algorithm successfully processes the image and reads it out clearly.
IV. METHODOLOGY
The proposed system is a software module that takes input from the system's inbuilt camera or a webcam, extracts the text content
using the code developed, converts the text to speech, and reads it out through the headphone/speaker. This project removes the
need for a Raspberry Pi board, which is considered one of the greatest advantages of the proposed system.
Speech and text are the main media for human communication. A person needs vision to access the information in a text. However,
those who have poor vision can gather information from voice. This paper proposes camera-based assistive text reading to help
visually impaired persons read the text present in a captured image. The proposed idea involves reading the image with OpenCV
(the cv2 Python library), extracting text from the scanned image using Tesseract Optical Character Recognition (OCR), and
converting the text to speech with gtts (Google Text To Speech), a process that enables visually impaired persons to access the
text. This is a prototype for blind people to recognize products in the real world by extracting the text on an image and converting
it into speech. The proposed method requires only the installation of software, which makes it more portable and less expensive.
Optical character recognition (OCR) systems provide persons who are blind or visually impaired with the capacity to scan printed
text and then have it spoken in synthetic speech or saved to a computer file. There are three essential elements to OCR
technology: scanning, recognition, and reading text. The data that we collect or generate is mostly raw data, i.e. it is not fit to
be used directly in applications.
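
The capture, recognition and speech steps described in this section can be sketched as one small pipeline. This is an illustrative sketch under stated assumptions, not the code used in the project: it assumes the opencv-python, pytesseract and gTTS packages, an installed Tesseract engine and network access for gTTS. The function names, the default camera index 0 and the file name speech.mp3 are hypothetical.

```python
def capture_frame(camera_index=0):
    """Grab a single frame from the webcam (requires opencv-python)."""
    import cv2

    cam = cv2.VideoCapture(camera_index)
    try:
        ok, frame = cam.read()
    finally:
        cam.release()
    if not ok:
        raise RuntimeError("could not read a frame from the camera")
    return frame


def extract_text(frame):
    """Run Tesseract OCR on a BGR frame (requires pytesseract and Tesseract)."""
    import cv2
    import pytesseract

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray)


def speak(text, out_path="speech.mp3"):
    """Save the text as spoken audio with gTTS (requires network access)."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
    return out_path


def normalize(text):
    """Collapse the line breaks and extra spaces that OCR output tends to contain."""
    return " ".join(text.split())


def read_aloud(capture=capture_frame, ocr=extract_text, tts=speak):
    """Capture an image, extract its text and hand it to the speech stage.

    The three stages are passed in as parameters so each can be replaced,
    for example by a stub during testing. Returns None if no text was found.
    """
    text = normalize(ocr(capture()))
    if not text:
        return None
    return tts(text)
```

Playing the resulting audio file is left out because it depends on the operating system; the sketch only produces the file.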

V. MODELING AND ANALYSIS

Hardware Requirement:
• Webcam / inbuilt camera with the system.
• Headphone / speaker.

A. Software Requirement
1) Python 3.8.1
2) Python modules imported: pytesseract, gtts, os.

B. Technologies Used
1) Image processing
2) OCR technique (Optical Character Recognition)
3) gTTS (Google Text To Speech Converter)

C. Image Processing
OpenCV is an image processing library mainly focused on real-time computer vision, with applications in a wide range of areas such
as 2D and 3D feature toolkits, facial and gesture recognition, human-computer interaction, mobile robotics, object identification
and others. The image processing is done using the OpenCV library (cv2). To perform basic operations on images, such as creating
thumbnails, resizing, rotating, and converting between different file formats, we use PIL. The image is loaded directly using the
open() function of the Image class. This returns an image object that contains the pixel data for the image as well as details
about the image. The format property of the image reports the image format (e.g. PNG, JPEG), the mode reports the pixel channel
format (e.g. CMYK or RGB), and the size reports the dimensions of the image in pixels (e.g. 400 x 260). The show() function
displays the image using the operating system's default application. One of the most popular image processing libraries for
Python, often treated as the default, is Pillow. Pillow is a maintained fork of the Python Imaging Library (PIL) and supports a
range of simple and advanced image manipulation functionality. It is also the basis for simple image support in other Python
libraries such as SciPy and Matplotlib.

OCR Technique: Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of
images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a
scene photo or subtitle text superimposed on an image. It deals with recognizing text in image files and storing it in a text
file. Here, we process the images and convert them into text. Once we have the text as a string variable, we can do any processing
on it. Optical character recognition involves detecting the text content of images and translating it into encoded text that the
computer can easily handle. An image containing text is scanned and analyzed in order to identify the characters in it. Upon
identification, each character is converted to machine-encoded text. The image is first split into zones identifying the areas of
interest, such as where the images or text are, and this starts the extraction process. The areas containing text are then broken
down further into lines, words and characters, and the software matches the characters through comparison and various detection
algorithms. The final result is the text contained in the given image.

The fundamental information gathered from web sources is still presented in its unprocessed state as statements, numbers, and
qualitative phrases. There are mistakes, omissions, and discrepancies in the raw data. After carefully examining the filled
questionnaires, modifications are necessary. Processing the primary data involves the following steps. Field surveys generate a
tremendous amount of raw data, which must be classified according to the similarity of the individual responses. Data
preprocessing is a method for transforming unclean data into clean data sets. In other words, whenever data are collected from
several sources, they are combined into a raw format that is not useful for analysis. As a result, specific actions are taken to
reduce the data to a manageable and clean collection. This technique is performed before the execution of iterative analysis, and
this set of steps is known as data preprocessing. The subsequent stages include data cleaning, preprocessing, feature extraction
and classification.

Two modules, User and Doctor, have been developed. The incremental build model is a method of software development where the
product is designed, implemented, and tested incrementally (a little more is added each time) until the product is finished. It
involves both development and maintenance.
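
The step in which text areas are broken down into lines and then into smaller units can be illustrated with a projection profile, a classical segmentation technique. The sketch below is only an illustration in plain Python: it is not how Tesseract is implemented internally, and it assumes the image has already been binarized into rows of 0 and 1 values, where 1 marks ink. All function names are hypothetical.

```python
def split_runs(flags):
    """Return (start, end) index pairs for each run of consecutive True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                      # a run begins
        elif not flag and start is not None:
            runs.append((start, i))        # a run ends just before i
            start = None
    if start is not None:
        runs.append((start, len(flags)))   # run reaches the last index
    return runs


def segment_lines(binary):
    """Split a binary image into text lines: runs of rows that contain ink."""
    return split_runs([any(row) for row in binary])


def segment_columns(binary, top, bottom):
    """Within rows [top, bottom), find runs of columns that contain ink."""
    width = len(binary[0])
    has_ink = [any(binary[r][c] for r in range(top, bottom)) for c in range(width)]
    return split_runs(has_ink)
```

Each (start, end) pair is half-open, so binary[start:end] selects the rows of one line. Applying segment_columns to a line gives the character or word candidates that a recognizer would then compare against its stored patterns.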

Fig 1. OCR PROCESS FLOW

D. Google Text To Speech Converter
gTTS (Google Text-to-Speech) is a Python library and CLI tool that interfaces with Google Translate's text-to-speech API. Several
APIs are available to convert text to speech in Python; one of them is the Google Text to Speech API, commonly known as the
gTTS API. gTTS is a very easy-to-use tool that converts the entered text into audio, which can be saved as an mp3 file.


The gTTS API supports several languages including English, Hindi, Tamil, French, German and many more. The speech can be
delivered at either of the two available audio speeds, normal or slow. However, as of the latest update, it is not possible to
change the voice of the generated audio.
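
The language and speed options described above correspond to the lang and slow arguments of the gTTS class. The following is a minimal sketch, assuming the gTTS package is installed and the machine has network access; the wrapper name synthesize and the file name output.mp3 are hypothetical.

```python
def synthesize(text, out_path="output.mp3", lang="en", slow=False):
    """Convert text to an MP3 file with gTTS and return the file path.

    lang is a gTTS language code such as "en", "hi" or "fr";
    slow=True selects the slower of the two speaking speeds.
    """
    if not text.strip():
        # checked here so the caller gets a clear error before any network call
        raise ValueError("nothing to speak")

    from gtts import gTTS  # imported lazily; requires gTTS and network access

    gTTS(text=text, lang=lang, slow=slow).save(out_path)
    return out_path
```

For example, synthesize("नमस्ते", "greeting.mp3", lang="hi", slow=True) would request slow Hindi speech. The empty-text check matters in this project because OCR can return nothing when no text is in view.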
VI. EXISTING SYSTEM
Today there is a growing demand from users to convert printed documents into electronic documents in order to keep their data
secure. Hence the basic OCR system was invented to convert the data available on paper into computer-processable documents, so
that the documents can be edited and reused. The existing OCR system on a grid infrastructure is mostly based on an expensive and
complex hardware setup. This raises the cost of the system, so its use is limited by its availability and affordability to blind
persons. In the existing system, the images are refined in order to eliminate any noise present in them. Segmentation is used to
separate each character from the others in the text. Graphical details such as icons or logos, if any, are eliminated. Each
obtained character is compared with the datasets that are part of the Tesseract library. Tesseract OCR is regarded as the most
efficient algorithm available and checks the obtained character in ten dimensions. Once the character is recognized, it must be
made available as audio output. For this, a software package called Festival is used, which provides the audio output for the
recognized character. Apart from these features, an extra feature is added that lets the blind user know the type of object he or
she is interacting with (a menu, a newspaper and the like). An ultrasonic sensor is included as part of the project, so that
characters are obtained only within a particular distance.

VII. PROPOSED WORK


The suggested system is a software module that accepts input from the user, extracts the text content using the code produced,
converts the text to speech, and reads it out using the headphone/speaker. Our system can read multiple languages, e.g. Hindi,
English and Russian. This project eliminates the use of the Raspberry Pi board, which is regarded as one of the most significant
advantages of the suggested system. The proposed idea entails reading the image with OpenCV (the cv2 Python library), extracting
text from the scanned image using Tesseract Optical Character Recognition (OCR), and converting the text to speech using gtts
(Google Text To Speech), a process that allows visually impaired people to access the text. The proposed solution is carried out
simply by installing software, making it more portable and less expensive. Those who are blind or visually handicapped can use
optical character recognition (OCR) equipment to scan printed text and have it read in synthetic speech or saved to a computer
file. The data we collect or generate is generally raw data, which means it cannot be used directly in applications for a
variety of reasons. As a result, we must first examine it, then do the necessary processing, and then use it for modeling and analysis.
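
Reading several languages requires that the OCR stage and the speech stage agree on the language, and the two libraries use different codes for it. The sketch below shows one way this could be wired up. It is an assumption-laden illustration rather than the project's code: the table covers only the three languages named above, the function names are hypothetical, and the matching Tesseract language data (eng, hin, rus) must be installed separately.

```python
# Language name -> (Tesseract language code, gTTS language code).
# Illustrative subset only; both tools support many more languages.
LANGUAGES = {
    "english": ("eng", "en"),
    "hindi": ("hin", "hi"),
    "russian": ("rus", "ru"),
}


def language_codes(name):
    """Look up the OCR and speech codes for a language name."""
    try:
        return LANGUAGES[name.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported language: {name!r}") from None


def read_image_in(image, language, out_path="speech.mp3"):
    """Recognize text in the given language and save it as speech.

    Requires pytesseract, the Tesseract language data and gTTS.
    Returns the audio path, or None if no text was recognized.
    """
    import pytesseract
    from gtts import gTTS

    ocr_code, tts_code = language_codes(language)
    text = pytesseract.image_to_string(image, lang=ocr_code).strip()
    if not text:
        return None
    gTTS(text=text, lang=tts_code).save(out_path)
    return out_path
```

Keeping the two codes in one table avoids the situation where text is recognized in one language and then spoken with the pronunciation rules of another.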


VIII. RESULTS

IX. CONCLUSION
The suggested system's ultimate goal has been met. This technology can translate text to speech and serve as a text reader
for the visually handicapped. The text is held in front of the system's webcam or the integrated camera. The acquired image is
examined using image processing techniques. OCR (Optical Character Recognition) separates and identifies the words in the image
in order to recognise the characters. The words obtained are then transformed to speech using gTTS (Google Text To Speech
converter). Lastly, the extracted text is read out via the speaker or headphones. As a result, visually handicapped persons can
access printed text in their environment more independently.

X. FUTURE SCOPE
1) Integration with Mobile Devices: As smartphones and tablets become more powerful, integrating text readers with mobile
devices will become easier. This will allow visually impaired persons to access text from a wider range of sources, including
social media, e-books, and websites.
2) Multi-language Support: With globalization, the need for text readers that can support multiple languages will continue to
grow. Using OpenCV, it is possible to train text recognition models for various languages.
3) Improved Accessibility in Public Spaces: Public spaces can be challenging for visually impaired persons to navigate. The use of
text readers on signage, for example, could make it easier for them to access information.


REFERENCES
[1] Bindu Philip and R. D. Sudhaker Samuel, "Human machine interface - a smart OCR for the visually challenged," International Journal of Recent
Trends in Engineering, Vol. No. 3, November 2009.
[2] K. Nirmala Kumari and Meghana Reddy J., "Image Text to Speech Conversion Using OCR Technique in Raspberry Pi," International Journal of Advanced
Research in Electrical, Electronics and Instrumentation Engineering, Vol. 5, Issue 5, May 2016.
[3] V. Ajantha Devi and Dr. Santhosh Baboo, "Embedded optical character recognition on Tamil text image using Raspberry Pi," International Journal of
Computer Science Trends and Technology (IJCST), Volume 2, Issue 4, Jul-Aug 2014.
[4] Jaiprakash Verma and Khushali Desai, "Image to sound conversion," International Journal of Advance Research.
[5] R. Mithe, S. Indalkar and N. Divekar, "Optical Character Recognition," International Journal of Recent Technology.
[6] Akhilesh A. Panchal, Shrugal Varde and M. S. Panse, "Character Detection and Recognition System for Visually Impaired People."
[7] N. A. Giudice and G. E. Legge, "Blind navigation and the role of technology," in A. Helal, M. Mokhtari and B. Abdulrazak (Eds.), Engineering Handbook of
Smart Technology for Aging, Disability, and Independence (pp. 479-500), John Wiley & Sons.
[8] Sunil Kumar, Rajat Gupta, Nitin Khanna, Santanu Chaudhury and Shiv Dutt Joshi, "Text Extraction and Document Image Segmentation Using Matched
Wavelets and MRF Model," IEEE Transactions on Image Processing, Volume 16, Issue 8, Aug. 2007, 2117-2128.
[9] Ray Kurzweil K Reader Mobile User Guide, knfb Reading Technology Inc. (2008). [Online]. Available: http://www.knfbReading.com
[10] Ms. Athira Panicker, Ms. Anupama Pandey and Ms. Vrunal Patil, "Smart Shopping assistant label reading system with voice output for blind using
Raspberry Pi," YTIET, University of Mumbai, ISSN: 2278-1323, International Journal of Advanced Research in Computer Engineering & Technology
(IJARCET), Vol. 5, Issue 10, Oct 2016, 2553, www.ijarcet.org
[11] Raspberry Pi 3B, Optical Character Recognition (OCR), Text to speech (TTS), Pi-camera, Speaker, Headphone.
[12] Gurav, Mallapa D., et al. "B-LIGHT: A Reading aid for the Blind People using OCR and OpenCV." International Journal of Scientific Research Engineering
& Technology (IJSRET), ISSN (2017).
[13] Goel, Anush, et al. "Raspberry Pi Based Reader for Blind People." International Research Journal of Engineering and Technology 5.6
[14] Chaudhari, Harshada. "Raspberry Pi technology: a review." International Journal of Innovative and Emerging Research in Engineering 2.3
