Text Reader For Visually Impaired Person Using Image Processing Open-CV
https://doi.org/10.22214/ijraset.2023.53004
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue V May 2023- Available at www.ijraset.com
Abstract: The main difficulty that visually impaired individuals face today is that they cannot read printed text on their
own, which forces them to rely on others for day-to-day tasks such as reading newspapers, writing letters, and referring to books.
This dependence may erode their confidence. The ultimate goal of this project is to assist visually challenged persons with text
recognition. This goal is accomplished by creating a module that converts text into speech and plays it through a headphone or
speaker. The code is written in Python using the pytesseract and gtts libraries. For character recognition, the project combines
image processing with the OCR approach.
Keywords: Text to speech, Image to Text, Optical Character Recognition, gTTS and Speech output, Python Programming.
I. INTRODUCTION
The ultimate goal of this project is to assist visually challenged persons in recognizing text. When printed text is held in front of
the webcam, the system must capture the image, extract the text from it, and read the text out through the computer's audio or a
headphone. The code is written in Python using the pytesseract and gtts libraries. For character recognition, the project combines
image processing with the OCR approach. The main difficulty that visually impaired persons face today is that they cannot read
printed text on their own, which forces them to rely on others for day-to-day tasks such as reading newspapers, writing letters, and
referring to websites and books. This dependence may reduce their confidence, since they cannot manage independently. The
project addresses this by creating a module that converts text into speech and speaks it through a headphone or speaker. The image
is captured using the system's webcam, and the text is extracted using the built-in application. The extracted text is then split into
words and spoken aloud through headphones or the system's audio. Python offers the Python Imaging Library (PIL), which is used
for simple image operations such as creating thumbnails, resizing, rotating, and converting between different file formats.
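The simple image operations mentioned above can be sketched with Pillow as follows. This is a minimal illustration, not the authors' code: the function names `scaled_size` and `prepare_image`, the file paths, and the default `max_side` value are assumptions made for the example.

```python
def scaled_size(size, max_side):
    """Return (width, height) scaled so the longer side is at most max_side.

    The aspect ratio is preserved; images that are already small enough
    are returned unchanged.
    """
    width, height = size
    longest = max(width, height)
    if longest <= max_side:
        return (width, height)
    scale = max_side / longest
    return (max(1, round(width * scale)), max(1, round(height * scale)))


def prepare_image(src_path, dst_path, max_side=1024, angle=0):
    """Open an image, shrink and optionally rotate it, then save it.

    Hypothetical helper: the paths and defaults are illustrative only.
    """
    # Pillow is imported here so that scaled_size() works without it.
    from PIL import Image

    with Image.open(src_path) as img:
        # Report the properties of the loaded image: format, mode, size.
        print(img.format, img.mode, img.size)
        result = img.resize(scaled_size(img.size, max_side))
        if angle:
            result = result.rotate(angle, expand=True)
        # The output format is inferred from the extension of dst_path,
        # so saving "page.jpg" as "page.png" converts between formats.
        result.convert("RGB").save(dst_path)
    return dst_path
```

A call such as `prepare_image("page.jpg", "page.png", max_side=800)` would shrink a captured page and convert it to PNG before it is passed to the OCR stage.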
II. OBJECTIVE
This project is designed to overcome the limitations of Braille using IoT technology. It is built around a small, low-cost
single-board computer, the Raspberry Pi. The visual data is sent to the single-board computer over a WiFi connection. The image is
processed to perform image-to-text conversion and text-to-voice conversion using converters available online. The book
reader captures pictures of book pages using a camera and then processes the images using OCR software. Once the text in the
image is recognized, the book reader reads it aloud. Thus, blind people or those with low vision can hear the text without needing
to touch it with their fingertips, as Braille requires.
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 5787
Smart Reader for Visually Impaired People Using Raspberry Pi: This paper proposes a method for converting an image into text and
text into audio. It also gives complete information about the hardware and software implementation of a reader for the blind.
The software implementation and programming, along with the details of the OCR engine, were very useful to this work. The paper
also details which engines can be used for image-to-text conversion and for text-to-speech.
"OCR based automatic book reader for the visually impaired using Raspberry PI": This paper provided a case study from which we
learned to build a system for the English language; it also suggested that the same could be done for other languages, which we
list as a future advancement [3]. The system accepts a page of printed text with English numerals and scans it into a digital
document, which is then subjected to skew correction and segmentation before feature extraction and classification. Once
classified, the text is read out by a text-to-speech conversion unit. An innovative, efficient, real-time and cost-effective technique
that enables the user to hear the contents of text images instead of reading them has been introduced [6]. It combines Optical
Character Recognition (OCR) and a Text to Speech Synthesiser (TTS) on a Raspberry Pi.
"Text Image using Raspberry Pi": Optical character recognition is used to digitize and reproduce texts that were produced with
non-computerized systems. Digitizing texts also helps reduce storage space [7].
The design and implementation of an automatic scene text detection and recognition system for visually impaired people has been
discussed. Combining different techniques for text detection and extraction yields a more accurate system than using a single
technique. Text recognition is successfully performed using a pattern matching technique. After successful recognition, the text is
converted into audio output. A prototype system that reads printed text on hand-held objects to assist blind people is proposed [9].
To extract text regions from complex backgrounds, a novel text localization algorithm based on models of stroke orientation and
edge distributions is adopted. An image-to-speech conversion technique using Raspberry Pi was implemented. The output has been
tested using different samples. The algorithm successfully processes the image and reads it out clearly.
IV. METHODOLOGY
The proposed system is a software module that takes input from the system's built-in camera or a webcam, extracts the text
content using the code developed, converts the text to speech, and reads it out through a headphone or speaker. The project
removes the need for a Raspberry Pi board, which is considered one of the greatest advantages of the proposed system. Speech
and text are the main media of human communication. A person needs vision to access the information in a text. However, those
who have poor vision can gather information from voice. This paper proposes camera-based assistive text reading to help visually
impaired persons read the text present in a captured image. The proposed idea involves reading the image using OpenCV (the cv2
Python library), extracting the text from the image using Tesseract Optical Character Recognition (OCR), and converting the text
to speech using gtts (Google Text To Speech), a process which lets visually impaired persons access the text. This is a prototype
that helps blind people recognize products in the real world by extracting the text on an image and converting it into speech. The
proposed method only requires installing software, which makes it more portable and less expensive. Optical character recognition
(OCR) systems give persons who are blind or visually impaired the capacity to scan printed text and then have it spoken in
synthetic speech or saved to a computer file. There are three essential elements to OCR technology: scanning, recognition, and
reading text. The data that we collect or generate is mostly raw data, i.e. it is not fit to be used in its raw form.
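The capture, OCR and speech steps described above could be wired together roughly as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function names, the `speech.mp3` output path and the text clean-up rules are hypothetical, and the three stages are passed in as parameters so that each one can be replaced or tested without a camera.

```python
import re


def clean_ocr_text(raw):
    """Tidy raw OCR output before it is spoken (illustrative rules only)."""
    # Rejoin words that were hyphenated across a line break. This also
    # merges genuine hyphenated compounds split at a line end, which is
    # an accepted simplification in this sketch.
    text = re.sub(r"-\s*\n\s*", "", raw)
    # Collapse newlines and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def capture_frame(camera_index=0):
    """Grab a single frame from the webcam using OpenCV."""
    import cv2  # imported lazily so the helpers work without a camera

    camera = cv2.VideoCapture(camera_index)
    try:
        ok, frame = camera.read()
    finally:
        camera.release()
    if not ok:
        raise RuntimeError("could not read a frame from the camera")
    return frame


def extract_text(frame):
    """Run Tesseract OCR on a grayscale copy of the frame."""
    import cv2
    import pytesseract

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray)


def speak(text, out_path="speech.mp3"):
    """Convert text to an MP3 file with gTTS (needs internet access)."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
    return out_path


def read_aloud(capture=capture_frame, ocr=extract_text, tts=speak):
    """Capture an image, extract its text and hand it to the speech stage."""
    text = clean_ocr_text(ocr(capture()))
    if not text:
        return None  # nothing readable was found in the image
    return tts(text)
```

Calling `read_aloud()` with the defaults requires a webcam, a Tesseract installation and network access for gTTS; playing the resulting MP3 file is left to whichever audio player is available on the system.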
A. Software Requirement
1) Python 3.8.1
2) Imported modules: pytesseract, gtts, os.
B. Technologies Used
1) Image processing
2) OCR technique (Optical Character Recognition)
3) GTTS (Google Text To Speech Converter)
C. Image Processing
OpenCV is an image processing library focused mainly on real-time computer vision, with applications in a wide range of areas such
as 2D and 3D feature toolkits, facial and gesture recognition, human-computer interaction, mobile robotics, object identification
and others.
The image processing is done using the OpenCV library (cv2). To perform basic operations on images, such as creating thumbnails,
resizing, rotating, and converting between different file formats, we use PIL. The image is loaded directly using the open()
function of the Image class. This returns an image object that contains the pixel data for the image as well as details about the
image. The format property of the image reports the image format (e.g. PNG, JPEG), the mode reports the pixel channel format
(e.g. CMYK or RGB), and the size reports the dimensions of the image in pixels (e.g. 400*260). The show() function displays the
image using the operating system's default application. Pillow is one of the most popular Python libraries for image processing and
is widely treated as the default. Pillow is an updated version of the Python Imaging Library (PIL) and supports a range of simple
and advanced image manipulation functionality. It is also the basis for simple image support in other Python libraries such as
SciPy and Matplotlib.
OCR TECHNIQUE: Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of
images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document,
a scene photo, or subtitle text superimposed on an image. It deals with recognizing text in image files and storing it in a text
file. Here, we process the images and convert them into text. Once we have the text as a string variable, we can do any processing
on the text. Optical character recognition involves detecting the text content of images and translating it into encoded text that
the computer can easily process. An image containing text is scanned and analyzed in order to identify the characters in it. Upon
identification, each character is converted to machine-encoded text. The image is first split into zones identifying the areas of
interest, such as where the images or text are, which kicks off the extraction process. The areas containing text are then broken
down further into lines, words and characters, and the software matches the characters through comparison and various detection
algorithms. The final result is the text in the image that we were given.
The fundamental information gathered from web sources is still presented in its unprocessed state as statements, numbers, and
qualitative phrases. There are mistakes, omissions, and discrepancies in the raw data. After carefully examining the filled
questionnaires, modifications are necessary. Processing the primary data involves the following steps. Field surveys generate a
tremendous amount of raw data, which must be classified according to the similarity of the individual responses. Data
preprocessing is a method for transforming unclean data into clean data sets. In other words, whenever data are collected from
several sources, they are combined in a raw format that is not useful for analysis. As a result, specific actions are taken to reduce
the data to a manageable and clean collection. This technique is performed before the execution of iterative analysis. This set of
steps is known as data preprocessing. After this come data cleaning, preprocessing, feature extraction, and classification.
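Returning to the OCR stage described under OCR TECHNIQUE above, the breakdown of text areas into lines and words can be inspected through pytesseract's `image_to_data` function, which reports block, paragraph and line numbers together with a confidence value for each recognized word. The sketch below regroups those words into lines and drops low-confidence ones; the function names and the `min_conf` threshold of 50 are assumptions chosen for illustration.

```python
def group_words_by_line(data, min_conf=50.0):
    """Rebuild text lines from pytesseract's word-level dictionary output.

    `data` is expected to have the parallel lists "text", "conf",
    "block_num", "par_num" and "line_num". Words whose confidence is
    below min_conf (an illustrative threshold) are discarded.
    """
    lines = {}
    for text, conf, block, par, line in zip(
        data["text"], data["conf"], data["block_num"],
        data["par_num"], data["line_num"],
    ):
        word = text.strip()
        # Non-word rows (blocks, paragraphs, lines) carry empty text.
        if not word or float(conf) < min_conf:
            continue
        lines.setdefault((block, par, line), []).append(word)
    # Sorting the keys restores reading order: block, paragraph, line.
    return [" ".join(words) for _, words in sorted(lines.items())]


def ocr_lines(image, min_conf=50.0):
    """Run Tesseract on an image and return its confident text lines."""
    import pytesseract  # imported lazily; requires a Tesseract install

    data = pytesseract.image_to_data(
        image, output_type=pytesseract.Output.DICT
    )
    return group_words_by_line(data, min_conf)
```

Filtering on confidence is one simple way to keep misrecognized fragments from being read aloud, at the cost of occasionally dropping a correct word.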
Two modules, User and Doctor, have been developed. The incremental build model is a method of software development in which the
product is designed, implemented, and tested incrementally (a little more is added each time) until the product is finished. It
involves both development and maintenance.
The gTTS API supports several languages including English, Hindi, Tamil, French, German and many more. The speech can be
delivered in any one of the two available audio speeds, fast or slow. However, as of the latest update, it is not possible to change the
voice of the generated audio.
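The language and speed options described above map onto the `lang` and `slow` arguments of the `gTTS` class. The following sketch shows one way to expose them; the `LANGUAGE_CODES` table covers only the languages named in the text, and the function names and output path are hypothetical.

```python
# Language codes for the languages named in the text. gTTS supports
# many more; this table is deliberately partial.
LANGUAGE_CODES = {
    "english": "en",
    "hindi": "hi",
    "tamil": "ta",
    "french": "fr",
    "german": "de",
}


def speech_args(language="english", speed="fast"):
    """Translate a language name and speed into gTTS keyword arguments."""
    key = language.strip().lower()
    if key not in LANGUAGE_CODES:
        raise ValueError(f"unsupported language: {language!r}")
    if speed not in ("fast", "slow"):
        raise ValueError(f"speed must be 'fast' or 'slow', got {speed!r}")
    return {"lang": LANGUAGE_CODES[key], "slow": speed == "slow"}


def save_speech(text, out_path="speech.mp3", language="english", speed="fast"):
    """Synthesize text to an MP3 file (requires internet access)."""
    from gtts import gTTS  # imported lazily so speech_args() needs no network

    gTTS(text=text, **speech_args(language, speed)).save(out_path)
    return out_path
```

For example, `save_speech("नमस्ते", language="hindi", speed="slow")` would request slow Hindi speech; the voice itself cannot be selected.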
VI. EXISTING SYSTEM
Today there is a growing demand from users to convert printed documents into electronic documents in order to keep their data
secure. Hence the basic OCR system was invented to convert the data available on paper into computer-processable documents, so
that the documents can be edited and reused. The existing OCR systems on a grid infrastructure are mostly based on expensive and
complex hardware setups. This raises the cost of the system, and its use becomes limited by its availability and affordability for
blind persons. The images are then refined in order to eliminate any noise present in them. A step called segmentation is used to
separate each character from the others in the text. Graphical details such as icons or logos, if any, are eliminated. Each character
obtained is compared with the datasets that are part of the Tesseract library. The Tesseract OCR is considered one of the most
efficient algorithms available; it checks the obtained character in ten dimensions. Once the character is recognized, it must be
made available as audio output. For this, a software package called Festival is used, which provides the audio output for the
recognized character. Apart from these features, an extra feature is added that lets the blind user know the type of object that
he/she is interacting with (a menu, a newspaper and the like). An ultrasonic sensor is included as part of the project, so that
characters are obtained only within a particular distance.
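The segmentation step mentioned above can be illustrated with a vertical projection profile, a classic way to separate characters in a binarized text line. This is a swapped-in textbook technique shown for clarity, not necessarily what the existing system or Tesseract does internally; the function name and the list-of-rows image representation are assumptions.

```python
def segment_columns(binary_rows):
    """Split a binarized text line into character column ranges.

    `binary_rows` is a list of equal-length rows where a truthy value
    marks an ink pixel. Returns (start, end) column pairs, end exclusive,
    one per run of columns that contain ink.
    """
    if not binary_rows:
        return []
    width = len(binary_rows[0])
    # Vertical projection: does column x contain any ink at all?
    has_ink = [any(row[x] for row in binary_rows) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column ends it
            start = None
    if start is not None:
        segments.append((start, width))    # character touches the right edge
    return segments
```

Touching or overlapping characters produce a single merged segment with this approach, which is one reason practical OCR engines use more elaborate segmentation.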
VIII. RESULTS
IX. CONCLUSION
As a result, the ultimate goal of the proposed system has been met. This technology can translate text to speech and serve as a text
reader for the visually handicapped. The text is held in front of the system's webcam or integrated camera. Image processing
techniques are used to examine the acquired image. OCR (Optical Character Recognition) separates and identifies the words in the
image in order to recognise the characters. The words acquired are then transformed to speech using GTTS (Google Text To Speech
converter). Lastly, the collected text is read out via the speaker or headphones. As a result, visually handicapped persons can
access printed text in their environment.
X. FUTURE SCOPE
1) Integration with Mobile Devices: As smartphones and tablets become more powerful, integrating text readers with mobile
devices will become easier. This will allow visually impaired persons to access text from a wider range of sources, including
social media, e-books, and websites.
2) Multi-language Support: With globalization, the need for text readers that can support multiple languages will continue to
grow. Using OpenCV, it is possible to train text recognition models for various languages.
3) Improved Accessibility in Public Spaces: Public spaces can be challenging for visually impaired persons to navigate. The use of
text readers on signage, for example, could make it easier for them to access information.
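For the multi-language item above, one low-effort route that needs no model training is to use the per-language trained data that Tesseract already distributes and select it through the `lang` argument of `pytesseract.image_to_string`. The sketch below assumes that approach; the `TESSERACT_CODES` table, the function names and the English fallback are illustrative choices, not part of the proposed system.

```python
# Tesseract language codes for a few languages (illustrative subset).
TESSERACT_CODES = {
    "english": "eng",
    "hindi": "hin",
    "tamil": "tam",
    "french": "fra",
    "german": "deu",
}


def pick_ocr_language(requested, installed, fallback="eng"):
    """Choose a Tesseract language code that is actually installed.

    Falls back to `fallback` when the requested language is unknown or
    its trained data is not present on this machine.
    """
    code = TESSERACT_CODES.get(requested.strip().lower())
    return code if code in installed else fallback


def read_text(image, language="english"):
    """OCR an image in the requested language if it is available."""
    import pytesseract  # imported lazily; requires a Tesseract install

    installed = pytesseract.get_languages(config="")
    lang = pick_ocr_language(language, installed)
    return pytesseract.image_to_string(image, lang=lang)
```

The recognized text would then need a matching speech language on the text-to-speech side for the output to be pronounced correctly.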
REFERENCES
[1] Bindu Philip and R. D. Sudhaker Samuel, "Human machine interface - a smart OCR for the visually challenged," International Journal of Recent Trends in Engineering, vol. no. 3, November 2009.
[2] K. Nirmala Kumari and Meghana Reddy J., "Image Text to Speech Conversion Using OCR Technique in Raspberry Pi," International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 5, Issue 5, May 2016.
[3] V. Ajantha Devi and Dr. Santhosh Baboo, "Embedded optical character recognition on Tamil text image using Raspberry Pi," International Journal of Computer Science Trends and Technology (IJCST), Volume 2, Issue 4, Jul-Aug 2014.
[4] Jaiprakash Verma and Khushali Desai, "Image to sound conversion," International Journal of Advance Research.
[5] R. Mithe, S. Indalkar and N. Divekar, "Optical Character Recognition," International Journal of Recent Technology.
[6] Akhilesh A. Panchal, Shrugal Varde and M. S. Panse, "Character Detection and Recognition System for Visually Impaired People."
[7] N. A. Giudice and G. E. Legge, "Blind navigation and the role of technology," in A. Helal, M. Mokhtari and B. Abdulrazak (Eds.), Engineering Handbook of Smart Technology for Aging, Disability, and Independence (pp. 479-500), John Wiley & Sons.
[8] Sunil Kumar, Rajat Gupta, Nitin Khanna, Santanu Chaudhury and Shiv Dutt Joshi, "Text Extraction and Document Image Segmentation Using Matched Wavelets and MRF Model," IEEE Transactions on Image Processing, Volume 16, Issue 8, Aug. 2007, 2117-2128.
[9] Ray Kurzweil K Reader Mobile User Guide, knfb Reading Technology Inc. (2008). [Online]. Available: http://www.knfbReading.com
[10] Ms. Athira Panicker, Ms. Anupama Pandey and Ms. Vrunal Patil (YTIET, University of Mumbai), "Smart Shopping assistant label reading system with voice output for blind using Raspberry Pi," International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), ISSN: 2278-1323, Vol. 5, Issue 10, Oct 2016, 2553, www.ijarcet.org
[11] Raspberry pi 3b, Optical Character Recognition (OCR), Text to speech (TTS), Pi-camera, Speaker, Headphone
[12] Mallapa D. Gurav, et al., "B-LIGHT: A Reading aid for the Blind People using OCR and OpenCV," International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN (2017).
[13] Anush Goel, et al., "Raspberry Pi Based Reader for Blind People," International Research Journal of Engineering and Technology 5.6.
[14] Harshada Chaudhari, "Raspberry Pi technology: a review," International Journal of Innovative and Emerging Research in Engineering 2.3.