Visvesvaraya Technological University: Design of An Automatic Reader For The Visually Impaired Using Raspberry Pi
2020-2021
CERTIFICATE
DECLARATION
We, the students of the 7th semester B.E. in Electrical and Electronics Engineering, The Oxford
College of Engineering, Bengaluru, hereby declare that the project work entitled “DESIGN
OF AN AUTOMATIC READER FOR THE VISUALLY IMPAIRED USING
RASPBERRY Pi” has been carried out and submitted in partial fulfilment of the requirements for
the award of the degree of Bachelor of Engineering in ELECTRICAL AND
ELECTRONICS ENGINEERING of Visvesvaraya Technological University, Belagavi,
during the year 2022-2023.
CHETHAN V R 1OX19EE005
SURYANARAYANAREDDY 1OX19EE034
BHARATHKUMAR K P 1OX19EE043
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of any task would be
incomplete without the mention of the people who made it possible with continuous guidance and
encouragement and crowned our effort with success.
We have great pleasure in expressing our deep sense of gratitude to Late Shri. S. NARASARAJU,
Founder Chairman, and we consider ourselves proud to be a part of the Oxford family, the institution
that stood by us in all our endeavours. We also express our gratitude to Shri. S. N. V. L.
NARASIMHA RAJU, Chairman, The Oxford Educational Institutions, and Dr. K M
RAVIKUMAR, Director, The Oxford Educational Institutions, for providing all the facilities that
made our work a better one.
We would like to express our gratitude to Dr. N KANNAN, Principal, The Oxford College of
Engineering, for providing a congenial environment to work in. We heartily thank our
beloved HOD, Dr. V. S. BHARATH, Department of EEE, for his encouragement and support.
Words are inadequate in offering our thanks to our guide Dr. V. S. BHARATH, Professor & HOD,
and to the Project Coordinator Mr. Jayakumar N, Assistant Professor, Department of EEE, with
profound gratefulness for their moral inspiration, encouragement and valuable guidance throughout
the course of this work.
We also thank all the staff members of the Electrical and Electronics Engineering department and all
those who have directly and indirectly helped us with their valuable suggestions towards the
successful completion of this project. Last but not the least, we thank our beloved parents for their
support and encouragement in successfully completing this task.
CHETHAN V R 1OX19EE005
SURYANARAYANAREDDY 1OX19EE034
BHARATHKUMAR K P 1OX19EE043
DESIGN OF AN AUTOMATIC READER FOR THE VISUALLY IMPAIRED USING RASPBERRY Pi
CHAPTER – 1
ABSTRACT
This project is an automatic document reader for visually impaired people, developed on the
Raspberry Pi processor board. The board controls peripherals such as a camera and a speaker,
which act as the interface between the system and the user. Optical Character Recognition
(OCR) technology is used to identify printed characters using image-sensing devices and
computer programming; the OCR process can be carried out using either online or offline methods.
OCR converts images of typed, handwritten or printed text into machine-encoded text, and this
encoded text is then converted into audio output (speech) by text-to-speech synthesis. The
Raspberry Pi translates the printed document into text files using the Tesseract library and
Python programming; the page images are captured and prepared using the OpenCV library and
the Python programming language, and the recognized text is finally delivered as audio output.
Keywords: Character recognition, Low power, Document Image Analysis (DIA), Raspberry Pi
4B, Speech Output, OCR based book reader, OpenCV, Python Programming.
CHAPTER – 2
INTRODUCTION
Today there is a huge amount of written material available everywhere, and converting it all into
Braille is not a practical option, given the vastness of the available content and the poor return
on such a huge investment. But that would mean denying the blind access to these huge
quantities of scholarly material. A suitable and viable alternative for this predicament is
designing a smart device which can convert this print media into speech and play it out for the
visually impaired. Such a device is radical in its kind and a huge benefit; although the initial
investment may be high, in the long run it is quite cost-effective. The main objective of this
project is to make use of such a system, specifically the Raspberry Pi along with its ancillaries
working in unison, to convert text into speech and thus assist the visually impaired.
The proposed algorithm uses a camera module to capture the desired text, which is then
converted into a grayscale image and a binary representation [1]. From this image the individual
characters are extracted and recognized, all of which is carried out by the optical character
recognition algorithm. After the stages of scanning, pre-processing, segmentation and feature
extraction, the scanned text is finally ready to be read out by means of the speaker connected to
the Pi module. Even though such systems exist, most of them are in crude forms, and developing
a commercially viable setup will be a huge aid for the visually impaired, giving them access to
unprecedented amounts of text and written media. Such a system, which involves only a one-time
investment, is thus a vital assistive tool. The main objective of this project is to convert print and
written media into playable audio with high efficiency. A unique addition in this device is the
ability to record speech in memory and replay these audio files at a convenient time.
There are many existing solutions to the problem of assisting individuals who are blind to read;
however, none of them provides efficient reading. We focus on improving the competence of
blind people by providing them with a solution where the details are given in the form of an
audio signal. The Raspberry Pi-based reader is an automatic document reader for visually
impaired people using OCR technology. The proposed project uses a camera-based assistive
device which can be used by individuals to read printed text. The scheme is to implement an
embedded-system-based image capturing technique using the Raspberry Pi board. The design is
inspired by prior research with visually impaired people; it is small and portable, which helps
achieve results with little setup. Here, we have put forward a text read-out system for visually
impaired people. OCR and text-to-speech synthesis are used to convert images into audio output
(speech). The proposed apparatus has a camera which acts as the input device for digitization,
and this digitized script is processed by the OCR software module. A procedure is followed for
recognition of the characters and the line of reading. On the software side, the OpenCV
(Open-Source Computer Vision) libraries are employed to capture the image of the text and to
recognize the characters. The final identified text is sent to an output device based on the choice
of the user, such as a headset connected to the Raspberry Pi.
CHAPTER – 3
3.1 BRAILLE
Braille is a reading and writing system for blind and vision impaired people. It is made
up of raised dots that can be 'read' by touch. The basic component is a rectangular 'cell'
of six dots, arranged in two vertical columns of three dots.
Each dot arrangement represents a different letter or number. For example, the letter 'A'
is a single dot (the top dot of the left column), 'B' is two vertically aligned dots (the first
and second dots of the left column), while 'C' is two horizontally aligned dots (the top dot
in both columns).
In this way, braille offers 63 different dot combinations to form the alphabet, numbers,
punctuation marks and abbreviations. Braille is used around the world in many
languages. Just about any written information can be presented in braille including
books, music, mathematics and knitting patterns.
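As an illustration of the encoding just described, the short Python sketch below (illustrative only, not part of the project software) represents a braille cell as a set of the six dot positions and confirms that a six-dot cell yields 2^6 - 1 = 63 non-empty combinations:

from itertools import combinations

# Dot positions: 1-3 down the left column, 4-6 down the right column.
DOTS = (1, 2, 3, 4, 5, 6)

# Letter patterns taken from the description above (a small illustrative subset).
LETTERS = {
    "A": {1},       # single dot at the top of the left column
    "B": {1, 2},    # two vertically aligned dots in the left column
    "C": {1, 4},    # two horizontally aligned dots at the top of both columns
}

# Every non-empty subset of the six dots: 2**6 - 1 = 63 combinations.
total = sum(1 for r in range(1, 7) for _ in combinations(DOTS, r))
print(total)   # prints 63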
Grade 1 - the braille alphabet, numbers and punctuation. This is equivalent to the print
alphabet. People learning braille usually start with Grade 1. However, this form takes up
a lot of space, which makes Grade 1 braille books much bulkier than print books.
Grade 2 - braille that, in addition to the alphabet, uses abbreviations and contractions
(similar to that of shorthand). Grade 2 braille is used for more complicated texts, such as
novels and large documents, because it takes up less space. For example, the word
'braille' is written as 'brl'. The shorter words mean less finger travel across a line and a
faster reading speed. Grade 2 is the most popular form of braille.
A braille display - this is a piece of equipment connected to the computer that, working
together with screen-reading software, presents screen text to the user via one line of
refreshable braille.
A braille embosser - this is a type of printer that prints text in braille dots. It relies on a
braille translator to translate text.
A braille keyboard - this is a keyboard consisting of six keys for producing braille dots, a
space bar, carriage return and backspace key. It allows the user to type in braille.
Scanners - text can be converted into braille using a scanner and a computerised braille
translation program.
Tele braille III - this device attaches to a telephone typewriter (TTY). The TTY is a
small screen and typewriter that is used in place of the telephone handset, so that the
conversation is typed rather than spoken. The Tele braille III transcribes the written text
and displays it in braille.
• Braille is not used only to transcribe and write books and publications.
• It’s recommended to learn braille by touch if you’re losing your vision but still have
some sight remaining.
CHAPTER – 4
LITERATURE SURVEY
CHAPTER – 5
BLOCK DIAGRAM
CHAPTER – 6
PROPOSED SYSTEM
CHAPTER – 7
FLOW OF PROCESS
7.1 IMAGE CAPTURING
This is the first step, in which the device is moved over the printed page and the inbuilt camera
captures images of the text. The high-resolution camera ensures that the captured image quality
is good enough for fast and clear recognition.
7.2 PRE-PROCESSING
The pre-processing stage consists of three steps: skew correction, linearization and noise removal.
The captured image is checked for skewing, since the image may be skewed towards the left or
the right. The image is first brightened and binarized. The skew-detection function checks for an
orientation angle within ±15 degrees; if skew is detected, a simple image rotation is carried out
until the text lines match the true horizontal axis, which produces a skew-corrected image. The
noise introduced during capturing, or due to poor quality of the page, has to be removed before
further processing.
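A minimal OpenCV sketch of this pre-processing stage is shown below. It only illustrates the steps named above (denoising, binarization and skew correction within roughly ±15 degrees); the thresholds and the deskewing method are assumed choices, not the project's exact implementation:

import cv2
import numpy as np

def preprocess(image_bgr):
    # Grayscale, denoise and binarize the captured page.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 3)                    # simple noise removal
    binary = cv2.threshold(gray, 0, 255,
                           cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

    # Estimate the skew angle from the minimum-area rectangle around the ink pixels.
    coords = np.column_stack(np.where(binary < 128)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    angle = -(90 + angle) if angle < -45 else -angle
    angle = max(-15.0, min(15.0, angle))              # only correct within +/-15 degrees

    # Rotate so the text lines align with the true horizontal axis.
    h, w = binary.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(binary, M, (w, h),
                          flags=cv2.INTER_CUBIC, borderValue=255)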
7.3 SEGMENTATION
After pre-processing, the noise-free image is passed to the segmentation phase. Segmentation is an
operation that seeks to decompose an image of a sequence of characters into sub-images of
individual symbols (characters). The binarized image is checked for inter-line spaces; if inter-line
spaces are detected, the image is segmented into sets of paragraphs across the inter-line gaps. The
lines in the paragraphs are scanned for horizontal space intersections with respect to the
background, and a histogram of the image is used to detect the width of the horizontal lines. The
lines are then scanned vertically for vertical space intersections, where histograms are used to
detect the width of the words. Finally, the words are decomposed into characters using
character-width computation.
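The line-level part of this segmentation can be sketched with simple projection histograms, as below; splitting lines into words and characters follows the same pattern with a vertical projection. This is an assumed illustration of the approach described above, not the project's exact code:

import numpy as np

def segment_lines(binary):
    # Histogram of dark (text) pixels per row of the binarized page.
    ink_per_row = np.sum(binary < 128, axis=1)
    lines, start = [], None
    for y, ink in enumerate(ink_per_row):
        if ink > 0 and start is None:
            start = y                            # entering a text line
        elif ink == 0 and start is not None:
            lines.append(binary[start:y, :])     # inter-line gap reached
            start = None
    if start is not None:
        lines.append(binary[start:, :])
    return lines

# Words and characters are isolated the same way by projecting each line
# vertically (axis=0) and splitting on runs of empty columns.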
CHAPTER – 8
RELATED WORK
In one study, the authors proposed a prototype which helps a blind person listen to text
images in the English and Tamil languages. Reading is performed by taking images of the text
and converting the image to audio output in the above-mentioned languages. This was done
with the help of a Raspberry Pi 3 Model B, a web camera, the Tesseract OCR (Optical
Character Recognition) engine and the Google Speech API (Application Program Interface)
as the text-to-speech engine. The disadvantage of the system is that it produces unclear output
with an incorrect regional accent and also has problems with speech in the Tamil language.
Another work proposed a smart book reader for the visually challenged based on optical
character recognition, using the Raspberry Pi 3 kit and the Raspberry Pi camera module.
Google's Tesseract is used for OCR and Pico is used for text-to-speech in that project. In the
pre-processing stage the method uses binarization, de-noising, deskewing and segmentation
techniques to improve image clarity. In other cases, a mobile application allows blind people
to “read” text by using a photo-to-speech application: a combination of OCR and
text-to-speech (TTS) frameworks is integrated, and with the help of a smartphone the user
takes a picture and hears the text that exists in the picture. A drawback is that it does not
provide any automatic system for capturing images.
Generally, optical character recognition (OCR) recognizes text from captured image data,
converting a scanned or photographed document into an electronic transcript. The digital
text is synthesized into voice using text-to-speech (TTS) technology and played through an
audio system. One such system is constructed using a Raspberry Pi, an HD camera and a
Bluetooth headset.
In another work, the authors proposed a model which enables a user to hear any text in real
time without the effort of reading. The whole process is built on OCR (Optical Character
Recognition) and TTS (Text-to-Speech) frameworks combined on a Raspberry Pi v2. The
disadvantage of the system is that the captured image was sometimes blurred, so the OCR
occasionally gave wrong results. A further proposed system claims to read text present
anywhere to assist blind persons; its disadvantage is spelling problems in the OCR output.
Another work proposed a camera-based label reader for blind persons to read any text. A
camera is used to capture the image of the text or board; the image is then pre-processed and
the label is separated from the processed image with the help of the OpenCV library. After
the text is identified, it is pronounced through voice output. A motion-based method is applied
to detect the object or the text written on a board, hoarding or any other surface.
Another proposal is a self-assistive device where live streaming speech is sent to the Google
API; after the speech is converted to text, the result is spoken through a speaker and
displayed on an LCD screen. However, a good internet connection is needed for this method.
Finally, the authors of another work designed a voice-based navigation system for blind
people in a hospital environment. With the help of ultrasonic sensors and an RFID reader
interfaced with a Raspberry Pi 3, an obstacle-avoidance system is designed to locate the
exact place in the hospital. Most of these models depend on a good internet connection, and
most of them have OCR problems. Owing to these shortcomings, our proposed method does
not depend on the internet at all, and additional processing steps are included for better
OCR results.
CHAPTER – 9
CIRCUIT DIAGRAM
CHAPTER – 10
ARCHITECTURE OF RASPBERRY Pi
The Raspberry Pi is a credit-card-sized minicomputer that plugs into a computer monitor or TV
and uses a standard keyboard and mouse. The Raspberry Pi 2 and Raspberry Pi 3 are two
common models. The hardware components of the Raspberry Pi include 4 USB ports, 40 GPIO
pins for input or output, a CSI camera interface, a full-size HDMI port, a DSI display interface,
an SoC (system on a chip), a LAN controller, a micro-SD card slot, Bluetooth 4.1, an audio jack
and video socket, a 5 V micro-USB power connector and an Ethernet port. The power supply
unit supplies electrical energy to the output loads. In real time, the camera feeds its images to a
computer or computer network, often via USB, Ethernet or Wi-Fi. The Raspberry Pi board can be
connected to projectors, monitors and TVs through an HDMI-to-VGA converter.
Raspbian is a free operating system based on Debian, developed for the Raspberry Pi. The
operating system is the set of basic programs and services that makes the Raspberry Pi run.
Many versions of Raspbian are available, such as Raspbian Stretch and Raspbian Jessie.
As of the latest update, Raspbian uses PIXEL (Pi Improved X-Window Environment,
Lightweight) as its default desktop environment.
CHAPTER – 11
HARDWARE IMPLEMENTATION
11.1 Raspberry Pi 3 Model B
The Raspberry Pi is nowadays very important in embedded system development; it makes
development very fast, and one can build a demo project within hours. Here we use the
Raspberry Pi 3 Model B as the processing platform, as it acts as both processor and controller.
It supports a Debian-based OS and is hence a portable pocket computer.
The Raspberry Pi 3 Model B is about ten times more powerful than the first generation. It has
wireless LAN and Bluetooth connectivity, a quad-core ARM Cortex-A53 processor, a
credit-card-sized single-board form factor, and 40 general-purpose input and output pins. Its
main interfaces are listed below:
• Ethernet port
• Combined 3.5 mm audio jack and composite video
• Camera interface (CSI)
• Display interface (DSI)
• micro-SD card slot
The Raspberry Pi camera is used to capture the image of the printed text. The camera can be
plugged directly into the camera port of the Raspberry Pi 3 Model B. It is lightweight and portable.
SPECIFICATION:
11.3 SPEAKER/HEADPHONE
A rectangular plywood board hosts the mechanism, the camera and the Raspberry Pi system
together tightly, holding the page so that the image quality is good enough for the OCR to
extract all the words clearly.
The power supply is the device that supplies electrical energy to the output loads. It gives a
regulated supply of +5 V with an output current capability of 100 mA.
A press-button, or simply a button, is a simple switch mechanism for controlling some aspect of
a machine or a process. The press button is used to activate the program, and a headset is used
for the audio output. Press buttons are typically made of hard materials such as plastic or metal.
The surface of the button is usually flat or shaped to accommodate the human finger, so it can be
easily depressed or pushed.
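A small sketch of how such a push-button can trigger the reading program on the Raspberry Pi is given below, using the standard RPi.GPIO library. The pin number (BCM 17) and the pull-up wiring are assumptions for illustration only:

import time
import RPi.GPIO as GPIO

BUTTON_PIN = 17                                              # assumed BCM pin; button wired to GND

GPIO.setmode(GPIO.BCM)
GPIO.setup(BUTTON_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)    # pressed -> reads LOW

try:
    while True:
        GPIO.wait_for_edge(BUTTON_PIN, GPIO.FALLING)   # block until the button is pressed
        print("Button pressed: capture the page and start reading")
        time.sleep(0.3)                                # crude debounce
finally:
    GPIO.cleanup()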
CHAPTER – 12
SOFTWARE IMPLEMENTATION
Software is the set of instructions and programs that decides the functionality of the hardware.
In our design we have used Python 3, and we have also installed the OpenCV library, Tesseract
OCR and gTTS. In our project the Raspberry Pi is instructed to do its task using Python, as it is
an easy-to-use and user-friendly language with a lot of features and packages available. There is
no need to install Python separately, because Python 3 comes pre-installed on the Raspberry Pi
together with pip3 as its package installer.
The control logic of the program is summarized below; the full algorithm listings were damaged
during extraction, so only the recoverable steps are kept here.
Input: C, the number of presses of the push-button, used to choose the language (English, Bengali
or Hindi), where cl1 = 1 stands for English, cl2 = 2 for Bengali and cl3 = 3 for Hindi; B1press, the
state of the push-button; I, the captured image; and O, the recognized text output.
Steps: each press of the button increments the counter (C = C + 1). If C = 0 or C > 3, the selection
is invalid and the program exits. If the captured image I exists, it is passed to the OCR stage, and
if the recognized output O exists, it is passed on to the text-to-speech stage.
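A minimal Python sketch of this click-count language selection is shown below; the Tesseract language codes ("eng", "ben", "hin") are assumptions based on the standard Tesseract language data packages:

import sys

# Click counts mapped to languages, following the cl1/cl2/cl3 assignment above.
LANGUAGES = {1: "eng", 2: "ben", 3: "hin"}

def select_language(clicks):
    # A count of 0 or more than 3 is invalid, so the program exits.
    if clicks == 0 or clicks > 3:
        print("Invalid selection, exiting.")
        sys.exit(1)
    return LANGUAGES[clicks]

print(select_language(3))   # three presses selects Hindi ("hin")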
OpenCV stands for Open-Source Computer Vision Library; it contains a set of algorithms and
special inbuilt functions that handle computer vision tasks. It supports a wide variety of
programming languages, the most commonly used being Python. We can install the OpenCV
library on the Raspberry Pi by typing “sudo apt-get install python3-opencv” in a terminal
window. Once it is successfully installed, one can run Python code that imports OpenCV
through its cv2 module.
In this project, OpenCV is used to capture the image of the book page using the Pi camera and
to apply its inbuilt functions for pre-processing, such as deskewing, noise removal and
binarization, so that we get a clear image for conversion to text by the OCR module.
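The sketch below shows, under assumptions, how a single frame could be grabbed and cleaned with OpenCV before it is handed to the OCR module; a USB webcam at index 0 is assumed (the Pi camera module can equally be read through the picamera library):

import cv2

cap = cv2.VideoCapture(0)                 # assumed USB webcam at index 0
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("Could not read a frame from the camera")

# Grayscale, denoise and Otsu binarization before OCR.
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 3)
page = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cv2.imwrite("page.png", page)             # cleaned image passed on to the OCR stage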
OCR plays an important role in this module. OCR, or Optical Character Recognition, is a
technology used to recognize text from printed or scanned documents through an optical
mechanism. Tesseract is an optical character recognition engine; to install it, type the
“sudo apt-get install tesseract-ocr” command in the terminal.
OCR software is used to convert an image into text format: it converts images of typed,
handwritten or printed text into machine-encoded text. It is used by blind and visually impaired
people and also for applications such as automatic number-plate recognition. Tesseract is an
OCR engine based on matrix matching. Tesseract was selected for its flexibility and
extensibility across machines, and because many active research communities continue to
develop the engine; for the same reason, Tesseract OCR can support 149 languages. In this
project we identify the English alphabet as well as the two other languages used here, Bengali
and Hindi.
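One common way to call Tesseract from Python is the pytesseract wrapper (installed with “pip3 install pytesseract”); whether the project calls Tesseract this way or directly through the command line is not stated, so the sketch below is only illustrative. The language argument can be "eng", "ben" or "hin", provided the matching Tesseract language data packages are installed:

from PIL import Image
import pytesseract                      # Python wrapper around the Tesseract engine

# "page.png" is the pre-processed page image produced by the OpenCV stage.
text = pytesseract.image_to_string(Image.open("page.png"), lang="eng")
print(text)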
The recognized text is finally delivered in the form of audio output through the speaker. TTS
(Text-to-Speech) is a form of speech synthesis that converts text into audio output.
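The software list above names gTTS, and the conclusion mentions the offline eSpeak engine; a hedged sketch combining the two is shown below (the mpg123 player is an assumed extra utility for playing the saved MP3 file):

import os
from gtts import gTTS                    # gTTS needs an internet connection

def speak(text, lang="en"):
    # Try the online gTTS voice first, then fall back to the offline eSpeak engine.
    try:
        gTTS(text=text, lang=lang).save("out.mp3")
        os.system("mpg123 out.mp3")      # assumes mpg123 is installed
    except Exception:
        os.system('espeak "{}"'.format(text.replace('"', "")))

speak("Hello, this page is now being read aloud.")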
CHAPTER – 13
SIMULATION ENVIRONMENT
The image-to-text and text-to-speech conversion is done by the OCR software installed on the
Raspberry Pi. The conversion performed by the OCR can be simulated in MATLAB. The
simulation includes the following processes: binary conversion, complementation, character
bordering, segmentation and labelling, and character skeletonization.
The following image, captured by the webcam, contains the sample word. This image is in
JPEG format and has to be converted into text.
In this step the sample image is converted into binary format. The image, which initially is a
three-channel colour image, is converted into a two-dimensional (single-channel) image. Binary 0
represents the black colour of the characters and binary 1 represents the white background.
The area of the text is bordered and the boundary of each character is isolated. The boundary of
each character is computed by the program, and the character pixels, with intensity values
ranging from 0 to 255, are stored in memory in the database.
The isolated blocks of characters are segmented and are automatically labelled for identity.
Image segmentation is the process of partitioning a digital image into multiple segments (sets
of pixels, also known as super pixels).
The result of image segmentation is a set of segments that collectively cover the entire image,
or a set of contours extracted from the image (see edge detection). Each of the pixels in a
region is similar with respect to some characteristic or computed property, such as color,
intensity, or texture. Adjacent regions are significantly different with respect to the same
characteristics. Connected-component labelling is used in computer vision to detect
connected regions in binary digital images, although color images and data with higher
dimensionality can also be processed. When integrated into an image recognition system or
human-computer interaction interface, connected component labelling can operate on a
variety of information. Blob extraction is generally performed on the resulting binary image
from a thresholding step. Blobs may be counted, filtered, and tracked.
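A short OpenCV sketch of connected-component labelling on the binarized page is given below; the area threshold used to discard noise blobs is an illustrative choice:

import cv2

binary = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)
blobs = cv2.bitwise_not(binary)          # make the ink white so it is labelled as foreground
num, labels, stats, centroids = cv2.connectedComponentsWithStats(blobs, connectivity=8)

for i in range(1, num):                  # label 0 is the background
    x, y, w, h, area = stats[i]
    if area > 20:                        # skip tiny noise blobs
        print("blob %d: box=(%d,%d,%d,%d) area=%d" % (i, x, y, w, h, area))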
13.5 FORMING CHARACTER SKELETON
Skeletonization is a process for reducing foreground regions in a binary image to a skeletal
remnant that largely preserves the extent and connectivity of the original region while
throwing away most of the original foreground pixels. To see how this works, imagine that
the foreground regions in the input binary image are made of some uniform slow-burning
material.
Light fires simultaneously at all points along the boundary of this region and watch the fire
move into the interior. At points where the fire travelling from two different boundaries
meets itself, the fire will extinguish itself, and the points at which this happens form the
so-called ‘quench line’.
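Skeletonization as described here can be reproduced with the scikit-image library (an assumed extra dependency, not listed among the project's packages):

import cv2
import numpy as np
from skimage.morphology import skeletonize

binary = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)
ink = binary < 128                       # True where the character strokes are

skeleton = skeletonize(ink)              # reduce each stroke to its one-pixel 'quench line'
cv2.imwrite("skeleton.png", skeleton.astype(np.uint8) * 255)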
CHAPTER – 14
METHODOLOGY
The power supply is given to the 5 V micro-USB connector of the Raspberry Pi through a
switched-mode power supply. The web camera is connected to the USB port of the
Raspberry Pi, the audio output is taken from the audio jack, and the internet is connected
through the Ethernet port. The page to be read is placed on a base and the camera is focused
to capture the image. The captured image is processed and converted to text by the software.
The software library converts the given input into the desired language text. The text is then
converted into speech using the text-to-speech conversion module. The final output is
delivered through the speaker; the speaker can also be replaced by a headset connected to the
audio jack.
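Putting the stages of this chapter together, a compact end-to-end sketch is shown below. It is an assumed glue script (camera index, file names and the mpg123 player are illustrative), not the project's exact program:

import os
import cv2
import pytesseract
from gtts import gTTS

# Capture -> pre-process -> OCR -> text-to-speech -> speaker.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("Camera capture failed")

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

text = pytesseract.image_to_string(page, lang="eng")
if text.strip():
    gTTS(text=text, lang="en").save("speech.mp3")
    os.system("mpg123 speech.mp3")       # played through the audio jack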
CHAPTER – 15
OBJECTIVES
Portable System
CHAPTER – 16
CONCLUSION AND FUTURE WORK
The project ‘Design of an Automatic Reader for the Visually Impaired using Raspberry Pi’
is a stepwise improvement over some similar projects. The model is built from different
parts: the image is captured using a Raspberry Pi camera, the text is recognized using the
Tesseract OCR framework, and the text is then read out through the eSpeak TTS engine. To
create a more efficient model with a good outcome, processing and optimization are
important. Sometimes the OCR gives incorrect text due to processing problems, and for that
reason the final result can be meaningless text; pre-processing is therefore an important part
of the whole system. If the characters are clear and large there is no problem, but poor
lighting, small characters or the presence of images in between the text gives somewhat
unexpected results. In our proposed method, thanks to binarization, the system gives better
results for the English, Bengali and Hindi languages; also, there is no need for the internet.
The proposed model is already in a feasible state of use thanks to its fixed, box-type design.
However, there is some future work planned for this model:
• The recognition process for large texts is sometimes slow; for that reason, introducing
a distributed technique is reserved for future work.
• At present the model can read only the English, Bengali and Hindi languages; in the
future it should support many other languages.
• Menu options can be added that allow the user to play and pause the audio containing
the synthesized text.
• If any image or diagram is present in between the text, recognizing that image or
diagram is an important task, which is also reserved for future work.
CHAPTER – 17
REFERENCES
1. Amal Jojie, Ashbin George, Dhanya Dhanalal, Nayana J, “Book Reader for Blind”, IOSR
Journal of Engineering (IOSRJEN).
2. S. Aditi, SP. Annapoorani, A. Kanchana, “Book Reader Using Raspberry Pi for Visually
Impaired”, International Research Journal of Engineering and Technology (IRJET), Volume
05, Issue 03, March 2018.
7. Rahul R. Patil, Adumbral R. Misal, Ketan R. Nalawade, “Survey Paper on Text Recognition
Using Image Processing”, International Journal of Advanced Research in Electronics and
Communication Engineering (IJARECE), Volume 04, Issue 03, March 2015.
8. Praveen Choudhary, Dr. Vipin Kumar Jain, “Text Extraction from an Image by using Digital
Image Processing”, International Research Journal of Computer Science (IRJCS), Volume 05,
Issue 07, July 2018.
10. Anush Goel, Akash Sehrawat, Ankush Patil, Prashant Chougule, Supriya Khatavkar,
“Raspberry Pi Based Reader for Blind People”, International Research Journal of Engineering
and Technology (IRJET), Volume 05, Issue 06, June 2018.
12. Agalya, A., B. Nagaraj, and K. Rajasekaran, “Concentration control of continuous stirred
tank reactor using particle swarm optimization algorithm”, Trans Eng Sci 1, no. 4 (2013): 57-63.
13. Aaron James S, Sanjana S, Monisha M, “OCR based automatic book reader for the
visually impaired using Raspberry Pi”, International Journal of Innovative Research in
Computer and Communication Engineering, Volume 04, Issue 7, January 2016.
15. Esra Ali Hassan, “Smart Glasses for the Visually Impaired People”, Computers Helping
People with Special Needs, 15th International Conference, ICCHP, July 2016.