Python Project

The document describes a Python project that uses OpenCV and Tesseract OCR to extract text from an image file. It imports required packages, reads an image, performs preprocessing like grayscale conversion and thresholding, finds contours, crops text blocks and applies OCR to recognize the text.
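To make the thresholding step more concrete, the following is a minimal pure-Python sketch of the idea behind Otsu's method: pick the grey level that maximizes the variance between the two resulting pixel classes. The function name `otsu_threshold` and the sample pixel values are illustrative assumptions, not part of the project, which delegates this work to `cv2.threshold` with `cv2.THRESH_OTSU`. OpenCV's own implementation may break ties differently.

```python
def otsu_threshold(pixels):
    """Return the grey level t (0-255) that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the class <= t
    sum0 = 0  # sum of grey levels in that class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        # Proportional to the between-class variance
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Hypothetical sample: dark "text" pixels and a light "background"
sample = [20, 22, 25, 30] * 10 + [200, 210, 220, 230] * 10
t = otsu_threshold(sample)

# Same convention as THRESH_BINARY_INV: dark pixels become 255, light ones 0
binary_inv = [255 if p <= t else 0 for p in sample]
```

With the inverted output, the text ends up as white foreground on a black background, which is the form the later dilation and contour steps operate on.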

Uploaded by

study.aaaashishhh

PYTHON PROJECT

CODE :

# Import required packages

import cv2
import pytesseract

# Mention the installed location of Tesseract-OCR in your system

pytesseract.pytesseract.tesseract_cmd = '/opt/homebrew/bin/tesseract'

# Read image from which text needs to be extracted

img = cv2.imread("as.jpg")

# Preprocessing the image starts

# Convert the image to gray scale

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Performing OTSU threshold

ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

# Specify structure shape and kernel size.

# Kernel size increases or decreases the area
# of the rectangle to be detected.
# A smaller value like (10, 10) will detect
# each word instead of a sentence.
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))

# Applying dilation on the threshold image

dilation = cv2.dilate(thresh1, rect_kernel, iterations = 1)

# Finding contours
contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)

# Creating a copy of image

im2 = img.copy()

# A text file is created and flushed

file = open("recognized.txt", "w+")
file.write("")
file.close()

# Looping through the identified contours

# Then the rectangular part is cropped and passed on
# to pytesseract for extracting text from it
# Extracted text is then written into the text file
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)

    # Drawing a rectangle on copied image
    rect = cv2.rectangle(im2, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Cropping the text block for giving input to OCR
    cropped = im2[y:y + h, x:x + w]

    # Open the file in append mode
    file = open("recognized.txt", "a")

    # Apply OCR on the cropped image
    text = pytesseract.image_to_string(cropped)

    # Appending the text into file
    file.write(text)
    file.write("\n")

    # Close the file (the parentheses are needed to actually call close)
    file.close()
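
The comment on kernel size can be illustrated with a small one-dimensional analogue of the dilation step. This is only a sketch in pure Python, not OpenCV code: the helpers `dilate_row` and `count_runs` and the sample row are hypothetical, and a run of consecutive foreground pixels stands in for an external contour. It shows that a narrow window leaves two nearby "words" as separate blocks, while a wider window merges them into one.

```python
def dilate_row(row, width):
    """Binary dilation of a 1-D row with a flat, centred window of odd `width`."""
    half = width // 2
    n = len(row)
    return [
        1 if any(row[max(0, i - half):min(n, i + half + 1)]) else 0
        for i in range(n)
    ]


def count_runs(row):
    """Count maximal runs of consecutive 1s (a 1-D stand-in for contours)."""
    runs, prev = 0, 0
    for v in row:
        if v and not prev:
            runs += 1
        prev = v
    return runs


# Two "words" of 5 foreground pixels separated by a 6-pixel gap
row = [0] * 3 + [1] * 5 + [0] * 6 + [1] * 5 + [0] * 3

small = dilate_row(row, 3)  # each run grows by 1 pixel per side, gap stays open
large = dilate_row(row, 9)  # each run grows by 4 pixels per side, gap closes

blocks_small = count_runs(small)
blocks_large = count_runs(large)
```

The same reasoning applies in two dimensions: the (18, 18) kernel used above is large enough to bridge the gaps between words on a line, so each detected rectangle tends to cover a sentence or text block rather than a single word.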

OUTPUT :

IMAGE FILE :

RECOGNIZED TEXT :