Franky1/Streamlit-Tesseract

Streamlit Tesseract OCR 🔎 📄

Streamlit project with Tesseract OCR running on Streamlit Cloud.

App Usage 💻

  1. Upload an image with text on it
  2. Select the language
  3. Select the image preprocessing options (if needed) and check the result in the preview
  4. Crop the image to the text area (if needed)
  5. Run the OCR and check the result in the text preview
  6. Adjust the settings or image preprocessing and run the OCR again (if needed)
  7. Download the result as a text file or copy from the text preview

Languages

Installed languages for Tesseract OCR

🇬🇧 🇪🇸 🇫🇷 🇩🇪 🇮🇹 🇵🇹 🇨🇿 🇵🇱
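
Tesseract identifies languages by three-letter codes, and several of them can be combined with `+`. The following is a minimal sketch of how the installed languages listed above could be mapped to those codes. The dictionary `LANGUAGE_CODES` and the helper `tesseract_lang_arg` are hypothetical names chosen for illustration; they are not taken from this app's code.

```python
# Hypothetical mapping from a display name to the Tesseract language code.
LANGUAGE_CODES = {
    "English": "eng",
    "Spanish": "spa",
    "French": "fra",
    "German": "deu",
    "Italian": "ita",
    "Portuguese": "por",
    "Czech": "ces",
    "Polish": "pol",
}


def tesseract_lang_arg(selected):
    """Join the selected languages into one Tesseract language string."""
    unknown = [name for name in selected if name not in LANGUAGE_CODES]
    if unknown:
        raise ValueError(f"Language not installed: {', '.join(unknown)}")
    return "+".join(LANGUAGE_CODES[name] for name in selected)


print(tesseract_lang_arg(["English", "German"]))  # eng+deu
```

The resulting string could be passed to the `tesseract` command line as `-l eng+deu`, or as the `lang` argument of a Python wrapper such as pytesseract, assuming that is the wrapper in use.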

Status ✔️

Streamlit application is working - 04.06.2024

ToDo 📝

Future Ideas 💡

Libraries 📚

Tesseract

EasyOCR

pdf2image

OpenCV

OpenCV is used for image preprocessing before running OCR with Tesseract.

Pillow

Image Preprocessing 🖼️

Grayscale Conversion

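import cv2
import numpy as np

# Standard conversion; assumes `image` is a BGR array, e.g. from cv2.imread()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# or weight the channels yourself (OpenCV channel order is B, G, R)
coefficients = [1, 0, 0]  # gives the blue channel all the weight
# for standard gray conversion, coefficients = [0.114, 0.587, 0.299]
m = np.array(coefficients, dtype=np.float32).reshape((1, 3))
blue = cv2.transform(image, m)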
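
To make the role of the coefficients concrete, here is a NumPy-only sketch of the weighted sum that a 1x3 transform matrix applies to every pixel. The function name `weighted_gray` is an assumption for illustration, and the rounding and clipping are written out by hand, so treat it as an approximation of what `cv2.transform` does and not as its implementation.

```python
import numpy as np


def weighted_gray(image_bgr, coefficients):
    """Collapse a (H, W, 3) BGR image to one channel as a weighted sum."""
    weights = np.asarray(coefficients, dtype=np.float32)
    # Matrix product over the last axis: B*w0 + G*w1 + R*w2 for each pixel
    gray = image_bgr.astype(np.float32) @ weights
    # Round and clip back into the 8-bit range
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


# A single pixel with B=10, G=20, R=30
pixel = np.array([[[10, 20, 30]]], dtype=np.uint8)
print(weighted_gray(pixel, [1, 0, 0]))              # blue channel only
print(weighted_gray(pixel, [0.114, 0.587, 0.299]))  # standard weights in B, G, R order
```

Because OpenCV stores channels as B, G, R, the standard weights appear in reverse of the familiar R, G, B order.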

Brightness and Contrast

Image Rotation 🔄

Methods to rotate an image with different libraries.

... with Pillow 🔄

https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.rotate

from PIL import Image

with Image.open("hopper.jpg") as im:
    theta = 60  # angle in degrees, counter clockwise
    white = (255, 255, 255)
    # expand=True enlarges the canvas so the rotated corners are not cropped;
    # fillcolor paints the area outside the rotated image white
    im_rotated = im.rotate(angle=theta, resample=Image.Resampling.BICUBIC, expand=True, fillcolor=white)

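... with OpenCV 🔄

Destructive rotation: the output canvas keeps the original size, so corners that rotate outside it are lost.

import cv2

# assumes `image` is a NumPy array and `angle` is in degrees (positive = counter clockwise)
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1)  # scale factor 1
rotated = cv2.warpAffine(image, M, (w, h))  # output size stays (w, h)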
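
The cropping comes from reusing `(w, h)` as the output size. One way around it is to compute the bounding box of the rotated image and shift the rotation matrix so the image stays centred. The sketch below does this with the standard library only; the function name `rotation_matrix_with_bounds` is hypothetical, and the matrix layout is assumed to match the one returned by `cv2.getRotationMatrix2D` with scale 1.

```python
import math


def rotation_matrix_with_bounds(w, h, angle_deg):
    """Return (M, (new_w, new_h)) for rotating a w x h image without cropping.

    M is a 2x3 affine matrix as nested lists, with a positive angle meaning
    counter clockwise rotation and a scale factor of 1.
    """
    cx, cy = w / 2.0, h / 2.0
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))

    # Size of the axis-aligned box that encloses the rotated image
    new_w = round(h * abs(b) + w * abs(a))
    new_h = round(h * abs(a) + w * abs(b))

    # Rotate about the old centre, then shift that centre to the middle
    # of the enlarged canvas
    tx = (1 - a) * cx - b * cy + (new_w / 2.0 - cx)
    ty = b * cx + (1 - a) * cy + (new_h / 2.0 - cy)
    return [[a, b, tx], [-b, a, ty]], (new_w, new_h)


M, size = rotation_matrix_with_bounds(200, 100, 90)
print(size)  # (100, 200)
```

With OpenCV, the result could then be applied as `cv2.warpAffine(image, np.array(M, dtype=np.float32), size)`.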

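... with imutils 🔄

Non-destructive rotation: the output canvas is enlarged so that the whole rotated image is kept.

import imutils

# note: rotate_bound treats positive angles as clockwise
rotated = imutils.rotate_bound(image, angle)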

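... with scipy 🔄

Destructive or non-destructive rotation, selected by the reshape parameter.

from scipy.ndimage import rotate as rotate_image

# reshape=True enlarges the output to hold the whole rotated image; reshape=False keeps the input shape
# reshape, mode and cval must be passed by keyword: the third positional parameter is `axes`
rotated_img1 = rotate_image(image, angle, reshape=reshape, mode=mode, cval=cval)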