pdf-ocr-extraction

Here are 10 public repositories matching this topic...

skylander86 / lambda-text-extractor

AWS Lambda functions to extract text from various binary formats.

pdf ocr aws-lambda lambda-functions tesseract text-extraction searchable-pdfs pdf-ocr-extraction

Updated Feb 7, 2018
Python

Clearedge-AI / clearedge

Build a RAG preprocessing pipeline

pdf ocr haystack pdf-to-text document-parser pdf-ocr-extraction pdf-to-json table-recognition table-detection llm langchain llamaindex retrieval-augmented-generation rag-pipeline

Updated Apr 7, 2024
Jupyter Notebook

omaxel / pdf-ocr

Recognize page content of a PDF as text using Tesseract and Ghostscript.

pdf ghostscript ocr csharp tesseract-ocr pdf-ocr-extraction

Updated Jan 9, 2018
C#

Achiwilms / OCR-Wizard

A powerful and user-friendly tool based on OCRmyPDF, offering a seamless GUI for conversion of image-based PDFs into searchable text.

python pdf ocrmypdf ocr-recognition pdf-ocr-extraction ocr-python searchable-pdf ocr-pdf pdf-ocr

Updated Oct 28, 2023
Python

BBC-Esq / Fast-PyOCR

Simple and reliable script for fast, high-quality OCR on a PDF.

pdf ocr tesseract-ocr pdf-ocr-extraction ocr-python tesseract-ocr-engine windows-ocr pdf-ocr

Updated Sep 22, 2024
Python

lakshay1296 / OCR_Django_App_Beta

Example Django (Python) project containing modules for OCR, PDF to OCR PDF conversion, text similarity/dissimilarity, and PDF to PNG conversion.

imagemagick django-application python27 html-css-javascript ocr-recognition django-project pdf-ocr-extraction ocr-python

Updated May 21, 2019
Python

fsdesa / pdf-ocr-service

PDF OCR service in Docker.

java docker pdf ocr afip factura-afip pdf-ocr-extraction

Updated Oct 8, 2022
Java

VerisimilitudeX / ocr_pdf2txt

Use Optical Character Recognition technology to convert scanned PDFs into TXT files locally.

ocr-recognition pdf-document-processor pdf-ocr-extraction ocr-python

Updated Jan 22, 2025
Python

Firefox-1998 / UtilityPDF

Utility that collects in one place some operations commonly performed on PDF files.

pdf utility ocr csharp convert pdf-converter merge rtf docx compress pdf-merge pdf-ocr-extraction pdf-compression

Updated Aug 18, 2024
C#

mcagriaksoy / diff_merge_pdf

A tool to compare and merge PDFs, display the differences between them, and run OCR on them.

pdf-viewer pdf-generator pdf-merger ocr-recognition pdf-comparison x-ray-images ocr-text-reader diff-tool pdf-document-processor pdf-ocr-extraction pyqt6-desktop-application pymupdf-fitz pdf-ocr pdf-visual-testing diff-tool-pdf

Updated Jan 21, 2024
Python
