
Tutorial: OCR with PaddleOCR (PP-OCR)

In this post, I test the PP-OCR system for optical character recognition.

Anh Tuan
4 min read · Jun 20, 2022

PP-OCR is a practical, ultra-lightweight OCR system that can be easily deployed on edge devices such as cameras and mobile phones. I wrote reviews of the algorithms and strategies used in the model. You can read them here:

1. Installation

Some configuration information on my computer:

  • OS: Windows 10 Pro 64-bit
  • CPU: Intel(R) Core(TM) i5-9500 CPU @ 3.00 GHz
  • RAM: 64 GB
  • GPU: NVIDIA GeForce RTX 2060 6GB
  • Python Environment: Python 3.8.5

First, clone the official code from GitHub:

git clone https://github.com/PaddlePaddle/PaddleOCR.git

Next, I install PaddlePaddle. If you have CUDA 9 or CUDA 10 installed on your machine, run:

python3 -m pip install paddlepaddle-gpu

If you only have a CPU, run:

python3 -m pip install paddlepaddle
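Before going further, it can help to confirm that the package actually landed in the active Python environment. The following is a minimal sketch using only the standard library. The helper name is_installed is my own (hypothetical); the only fact it relies on is that both PaddlePaddle packages are imported under the name paddle.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    # Both paddlepaddle and paddlepaddle-gpu are imported as `paddle`.
    print("paddle importable:", is_installed("paddle"))
```

If this reports that paddle is importable, PaddlePaddle's own paddle.utils.run_check() can then be used for a fuller check of the installation.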

Then, I download the pretrained models. You can find them here:

There are many trained models of different sizes.

If you want to use multiple languages (Korean, Japanese, …), you can download them from here:

2. Inference

As an example, I use the English ultra-lightweight PP-OCRv3 model for inference. I download the inference models for detection, direction classification, and recognition, save them to ./inference/det, ./inference/cls, and ./inference/reg respectively, and extract them there.
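The extraction step can also be scripted. This is a minimal sketch, assuming the models were downloaded as tar archives; the function names extract_model and list_model_dirs are my own (hypothetical), and the archive file name in the usage comment is an assumption based on the folder names used in this post.

```python
import tarfile
from pathlib import Path


def extract_model(archive_path: str, dest_dir: str) -> Path:
    """Extract a downloaded model archive into dest_dir and return that directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create ./inference/det etc. if missing
    with tarfile.open(archive_path) as archive:
        # Only extract archives from a source you trust.
        archive.extractall(dest)
    return dest


def list_model_dirs(root: str) -> list:
    """List the sub-folders under root, e.g. the extracted *_infer folders."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


# Example usage (the archive name is an assumption, adjust to your download):
# extract_model("./inference/det/en_PP-OCRv3_det_infer.tar", "./inference/det")
# print(list_model_dirs("./inference/det"))
```

Listing the sub-folders afterwards is a quick way to check that the folder names match the paths passed to the prediction script.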

The folder ./tools/infer in the repository contains the scripts for prediction.

To run the full OCR pipeline, use predict_system.py:

python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/254.jpg" --det_model_dir="./inference/det/en_PP-OCRv3_det_infer/" --cls_model_dir="./inference/cls/ch_ppocr_mobile_v2.0_cls_infer/" --rec_model_dir="./inference/reg/en_PP-OCRv3_rec_infer/"  --rec_char_dict_path="./ppocr/utils/en_dict.txt"

The parameters are:

  • image_dir: path to the input image
  • det_model_dir: path to the detection inference model
  • cls_model_dir: path to the angle classification inference model
  • rec_model_dir: path to the recognition inference model
  • rec_char_dict_path: path to the English dictionary

The visualized recognition results are saved to the ./inference_results folder by default. There are many more parameters you can adjust; see the file utility.py for details.
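If you prefer to launch the script from Python, for example to loop over several images, the command above can be assembled as an argument list. This is only a sketch: the wrapper names build_ocr_command and run_ocr are my own (hypothetical), while the script path and flags are the ones from the command above.

```python
import subprocess
import sys


def build_ocr_command(image_dir, det_model_dir, cls_model_dir,
                      rec_model_dir, rec_char_dict_path):
    """Assemble the argument list for tools/infer/predict_system.py."""
    return [
        sys.executable,  # use the same interpreter that runs this script
        "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--cls_model_dir={cls_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
        f"--rec_char_dict_path={rec_char_dict_path}",
    ]


def run_ocr(command, repo_dir="."):
    """Run the command from the PaddleOCR repository root; raise if it fails."""
    return subprocess.run(command, cwd=repo_dir, check=True)


command = build_ocr_command(
    image_dir="./doc/imgs_en/254.jpg",
    det_model_dir="./inference/det/en_PP-OCRv3_det_infer/",
    cls_model_dir="./inference/cls/ch_ppocr_mobile_v2.0_cls_infer/",
    rec_model_dir="./inference/reg/en_PP-OCRv3_rec_infer/",
    rec_char_dict_path="./ppocr/utils/en_dict.txt",
)
print(" ".join(command))
# run_ocr(command)  # uncomment inside the cloned PaddleOCR folder
```

Passing the arguments as a list avoids shell quoting problems, which is convenient on Windows.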

You can also run each model separately with predict_det.py, predict_cls.py, and predict_rec.py.

3. Training

For training, you can find the details in the authors' source:

  • Training text detection:
  • Training text direction classification:
  • Training text recognition:

Conclusion

In this post, I wrote about how to set up, train, and test the PP-OCR model. You can find the official source code at:

If you have any questions or want me to check out other open-source projects, please comment below or contact me via LinkedIn or GitHub.

If you enjoyed this, please consider supporting me.
