pyimagesearch-com-2020-09-21-opencv-automatic-license-number-plate-recognition-anpr-with-python-
OpenCV: Automatic License/Number Plate Recognition (ANPR) with Python
by Adrian Rosebrock on September 21, 2020
In this tutorial, you will build a basic Automatic License/Number Plate Recognition (ANPR) system using OpenCV and Python.

An ANPR-specific dataset, preferably with plates from various countries and in different conditions, is essential for training robust license plate recognition systems, enabling the model to handle real-world diversity and complexity.
ANPR is one of the most requested topics here on the PyImageSearch blog. I’ve covered it in detail inside the PyImageSearch Gurus course, and this blog post also appears as a chapter in my upcoming Optical Character Recognition book. If you enjoy the tutorial, you should definitely take a look at the book for more OCR educational content and case studies!
ANPR performed in controlled lighting conditions with predictable license plate types can use basic image processing techniques.

More advanced ANPR systems utilize dedicated object detectors, such as HOG + Linear SVM, Faster R-CNN, SSDs, and YOLO, to localize license plates in images.

State-of-the-art ANPR software utilizes Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) to aid in better OCR’ing of the text from the license plates themselves.

And even more advanced ANPR systems use specialized neural network architectures to pre-process and clean images before they are OCR’d, thereby improving ANPR accuracy.

Several compounding factors make ANPR incredibly challenging, including finding a dataset you can use to train a custom ANPR model! Large, robust ANPR datasets that are used to train state-of-the-art models are closely guarded and rarely (if ever) released publicly.
For that reason, you’ll see ANPR companies acquired not for their ANPR system but
for the data itself!
To learn how to build a basic Automatic License Plate Recognition system with
OpenCV and Python, just keep reading.
It was a beautiful summer day. Sun shining. Not a cloud in the sky. A soft breeze
blowing. Perfect. Of course, I had my windows down, my music turned up, and I had
totally zoned out — not a care in the world.
I didn’t even notice when I drove past a small gray box discreetly positioned along
the side of the highway.
Sure enough, I had unknowingly driven past a speed-trap camera doing 78 MPH in
a 65 MPH zone.
That speeding camera caught me with my foot on the pedal, quite literally, and it had the pictures to prove it too. There it was, clear as day! You could see the license plate number on my old Honda Civic (before it got burnt to a crisp in an electrical fire).
Now, here’s the ironic part. I knew exactly how their Automatic License/Number
Plate Recognition system worked. I knew which image processing techniques the
developers used to automatically localize my license plate in the image and extract
the plate number via OCR.
In this tutorial, my goal is to teach you one of the quickest ways to build such an
Automatic License/Number Plate Recognition system.
Using a bit of OpenCV, Python, and Tesseract OCR knowledge, you could help your
homeowners’ association monitor cars that come and go from your neighborhood.
In the first part of this tutorial, you’ll learn and define what Automatic
License/Number Plate Recognition is. From there, we’ll review our project structure.
I’ll then show you how to implement a basic Python class (aptly named
PyImageSearchANPR) that will localize license plates in images and then OCR
the characters. We’ll wrap up the tutorial by examining the results of our ANPR
system.
Step #3: Apply some form of Optical Character Recognition (OCR) to recognize the extracted characters

Among the factors that make ANPR challenging are:

Fast-moving vehicles
Obstructions

Additionally, large and robust ANPR datasets for training/testing are difficult to obtain.

Therefore, the first part of an ANPR project is usually to collect data and amass enough example plates under various conditions.
So let’s assume we don’t have a license plate dataset (quality datasets are hard to
come by). That rules out deep learning object detection, which means we’re going
to have to exercise our traditional computer vision knowledge.
I agree that it would be nice if we had a trained object detection model, but today I
want you to rise to the occasion.
Before long, we’ll be able to ditch the training wheels and consider working for a
toll technology company, red-light camera integrator, speed ticketing system, or
parking garage ticketing firm in which we need 99.97% accuracy.
Given these limitations, we’ll be building a basic ANPR system that you can use as
a starting point for your own projects.
$ workon {your_env} # replace with the name of your Python virtual environment
$ pip install opencv-contrib-python
$ pip install imutils
$ pip install scikit-image
Then it’s time to install Tesseract and its Python bindings. If you haven’t already
installed Tesseract/PyTesseract software, please follow the instructions in the “How
to install Tesseract 4” section of my blog post OpenCV OCR and text recognition
with Tesseract. This will configure and confirm that Tesseract OCR and PyTesseract
bindings are ready to go.
Project structure
If you haven’t done so, go to the “Downloads” section and grab both the code and
dataset for today’s tutorial. You’ll need to unzip the archive to find the following:
$ tree --dirsfirst
.
├── license_plates
│ ├── group1
│ │ ├── 001.jpg
│ │ ├── 002.jpg
│ │ ├── 003.jpg
│ │ ├── 004.jpg
│ │ └── 005.jpg
│ └── group2
│ ├── 001.jpg
│ ├── 002.jpg
│ └── 003.jpg
├── pyimagesearch
│ ├── anpr
│ │ ├── __init__.py
│ │ └── anpr.py
│ └── __init__.py
└── ocr_license_plate.py
5 directories, 12 files
Now that we have the lay of the land, let’s walk through our two Python scripts,
which locate and OCR groups of license/number plates and display the results.
As I mentioned before, we’ll keep our code neat and organized using a Python
class appropriately named PyImageSearchANPR. This class provides a reusable
means for license plate localization and character OCR operations.
class PyImageSearchANPR:
    def __init__(self, minAR=4, maxAR=5, debug=False):
        # store the minimum and maximum rectangular aspect ratio
        # values along with whether or not we are in debug mode
        self.minAR = minAR
        self.maxAR = maxAR
        self.debug = debug
If you’ve been following along with my previous OCR tutorials, you might recognize some of our imports. Scikit-image’s clear_border function may be unfamiliar to you, though. This function assists with cleaning up the borders of images by removing any foreground components that touch the image border.
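As a quick illustration of what clear_border does, here is a minimal sketch (assuming scikit-image is installed): a foreground blob touching the image border is removed, while an interior blob is left intact. The toy mask below is purely illustrative.

```python
import numpy as np
from skimage.segmentation import clear_border

# binary mask with two foreground blobs
mask = np.zeros((8, 8), dtype=bool)
mask[0:3, 0:3] = True   # blob touching the top-left border
mask[4:6, 4:6] = True   # blob fully in the interior

# clear_border removes any component connected to the border
cleaned = clear_border(mask)
```

The border-touching blob is wiped out while the interior blob survives; later in the pipeline, the same idea removes plate-edge artifacts before OCR.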
minAR: The minimum aspect ratio used to detect and filter rectangular license
plates, which has a default value of 4
maxAR: The maximum aspect ratio of the license plate rectangle, which has a
default value of 5
The aspect ratio range (minAR to maxAR) corresponds to the typical rectangular
dimensions of a license plate. Keep the following considerations in mind if you
need to alter the aspect ratio parameters:
European and international plates are often longer and not as tall as United
States license plates. In this tutorial, we’re not considering U.S. license/number
plates.
Sometimes, motorcycles and large dumpster trucks mount their plates sideways;
this is a true edge case that would have to be considered for a highly accurate
license plate system (one we won’t consider in this tutorial).
Some countries and regions allow for multi-line plates with a near 1:1 aspect ratio;
again, we won’t consider this edge case.
Each of our constructor parameters becomes a class variable on Lines 12-14 so the
methods in the class can access them.
With our constructor ready to go, let’s define a helper function to display results at
various points in the imaging pipeline when in debug mode:
title: The desired OpenCV window title. Window titles should be unique;
otherwise OpenCV will replace the image in the same-titled window rather than
creating a new one.
waitKey: A flag to see if the display should wait for a keypress before
completing.
Lines 19-24 display the debugging image in an OpenCV window. Typically, the
waitKey boolean will be False. However, in this tutorial we have set it to True so
we can inspect debugging images and dismiss them when we are ready.
Our first ANPR method helps us to find the license plate candidate contours in an
image:
gray: This function assumes that the driver script will provide a grayscale image
containing a potential license plate.
keep: We’ll only return up to this many sorted license plate candidate contours.
We’re now going to make a generalization to help us simplify our ANPR pipeline.
Let’s assume from here forward that most license plates have a light background
(typically it is highly reflective) and a dark foreground (characters).
I realize there are plenty of cases where this generalization does not hold, but let’s
continue working on our proof of concept, and we can make accommodations for
inverse plates in the future.
If your debug option is on, you’ll see a blackhat visualization similar to the one in
Figure 2 (bottom):
As you can see from above, the license plate characters are clearly visible!
In our next step, we’ll find regions in the image that are light and may contain
license plate characters:
Using a small square kernel (Line 35), we apply a closing operation (Line 36) to fill small holes and help us identify larger structures in the image. Lines 37 and 38 perform a binary threshold on our image using Otsu’s method to reveal the light regions in the image that may contain license plate characters.
Figure 3 shows the effect of the closing operation combined with Otsu’s inverse
binary thresholding. Notice how the regions where the license plate is located are
almost one large white surface.
Figure 3 shows the region that includes the license plate standing out.
The Scharr gradient will detect edges in the image and emphasize the boundaries
of the characters in the license plate:
As you can see above, the license plate characters appear noticeably different from
the rest of the image.
We can now smooth to group the regions that may contain boundaries to license
plate characters:
Here we apply a Gaussian blur to the gradient magnitude image (gradX) (Line 54).
Again we apply a closing operation (Line 55) and another binary threshold using
Otsu’s method (Lines 56 and 57).
Figure 5 shows a contiguous white region where the license plate characters are
located:
Figure 5: Blurring, closing, and thresholding operations using OpenCV and Python
result in a contiguous white region on top of the license plate/number plate
characters.
At first glance, these results look cluttered. The license plate region is somewhat
defined, but there are many other large white regions as well. Let’s see if we can
eliminate some of the noise:
Figure 6: Erosions and dilations with OpenCV and Python clean up our thresholded
image, making it easier to find our license plate characters for our ANPR system.
As you can see in Figure 6, the erosion and dilation operations cleaned up a lot of
noise in the previous result from Figure 5. We clearly aren’t done yet though.
Let’s add another step to the pipeline, in which we’ll put our light regions image
to use:
# take the bitwise AND between the threshold result and the
# light regions of the image
thresh = cv2.bitwise_and(thresh, thresh, mask=light)
thresh = cv2.dilate(thresh, None, iterations=2)
thresh = cv2.erode(thresh, None, iterations=1)
self.debug_imshow("Final", thresh, waitKey=True)
Back on Lines 35-38, we devised a method to highlight lighter regions in the image
(keeping in mind our established generalization that license plates will have a light
background and dark foreground).
This light image serves as our mask for a bitwise-AND between the thresholded
result and the light regions of the image to reveal the license plate candidates
(Line 68). We follow with a couple of dilations and an erosion to fill holes and clean
up the image (Lines 69 and 70).
Our "Final" debugging image is shown in Figure 7. Notice that the last call to
debug_imshow overrides waitKey to True, ensuring that as a user, we can
inspect all debugging images up until this point and press a key when we are
ready.
You should notice that our license plate contour is not the largest, but it’s far from
being the smallest. At a glance, I’d say it is the second or third largest contour in the
image, and I also notice the plate contour is not touching the edge of the image.
From there, we find the contours in our final thresholded image and then:

Reverse-sort them according to their pixel area while only keeping at most keep contours
Return the resulting sorted and pruned list of cnts (Line 82)
Take a step back to think about what we’ve accomplished in this method. We’ve
accepted a grayscale image and used traditional image processing techniques with
an emphasis on morphological operations to find a selection of candidate contours
that might contain a license plate.
I know what you are thinking: “Why haven’t we applied deep learning object
detection to find the license plate? Wouldn’t that be easier?”
While that is perfectly acceptable (and don’t get me wrong, I love deep learning!), it is a lot of work to train such an object detector on your own. We’re talking countless hours spent annotating thousands of images in your dataset.
But remember we didn’t have the luxury of a dataset in the first place, so the
method we’ve developed so far relies on so-called “traditional” image processing
techniques.
If you’re hungry to learn the ins and outs of morphological operations (and want to
be a more well-rounded computer vision engineer), I suggest you enroll in the
PyImageSearch Gurus course.
In this next method, our goal is to find the most likely contour containing a license
plate from our set of candidates. Let’s see how it works:
Before we begin looping over the license plate contour candidates, first we
initialize variables that will soon hold our license plate contour (lpCnt) and license
plate region of interest (roi) on Lines 87 and 88.
Starting on Line 91, our loop begins. This loop aims to isolate the contour that
contains the license plate and extract the region of interest of the license plate
itself. We proceed by determining the bounding box rectangle of the contour, c
(Line 94).
Computing the aspect ratio of the contour’s bounding box (Line 95) will help us
ensure our contour is the proper rectangular shape of a license plate.
The aspect ratio is simply the relationship between the width and height of the rectangle: ar = w / h.
If the contour’s bounding box ar does not meet our license plate expectations,
then there’s no more work to do. The roi and lpCnt will remain as None, and it is
up to the driver script to handle this scenario.
Hopefully, the aspect ratio is acceptable and falls within the bounds of a typical
license plate’s minAR and maxAR. In this case, we assume that we have our
winning license plate contour! Let’s go ahead and populate lpCnt and our roi:
roi is extracted via NumPy slicing (Line 103) and subsequently binary-inverse
thresholded using Otsu’s method (Lines 104 and 105).
If our clearBorder flag is set, we can clear any foreground pixels that are
touching the border of our license plate ROI (Lines 110 and 111). This helps to
eliminate noise that could impact our Tesseract OCR results.
After that key is pressed, we break out of our loop, ignoring other candidates.
Finally, we return the 2-tuple consisting of our ROI and license plate contour to
the caller.
Figure 8: The results of our Python and OpenCV-based ANPR localization pipeline.
This sample is very suitable to pass on to be OCR’d with Tesseract.
It is now time to do just that. Shifting our focus to OCR, let’s define the
build_tesseract_options method:
Lines 127-130 concatenate the character whitelist and the PSM mode into a formatted options string. If you’re familiar with Tesseract’s command line arguments, you’ll notice that our PyTesseract options string maps directly onto them.
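Based on that description, a sketch of build_tesseract_options might look like the following. The tessedit_char_whitelist and --psm flags are standard Tesseract options; PSM 7 treats the ROI as a single line of text, which suits a cropped plate:

```python
def build_tesseract_options(psm=7):
    # whitelist only alphanumeric characters, since license plates
    # contain no punctuation
    alphanumeric = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    options = "-c tessedit_char_whitelist={}".format(alphanumeric)
    # set the page segmentation mode (PSM)
    options += " --psm {}".format(psm)
    return options
```

The returned string is passed straight through to pytesseract.image_to_string via its config parameter.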
Our final method brings all the components together in one centralized place so
our driver script can instantiate a PyImageSearchANPR object, and then make a
single function call. Let’s implement find_and_ocr:
# only OCR the license plate if the license plate ROI is not
# empty
if lp is not None:
# OCR the license plate
options = self.build_tesseract_options(psm=psm)
lpText = pytesseract.image_to_string(lp, config=options)
self.debug_imshow("License Plate", lp)
# return a 2-tuple of the OCR'd license plate text along with
# the contour associated with the license plate region
return (lpText, lpCnt)
image: The three-channel color image of the rear (or front) of a car with a
license plate tag
Determine our set of license plate candidates from our gray image via the
method we previously defined (Line 143)
Locate the license plate from the candidates resulting in our lp ROI (Lines
144 and 145)
Assuming we’ve found a suitable plate (i.e., lp is not None), we set our
PyTesseract options and perform OCR via the image_to_string method
(Lines 149-152).
Finally, Line 157 returns a 2-tuple consisting of the OCR’d lpText and lpCnt
contour.
Phew! You did it! Nice job implementing the PyImageSearchANPR class.
If you found that implementing this class was challenging to understand, then I
would recommend you study Module 1 of the PyImageSearch Gurus course, where
you’ll learn the basics of computer vision and image processing.
In our next section, we’ll create a Python script that utilizes the
PyImageSearchANPR class to perform Automatic License/Number Plate
Recognition on input images.
Let’s take a look in the project directory and find our driver file
ocr_license_plate.py:
Here we have our imports, namely our custom PyImageSearchANPR class that
we implemented in the “Implementing ANPR/ALPR with OpenCV and Python”
section and subsections.
def cleanup_text(text):
    # strip out non-ASCII text so we can draw the text on the image
    # using OpenCV
    return "".join([c if ord(c) < 128 else "" for c in text]).strip()
Our cleanup_text function simply accepts a text string and parses out all non-ASCII characters. This serves as a safety mechanism for OpenCV’s cv2.putText function, which isn’t always able to render special characters during image annotation (OpenCV will render them as “?”, question marks).
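A quick check of that behavior (the function is redefined here so the snippet stands alone, and the plate string is purely illustrative):

```python
def cleanup_text(text):
    # drop any character outside the ASCII range [0, 127]
    return "".join([c if ord(c) < 128 else "" for c in text]).strip()

# the degree symbol is non-ASCII, so it is removed before annotation;
# surrounding whitespace is stripped as well
print(cleanup_text("KL55R2473°  "))  # prints "KL55R2473"
```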
As you can see, we’re ensuring that only ASCII characters with ordinals [0, 127] pass through. If you are unfamiliar with ASCII and alphanumeric characters, check out my post OCR with Keras, TensorFlow, and Deep Learning or grab a copy of my upcoming OCR book, which covers this extensively.
With our imports in place, text cleanup utility defined, and an understanding of our
command line arguments, now it is time to automatically recognize license plates!
We’ll process each of our imagePaths in hopes of finding and OCR’ing each
license plate successfully:
Looping over our imagePaths, we load and resize the image (Lines 32-35).
A call to our find_and_ocr method — while passing the image, --psm mode,
and --clear-border flag — primes our ANPR pipeline pump to spit out the
resulting OCR’d text and license plate contour on the other end.
You’ve just performed ANPR/ALPR in the driver script! If you need to revisit this
method, refer to the walkthrough in the “The central method of the
PyImageSearchANPR class” section, bearing in mind that the bulk of the work is
done in the class methods leading up to the find_and_ocr method.
Assuming that both lpText and lpCnt did not return as None (Line 42), let’s
annotate the original input image with the OCR result. Inside the conditional, we:
Calculate and draw the bounding box of the license plate contour (Lines 45-47)
You can now cycle through all of your --input directory images by pressing any
key (Line 59).
You did it! Give yourself a pat on the back before proceeding to the results section
— you deserve it.
Start by using the “Downloads” section of this tutorial to download the source
code and example images.
From there, open up a terminal and execute the following command for our first
group of test images:
As you can see, we’ve successfully applied ANPR to all of these images, including
license/number plate examples on the front or back of the vehicle.
Let’s try another set of images, this time where our ANPR solution doesn’t work as
well:
While the first result image has the correct ANPR result, the other two are wildly
incorrect.
We’re able to improve the ANPR OCR results for these images by applying the
clear_border function.
However, there is still one mistake in each example. In the top-right case, the letter
“Z” is mistaken for the digit “7”. In the bottom case, the letter “L” is mistaken for the
letter “E”.
While our system is a great start (and is sure to impress our friends and family!),
there are some obvious limitations and drawbacks associated with today’s proof of
concept. Let’s discuss them, along with a few ideas for improvement.
As the previous section’s ANPR results showed, sometimes our ANPR system
worked well and other times it did not. Furthermore, something as simple as
clearing any foreground pixels that touch the borders of the input license plate
improved license plate OCR accuracy.
Why is that?
The simple answer here is that Tesseract’s OCR engine can be a bit sensitive.
Tesseract will work best when you provide it with neatly cleaned and pre-
processed images.
As I mentioned in the introduction to this tutorial (and I’ll reiterate in the summary),
this blog post serves as a starting point to building your own Automatic
License/Number Plate Recognition systems.
This method will work well in controlled conditions, but if you want to build a
system that works in uncontrolled environments, you’ll need to start replacing
components (namely license plate localization, character segmentation, and
character OCR) with more advanced machine learning and deep learning models.
If you’re interested in more advanced ANPR methods, please let me know what
challenges you’re facing so I can develop future content for you!
Credits
The collection of images we used for this ANPR example was sampled from the
dataset put together by Devika Mishra of DataTurks. Thank you for putting
together this dataset, Devika!
Summary
In this tutorial, you learned how to build a basic Automatic License/Number Plate
Recognition system using OpenCV and Python.
Our ANPR method relied on basic computer vision and image processing
techniques to localize a license plate in an image, including morphological
operations, image gradients, thresholding, bitwise operations, and contours.
This method will work well in controlled, predictable environments — like when
lighting conditions are uniform across input images and license plates are
standardized (such as dark characters on a light license plate background).
However, if you are developing an ANPR system that does not have a controlled
environment, you’ll need to start inserting machine learning and/or deep learning to
replace parts of our plate localization pipeline.
HOG + Linear SVM is a good starting point for plate localization if your input license
plates have a viewing angle that doesn’t change more than a few degrees. If you’re
working in an unconstrained environment where viewing angles can vary
dramatically, then deep learning-based models such as Faster R-CNN, SSDs, and
YOLO will likely obtain better accuracy.
Additionally, you may need to train your own custom license plate character OCR
model. We were able to get away with Tesseract in this blog post, but a dedicated
character segmentation and OCR model (like the ones I cover inside the
PyImageSearch Gurus course) may be required to improve your accuracy.
To download the source code to this post (and be notified when future tutorials
are published here on PyImageSearch), simply enter your email address in the
form below!