License Plate Detection - A write-up
What is license plate detection?
License plate detection is an example of an object detection task. At the simplest, it involves finding the region, or the boundary of a segment that contains the license plate of a vehicle. Note that this is different from "license plate recognition" whereby you need to perform optical character recognition (OCR) after license plate detection.
This sounds like a simple thing to do. After all, we all can do it. However, for a computer, it doesn't see the world the way we does. Instead of a nice picture of a car with license plate somewhere inside, it sees a matrix of numbers. Each cell in the matrix is called a pixel and it contains 3 numbers representing the intensity value of red, green and blue color respectively. Together, all the pixels made up the image the computer sees.
The challenge here, it seems, is basically to find out, of the many ways you can segment the image (a fancy way of saying how to cut a piece of rectangular image out from a bigger rectangle), you need to find one that contains the object we want (In this case being the license plate).
Before one proceed to understand the various methods for solving license plate detection, you need to understand what features are. A feature is a piece of information that tells you something about the object of interest. For example, if you are trying to determine if a person is smart, you may use his IQ or his CGPA. IQ and CGPA in this case is a feature. In license plate detection (or any other object detection problem) you need features that are able to tell if a region contains a license plate or not.
Think about this a little bit. If hypothetically, I give you a particular a region on the image, how do you know if you have found a license plate?
You may say that the region is not a license plate because there are just too many "white"-like pixels. Or you may answer that the region can't be a license plate because there are no characters in it. All these are called a feature.
One very common method for solving license plate detection is what we called a sliding windows approach. The idea is to try every possible way to segment the image and determine if a region is a license plate or not. Or to put this in another way, imagine you peeking inside a room through a small hole and you move your eyes around to look inside the room. At each time, you can only see a very small part of the room but eventually, you are bound to find what you want. You may be tempted to think this is impossible. After all, there are simply too many ways to segment the image. However, since a computer is fast, making sliding windows possible. In practice however, sliding windows is slow and undesirable.
SIFT, or Scale-Invariant Feature Transform, is one kind of feature that you can use to solve license plate detection. Without going too much into detail, all you need to know about SIFT on the high level is that it returns you a list of points in the image.
Each SIFT point in the image above is represented by a small red circle (in reality, there should be much more SIFT points). One thing you will noticed is that SIFT points tend to be in places like corners or edges. Each SIFT point also contains a descriptor. This descriptor is a feature that describes its surroundings. Another characteristic of SIFT points is that no matter how you rotate the picture, make the picture bigger or smaller, or chop the images into smaller pieces, it will gives you more or less the same SIFT points. We say that SIFT is translation, scale and rotation invariant.
Some useful and related links:
This write-up is a work in progress. Last updated: 14th June 2011.