@@ -16,7 +16,7 @@

\section*{Project members}
Gijs van der Voort\\
-Richard Torenvliet\\
+Raichard Torenvliet\\
Jayke Meijer\\
Tadde\"us Kroes\\
Fabi\"en Tesselaar
@@ -36,7 +36,7 @@ Reading license plates with a computer is much more difficult. Our dataset
contains photographs of license plates from various angles and distances. This
means that not only do we have to implement a method to read the actual
characters, but given the location of the license plate and each individual
-character, we must make sure we transform each character to a standard form.
+character, we must make sure we transform each character to a standard form.

Determining what character we are looking at will be done by using Local Binary
Patterns. The main goal of our research is finding out how effective LBP's are
@@ -57,9 +57,9 @@ In short our program must be able to do the following:

The actual purpose of this project is to check if LBP is capable of recognizing
license plate characters. We knew the LBP implementation would be pretty
-simple. Thus an advantage had to be its speed compared with other license plate
+simple. Thus an advantage had to be its speed compared with other license plate
recognition implementations, but the uncertainty of whether we could get some
-results made us pick Python. We felt Python would not restrict us as much in
+results made us pick Python. We felt Python would not restrict us as much in
assigning tasks to each member of the group. In addition, when using the
correct modules to handle images, Python can be decent in speed.

@@ -140,7 +140,7 @@ The outcome of this operations will be a binary pattern.

\item Given this pattern, the next step is to divide the pattern in cells. The
amount of cells depends on the quality of the result, so trial and error is in
-order. Starting with dividing the pattern in to cells of size 16.
+order. Starting with dividing the pattern in to cells of size 16.

\item Compute a histogram for each cell.

@@ -154,7 +154,7 @@ order. Starting with dividing the pattern in to cells of size 16.
result is a feature vector of the image.

\item Feed these vectors to a support vector machine. This will ''learn'' which
-vector indicates what vector is which character.
+vector indicates what vector is which character.

\end{itemize}

@@ -186,7 +186,7 @@ choices we made.
\subsection{Character retrieval}

In order to retrieve the characters from the entire image, we need to
-perform a perspective transformation. However, to do this, we need to know the
+perform a perspective transformation. However, to do this, we need to know the
coordinates of the four corners of each character. For our dataset, this is
stored in XML files. So, the first step is to read these XML files.

@@ -196,7 +196,7 @@ The XML reader will return a 'license plate' object when given an XML file. The
licence plate holds a list of, up to six, NormalizedImage characters and from
which country the plate is from. The reader is currently assuming the XML file
and image name are corresponding, since this was the case for the given
-dataset. This can easily be adjusted if required.
+dataset. This can easily be adjusted if required.

To parse the XML file, the minidom module is used. So the XML file can be
treated as a tree, where one can search for certain nodes. In each XML
@@ -205,7 +205,7 @@ will do is retrieve the current and most up-to-date version of the plate. The
reader will only get results from this version.

Now we are only interested in the individual characters so we can skip the
-location of the entire license plate. Each character has
+location of the entire license plate. Each character has
a single character value, indicating what someone thought what the letter or
digit was and four coordinates to create a bounding box. If less then four
points have been set the character will not be saved. Else, to make things not
@@ -215,7 +215,7 @@ it gives some extra freedom when using the data.
When four points have been gathered the data from the actual image is being
requested. For each corner a small margin is added (around 3 pixels) so that no
features will be lost and minimum amounts of new features will be introduced by
-noise in the margin.
+noise in the margin.

In the next section you can read more about the perspective transformation that
is being done. After the transformation the character can be saved: Converted
@@ -232,12 +232,12 @@ rectangle.

\subsection{Noise reduction}

-The image contains a lot of noise, both from camera errors due to dark noise
-etc., as from dirt on the license plate. In this case, noise therefore means
+The image contains a lot of noise, both from camera errors due to dark noise
+etc., as from dirt on the license plate. In this case, noise therefore means
any unwanted difference in color from the surrounding pixels.

\paragraph*{Camera noise and small amounts of dirt}
-The dirt on the license plate can be of different sizes. We can reduce the
+The dirt on the license plate can be of different sizes. We can reduce the
smaller amounts of dirt in the same way as we reduce normal noise, by applying
a Gaussian blur to the image. This is the next step in our program.\\
\\
@@ -256,7 +256,7 @@ characters will still be conserved in the LBP, even if there is dirt
surrounding the character.

\subsection{Creating Local Binary Patterns and feature vector}
-Every pixel is a center pixel and it is also a value to evaluate but not at the
+Every pixel is a center pixel and it is also a value to evaluate but not at the
same time. Every pixel is evaluated as shown in the explanation
of the LBP algorithm. There are several neighbourhoods we can evaluate. We have
tried the following neighbourhoods:
@@ -276,12 +276,12 @@ would add a computational step, thus making the code execute slower. In the
next section we will describe what the best neighbourhood was.

Take an example where the full square can be evaluated, so none of the
-neighbours are out of bounds. The first to be checked is the pixel in the left
-bottom corner in the square 3 x 3, with coordinate $(x - 1, y - 1)$ with $g_c$
+neighbours are out of bounds. The first to be checked is the pixel in the left
+bottom corner in the square 3 x 3, with coordinate $(x - 1, y - 1)$ with $g_c$
as center pixel that has coordinates $(x, y)$. If the grayscale value of the
neighbour in the left corner is greater than the grayscale
value of the center pixel than return true. Bit-shift the first bit with 7. The
-outcome is now 1000000. The second neighbour will be bit-shifted with 6, and so
+outcome is now 1000000. The second neighbour will be bit-shifted with 6, and so
on. Until we are at 0. The result is a binary pattern of the local point just
evaluated.
Now only the edge pixels are a problem, but a simple check if the location of
@@ -348,7 +348,7 @@ scripts is named here and a description is given on what the script does.
\section{Finding parameters}

Now that we have a functioning system, we need to tune it to work properly for
-license plates. This means we need to find the parameters. Throughout the
+license plates. This means we need to find the parameters. Throughout the
program we have a number of parameters for which no standard choice is
available. These parameters are:\\
\\
@@ -373,7 +373,7 @@ The first parameter to decide on, is the $\sigma$ used in the Gaussian blur. To
find this parameter, we tested a few values, by trying them and checking the
results. It turned out that the best value was $\sigma = 1.4$.\\
\\
-Theoretically, this can be explained as follows. The filter has width of
+Theoretically, this can be explained as follows. The filter has width of
$6 * \sigma = 6 * 1.4 = 8.4$ pixels. The width of a `stroke' in a character is,
after our resize operations, around 8 pixels. This means, our filter `matches'
the smallest detail size we want to be able to see, so everything that is
@@ -456,7 +456,7 @@ $2^{-1}$ & 61 & 61 & 61 & 61 & 62 &
92 & 93 & 93 & 86 & 45\\
$2^{7}$ & 61 & 70 & 84 & 90 & 92 &
93 & 93 & 93 & 86 & 45\\
- $2^{9}$ & 70 & 84 & 90 & 92 & 92 &
+ $2^{9}$ & 70 & 84 & 90 & 92 & 92 &
93 & 93 & 93 & 86 & 45\\
$2^{11}$ & 84 & 90 & 92 & 92 & 92 &
92 & 93 & 93 & 86 & 45\\
@@ -494,7 +494,7 @@ good scores per character, and $93\%$ seems to be a fairly good result.\\
\\
Possibilities for improvement of this score would be more extensive
grid-searches, finding more exact values for $c$ and $\gamma$, more tests
-for finding $\sigma$ and more experiments on the size and shape of the
+for finding $\sigma$ and more experiments on the size and shape of the
neighbourhoods.

\subsection{Speed}
@@ -582,7 +582,7 @@ implementation was most suited to be done by one individually or in a pair.

\subsubsection*{Who did what}
Gijs created the basic classes we could use and helped everyone by keeping
-track of what was required to be finished and whom was working on what.
+track of what was required to be finished and whom was working on what.
Tadde\"us and Jayke were mostly working on the SVM and all kinds of tests
whether the histograms were matching, and what parameters had to be used.
Fabi\"en created the functions to read and parse the given xml files with