14 years ago · 7756e8e1da
--- a/docs/report.tex
+++ b/docs/report.tex
@@ -45,38 +45,39 @@ in classifying characters on a license plate.
 
				 In short, our program must be able to do the following:
			
 
				 
			
 
				 \begin{enumerate}
			
 
				-    \item Extracting characters using the location points in the xml file.
			
 
				+    \item Extract characters using the location points in the XML file.
			
 
				     \item Reduce noise where possible to ensure maximum readability.
			
 
				-    \item Transforming a character to a normal form.
			
 
				-    \item Creating a local binary pattern histogram vector.
			
 
				-    \item Matching the found vector with a learning set.
			
 
				-    \item And finally it has to check results with a real data set.
			
 
				+    \item Transform a character to a normal form.
			
 
				+    \item Create a local binary pattern histogram vector.
			
 
				+    \item Recognize the character value of a vector using a classifier.
			
 
				+    \item Determine the performance of the classifier with a given test set.
			
 
				 \end{enumerate}
			
 
				 
			
 
				 \section{Language of choice}
			
 
				 
			
 
				 The actual purpose of this project is to check if LBP is capable of recognizing
			
 
				-license plate characters. We knew the LBP implementation would be pretty
			
 
				-simple. Thus an advantage had to be its speed compared with other license plate
			
 
				-recognition implementations, but the uncertainty of whether we could get some
			
 
				-results made us pick Python. We felt Python would not restrict us as much in
			
 
				-assigning tasks to each member of the group. In addition, when using the
			
 
				-correct modules to handle images, Python can be decent in speed.
			
 
				+license plate characters. Since the LBP algorithm is fairly simple to
			
 
				+implement, it should perform well in comparison to other license
			
 
				+plate recognition implementations if implemented in C. However, we decided to
			
 
				+focus on functionality rather than speed. Therefore, we picked Python. We felt
			
 
				+Python would not restrict us as much in assigning tasks to each member of the
			
 
				+group. In addition, when using the correct modules to handle images, Python can
			
 
				+be decent in speed.
			
 
				 
			
 
				 \section{Theory}
			
 
				 
			
 
				 Now that we know what our program has to be capable of, we can start
			
 
				-defining what problems we have and how we want to solve these.
			
 
				+defining the problems we have and how we are planning to solve them.
			
 
				 
			
 
				 \subsection{Extracting a letter and resizing it}
			
 
				 
			
 
				-Rewrite this section once we have implemented this properly.
			
 
				+% TODO: Rewrite this section once we have implemented this properly.
			
 
				 
			
 
				 \subsection{Transformation}
			
 
				 
			
 
				 A simple perspective transformation will be sufficient to transform and resize
			
 
				 the characters to a normalized format. The corner positions of characters in
			
 
				-the dataset are supplied together with the dataset.
			
 
				+the dataset are provided together with the dataset.
			
 
				 
			
 
				 \subsection{Reducing noise}
			
 
				 
			
@@ -92,76 +93,80 @@ part of the license plate remains readable.
 
				 
			
 
				 \subsection{Local binary patterns}
			
 
				 Once we have separate digits and characters, we intend to use Local Binary
			
 
				-Patterns (Ojala, Pietikäinen \& Harwood, 1994) to determine what character
			
 
				-or digit we are dealing with. Local Binary
			
 
				-Patterns are a way to classify a texture based on the distribution of edge
			
 
				-directions in the image. Since letters on a license plate consist mainly of
			
 
				-straight lines and simple curves, LBP should be suited to identify these.
			
 
				+Patterns (Ojala, Pietikäinen \& Harwood, 1994) to determine what character or
			
 
				+digit we are dealing with. Local Binary Patterns are a way to classify a
			
 
				+texture based on the distribution of edge directions in the image. Since
			
 
				+letters on a license plate consist mainly of straight lines and simple curves,
			
 
				+LBP should be suited to identify these.
			
 
				 
			
 
				 \subsubsection{LBP Algorithm}
			
 
				 The LBP algorithm that we implemented can use a variety of neighbourhoods,
			
 
				-including the same square pattern that is introduced by Ojala et al (1994),
			
 
				-and a circular form as presented by Wikipedia.
			
 
				-\begin{itemize}
			
 
				+including the same square pattern that is introduced by Ojala et al. (1994), and
			
 
				+a circular form as presented by Wikipedia.
			
 
				+
			
 
				+\begin{enumerate}
			
 
				+
			
 
				 \item Determine the size of the square where the local patterns are being
			
 
				 registered. For explanation purposes let the square be 3 x 3. \\
			
 
				-\item The grayscale value of the middle pixel is used as threshold. Every
			
 
				-value of the pixel around the middle pixel is evaluated. If it's value is
			
 
				-greater than the threshold it will be become a one else a zero.
			
 
				+
			
 
				+\item The grayscale value of the center pixel is used as threshold. The value
			
 
				+of every pixel around the center pixel is evaluated. If its value is greater
			
 
				+than or equal to the threshold it becomes a one, otherwise a zero.
			
 
				 
			
 
				 \begin{figure}[H]
			
 
				-\center
			
 
				-\includegraphics[scale=0.5]{lbp.png}
			
 
				-\caption{LBP 3 x 3 (Pietik\"ainen, Hadid, Zhao \& Ahonen (2011))}
			
 
				+    \centering
			
 
				+    \includegraphics[scale=0.5]{lbp.png}
			
 
				+    \caption{LBP 3 x 3 (Pietik\"ainen, Hadid, Zhao \& Ahonen (2011))}
			
 
				 \end{figure}
			
 
				 
			
 
				-Notice that the pattern will be come of the form 01001110. This is done when a
			
 
				-the value of the evaluated pixel is greater than the threshold, shift the bit
			
 
				-by the n(with i=i$_{th}$ pixel evaluated, starting with $i=0$).
			
 
				+The pattern will be an 8-bit integer. This is accomplished by shifting the
			
 
				+boolean value of each comparison zero to seven places to the left and summing
			
 
				+the results.
			
 
				 
			
 
				 This results in a mathematical expression:
			
 
				 
			
 
				-Let I($x_i, y_i$) an Image with grayscale values and $g_n$ the grayscale value
			
 
				-of the pixel $(x_i, y_i)$. Also let $s(g_i, g_c)$ (see below) with $g_c$ =
			
 
				-grayscale value of the center pixel and $g_i$ the grayscale value of the pixel
			
 
				-to be evaluated.
			
 
				+Let $I(x_i, y_i)$ be a grayscale image and $g_i$ the value of the pixel $(x_i,
			
 
				+y_i)$. Also let $s(g_i, g_c)$ be defined as below, with $g_c$ being the value
			
 
				+of the center pixel and $g_i$ the value of the pixel to be evaluated.
			
 
				 
			
 
				 $$
			
 
				-  s(g_i, g_c) = \left\{
			
 
				-  \begin{array}{l l}
			
 
				-    1 & \quad \text{if $g_i$ $\geq$ $g_c$}\\
			
 
				-    0 & \quad \text{if $g_i$ $<$ $g_c$}\\
			
 
				-  \end{array} \right.
			
 
				+    s(g_i, g_c) = \left \{
			
 
				+    \begin{array}{l l}
			
 
				+        1 & \quad \text{if $g_i$ $\geq$ $g_c$}\\
			
 
				+        0 & \quad \text{if $g_i$ $<$ $g_c$}\\
			
 
				+    \end{array} \right.
			
 
				 $$
			
 
				 
			
 
				-$$LBP_{n, g_c = (x_c, y_c)} = \sum\limits_{i=0}^{n-1} s(g_i, g_c)^{2i} $$
			
 
				+$$LBP_{n, g_c = (x_c, y_c)} = \sum\limits_{i=0}^{n-1} s(g_i, g_c) \cdot 2^i$$
			
 
				 
			
 
				-The outcome of this operations will be a binary pattern.
			
 
				+The outcome of this operation will be a binary pattern. Note that the
			
 
				+mathematical expression has the same effect as the bit shifting operation that
			
 
				+we defined earlier.
			
 
				 
			
 
				-\item Given this pattern, the next step is to divide the pattern in cells. The
			
 
				-amount of cells depends on the quality of the result, so trial and error is in
			
 
				-order. Starting with dividing the pattern in to cells of size 16.
			
 
				+\item Given this pattern, the next step is to divide the pattern into cells.
			
 
				+The number of cells depends on the quality of the result, which we plan to
			
 
				+determine by trial and error. We will start by dividing the pattern into cells
			
 
				+of size 16, which is a common value according to Wikipedia.
			
 
				 
			
 
				 \item Compute a histogram for each cell.
			
 
				 
			
 
				 \begin{figure}[H]
			
 
				-\center
			
 
				-\includegraphics[scale=0.7]{cells.png}
			
 
				-\caption{Divide in cells(Pietik\"ainen et all (2011))}
			
 
				+    \centering
			
 
				+    \includegraphics[scale=0.7]{cells.png}
			
 
				+    \caption{Divide into cells (Pietik\"ainen et al. (2011))}
			
 
				 \end{figure}
			
 
				 
			
 
				 \item Consider every histogram as a vector element and concatenate these. The
			
 
				 result is a feature vector of the image.
			
 
				 
			
 
				-\item Feed these vectors to a support vector machine. This will ''learn'' which
			
 
				-vector indicates what vector is which character.
			
 
				+\item Feed these vectors to a support vector machine. The SVM will ``learn''
			
 
				+which vectors to associate with a character.
			
 
				 
			
 
				-\end{itemize}
			
 
				+\end{enumerate}
			
 
				 
			
 
				 To our knowledge, LBP has not yet been used in this manner. Therefore,
			
 
				 it will be the first thing to implement, to see if it lives up to the
			
 
				-expectations. When the proof of concept is there, it can be used in a final
			
 
				-program.
			
 
				+expectations. When the proof of concept is there, it can be used in a final,
			
 
				+more efficient program.
			
 
				 
			
 
				 Later we will show that taking a histogram over the entire image (basically
			
 
				 working with just one cell) gives us the best results.
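
Note on the hunk above: the thresholding and bit-shifting steps of the LBP algorithm can be illustrated with a small sketch. The following is a minimal pure-Python example for the 3 x 3 square neighbourhood and the single-cell histogram mentioned in the text. It is not taken from the project code: the function names, the neighbour ordering (which neighbour maps to which bit) and the 256-bin histogram layout are assumptions made for illustration only.

```python
# Minimal sketch of the 3 x 3 LBP step and a one-cell histogram.
# All names below are hypothetical and chosen for this example.

# Offsets (dy, dx) of the eight neighbours. The ordering, and therefore
# the bit each neighbour contributes to, is an arbitrary assumption.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_value(image, y, x):
    """Return the 8-bit LBP code of the pixel at row y, column x."""
    center = image[y][x]
    code = 0
    for i, (dy, dx) in enumerate(NEIGHBOURS):
        # s(g_i, g_c): 1 if the neighbour is >= the center value, else 0
        s = 1 if image[y + dy][x + dx] >= center else 0
        # Shifting s by i places is the same as multiplying by 2**i
        code += s << i
    return code


def lbp_histogram(image):
    """Count LBP codes over all interior pixels (one cell, 256 bins)."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            hist[lbp_value(image, y, x)] += 1
    return hist


if __name__ == "__main__":
    patch = [[6, 5, 2],
             [7, 6, 1],
             [9, 8, 7]]
    print(lbp_value(patch, 1, 1))
```

Border pixels are skipped here because they lack a full neighbourhood; a real implementation might pad the image instead. With several cells, one such histogram would be computed per cell and the lists concatenated into the feature vector.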
			
@@ -169,17 +174,17 @@ working with just one cell) gives us the best results.
 
				 \subsection{Matching the database}
			
 
				 
			
 
				 Given the LBP of a character, a Support Vector Machine can be used to classify
			
 
				-the character to a character in a learning set. The SVM uses a concatenation
			
 
				-of each cell in an image as a feature vector (in the case we check the entire
			
 
				-image no concatenation has to be done of course. The SVM can be trained with a
			
 
				-subset of the given dataset called the ''Learning set''. Once trained, the
			
 
				-entire classifier can be saved as a Pickle object\footnote{See
			
 
				+the character to a character in a learning set. The SVM uses the concatenation
			
 
				+of the histograms of all cells in an image as a feature vector (in the case we
			
 
				+check the entire image, no concatenation has to be done). The SVM can
			
 
				+be trained with a subset of the given dataset called the ``learning set''. Once
			
 
				+trained, the entire classifier can be saved as a Pickle object\footnote{See
			
 
				 \url{http://docs.python.org/library/pickle.html}} for later usage.
			
 
				 
			
 
				 \section{Implementation}
			
 
				 
			
 
				-In this section we will describe our implementations in more detail, explaining
			
 
				-choices we made.
			
 
				+In this section we will describe our implementation in more detail, explaining
			
 
				+the choices we made in the process.
			
 
				 
			
 
				 \subsection{Character retrieval}