14 ani în urmă · 7756e8e1da
--- a/docs/report.tex
+++ b/docs/report.tex
@@ -45,38 +45,39 @@ in classifying characters on a license plate.
 
															 In short our program must be able to do the following:
														
 
															 \begin{enumerate}
														
 
															-    \item Extracting characters using the location points in the xml file.
														
 
															+    \item Extract characters using the location points in the xml file.
														
 
															     \item Reduce noise where possible to ensure maximum readability.
														
 
															-    \item Transforming a character to a normal form.
														
 
															-    \item Creating a local binary pattern histogram vector.
														
 
															-    \item Matching the found vector with a learning set.
														
 
															-    \item And finally it has to check results with a real data set.
														
 
															+    \item Transform a character to a normal form.
														
 
															+    \item Create a local binary pattern histogram vector.
														
 
															+    \item Recognize the character value of a vector using a classifier.
														
 
															+    \item Determine the performance of the classifier with a given test set.
														
 
															 \end{enumerate}
														
 
															 \section{Language of choice}
														
 
															 The actual purpose of this project is to check if LBP is capable of recognizing
														
 
															-license plate characters. We knew the LBP implementation would be pretty
														
 
															-simple. Thus an advantage had to be its speed compared with other license plate
														
 
															-recognition implementations, but the uncertainty of whether we could get some
														
 
															-results made us pick Python. We felt Python would not restrict us as much in
														
 
															-assigning tasks to each member of the group. In addition, when using the
														
 
															-correct modules to handle images, Python can be decent in speed.
														
 
															+license plate characters. Since the LBP algorithm is fairly simple to
														
 
															+implement, it should have a good performance in comparison to other license
														
 
															+plate recognition implementations if implemented in C. However, we decided to
														
 
															+focus on functionality rather than speed. Therefore, we picked Python. We felt
														
 
															+Python would not restrict us as much in assigning tasks to each member of the
														
 
															+group. In addition, when using the correct modules to handle images, Python can
														
 
															+be decent in speed.
														
 
															 \section{Theory}
														
 
															 Now we know what our program has to be capable of, we can start with the
														
 
															-defining what problems we have and how we want to solve these.
														
 
															+defining the problems we have and how we are planning to solve these.
														
 
															 \subsection{Extracting a letter and resizing it}
														
 
															-Rewrite this section once we have implemented this properly.
														
 
															+% TODO: Rewrite this section once we have implemented this properly.
														
 
															 \subsection{Transformation}
														
 
															 A simple perspective transformation will be sufficient to transform and resize
														
 
															 the characters to a normalized format. The corner positions of characters in
														
 
															-the dataset are supplied together with the dataset.
														
 
															+the dataset are provided together with the dataset.
														
 
															 \subsection{Reducing noise}
														
@@ -92,76 +93,80 @@ part of the license plate remains readable.
 
															 \subsection{Local binary patterns}
														
 
															 Once we have separate digits and characters, we intent to use Local Binary
														
 
															-Patterns (Ojala, Pietikäinen \& Harwood, 1994) to determine what character
														
 
															-or digit we are dealing with. Local Binary
														
 
															-Patterns are a way to classify a texture based on the distribution of edge
														
 
															-directions in the image. Since letters on a license plate consist mainly of
														
 
															-straight lines and simple curves, LBP should be suited to identify these.
														
 
															+Patterns (Ojala, Pietikäinen \& Harwood, 1994) to determine what character or
														
 
															+digit we are dealing with. Local Binary Patterns are a way to classify a
														
 
															+texture based on the distribution of edge directions in the image. Since
														
 
															+letters on a license plate consist mainly of straight lines and simple curves,
														
 
															+LBP should be suited to identify these.
														
 
															 \subsubsection{LBP Algorithm}
														
 
															 The LBP algorithm that we implemented can use a variety of neighbourhoods,
														
 
															-including the same square pattern that is introduced by Ojala et al (1994),
														
 
															-and a circular form as presented by Wikipedia.
														
 
															-\begin{itemize}
														
 
															+including the same square pattern that is introduced by Ojala et al (1994), and
														
 
															+a circular form as presented by Wikipedia.
														
 
															+
														
 
															+\begin{enumerate}
														
 
															+
														
 
															 \item Determine the size of the square where the local patterns are being
														
 
															 registered. For explanation purposes let the square be 3 x 3. \\
														
 
															-\item The grayscale value of the middle pixel is used as threshold. Every
														
 
															-value of the pixel around the middle pixel is evaluated. If it's value is
														
 
															-greater than the threshold it will be become a one else a zero.
														
 
															+
														
 
															+\item The grayscale value of the center pixel is used as threshold. Every value
														
 
															+of the pixel around the center pixel is evaluated. If it's value is greater
														
 
															+than the threshold it will be become a one, otherwise it will be a zero.
														
 
															 \begin{figure}[H]
														
 
															-\center
														
 
															-\includegraphics[scale=0.5]{lbp.png}
														
 
															-\caption{LBP 3 x 3 (Pietik\"ainen, Hadid, Zhao \& Ahonen (2011))}
														
 
															+    \center
														
 
															+    \includegraphics[scale=0.5]{lbp.png}
														
 
															+    \caption{LBP 3 x 3 (Pietik\"ainen, Hadid, Zhao \& Ahonen (2011))}
														
 
															 \end{figure}
														
 
															-Notice that the pattern will be come of the form 01001110. This is done when a
														
 
															-the value of the evaluated pixel is greater than the threshold, shift the bit
														
 
															-by the n(with i=i$_{th}$ pixel evaluated, starting with $i=0$).
														
 
															+The pattern will be an 8-bit integer. This is accomplished by shifting the
														
 
															+boolean value of each comparison one to seven places to the left.
														
 
															 This results in a mathematical expression:
														
 
															-Let I($x_i, y_i$) an Image with grayscale values and $g_n$ the grayscale value
														
 
															-of the pixel $(x_i, y_i)$. Also let $s(g_i, g_c)$ (see below) with $g_c$ =
														
 
															-grayscale value of the center pixel and $g_i$ the grayscale value of the pixel
														
 
															-to be evaluated.
														
 
															+Let I($x_i, y_i$) be a grayscale Image and $g_n$ the value of the pixel $(x_i,
														
 
															+y_i)$. Also let $s(g_i, g_c)$ (see below) with $g_c$ being the value of the
														
 
															+center pixel and $g_i$ the grayscale value of the pixel to be evaluated.
														
 
															 $$
														
 
															-  s(g_i, g_c) = \left\{
														
 
															-  \begin{array}{l l}
														
 
															-    1 & \quad \text{if $g_i$ $\geq$ $g_c$}\\
														
 
															-    0 & \quad \text{if $g_i$ $<$ $g_c$}\\
														
 
															-  \end{array} \right.
														
 
															+    s(g_i, g_c) = \left \{
														
 
															+    \begin{array}{l l}
														
 
															+        1 & \quad \text{if $g_i$ $\geq$ $g_c$}\\
														
 
															+        0 & \quad \text{if $g_i$ $<$ $g_c$}\\
														
 
															+    \end{array} \right.
														
 
															 $$
														
 
															-$$LBP_{n, g_c = (x_c, y_c)} = \sum\limits_{i=0}^{n-1} s(g_i, g_c)^{2i} $$
														
 
															+$$LBP_{n, g_c = (x_c, y_c)} = \sum\limits_{i=0}^{n-1} s(g_i, g_c) \cdot 2^i$$
														
 
															-The outcome of this operations will be a binary pattern.
														
 
															+The outcome of this operations will be a binary pattern. Note that the
														
 
															+mathematical expression has the same effect as the bit shifting operation that
														
 
															+we defined earlier.
														
 
															-\item Given this pattern, the next step is to divide the pattern in cells. The
														
 
															-amount of cells depends on the quality of the result, so trial and error is in
														
 
															-order. Starting with dividing the pattern in to cells of size 16.
														
 
															+\item Given this pattern, the next step is to divide the pattern into cells.
														
 
															+The amount of cells depends on the quality of the result, which we plan to
														
 
															+determine by trial and error. We will start by dividing the pattern into cells
														
 
															+of size 16, which is a common value according to Wikipedia.
														
 
															 \item Compute a histogram for each cell.
														
 
															 \begin{figure}[H]
														
 
															-\center
														
 
															-\includegraphics[scale=0.7]{cells.png}
														
 
															-\caption{Divide in cells(Pietik\"ainen et all (2011))}
														
 
															+    \center
														
 
															+    \includegraphics[scale=0.7]{cells.png}
														
 
															+    \caption{Divide into cells (Pietik\"ainen et all (2011))}
														
 
															 \end{figure}
														
 
															 \item Consider every histogram as a vector element and concatenate these. The
														
 
															 result is a feature vector of the image.
														
 
															-\item Feed these vectors to a support vector machine. This will ''learn'' which
														
 
															-vector indicates what vector is which character.
														
 
															+\item Feed these vectors to a support vector machine. The SVM will ``learn''
														
 
															+which vectors to associate with a character.
														
 
															-\end{itemize}
														
 
															+\end{enumerate}
														
 
															 To our knowledge, LBP has yet not been used in this manner before. Therefore,
														
 
															 it will be the first thing to implement, to see if it lives up to the
														
 
															-expectations. When the proof of concept is there, it can be used in a final
														
 
															-program.
														
 
															+expectations. When the proof of concept is there, it can be used in a final,
														
 
															+more efficient program.
														
 
															 Later we will show that taking a histogram over the entire image (basically
														
 
															 working with just one cell) gives us the best results.
														
@@ -169,17 +174,17 @@ working with just one cell) gives us the best results.
 
															 \subsection{Matching the database}
														
 
															 Given the LBP of a character, a Support Vector Machine can be used to classify
														
 
															-the character to a character in a learning set. The SVM uses a concatenation
														
 
															-of each cell in an image as a feature vector (in the case we check the entire
														
 
															-image no concatenation has to be done of course. The SVM can be trained with a
														
 
															-subset of the given dataset called the ''Learning set''. Once trained, the
														
 
															-entire classifier can be saved as a Pickle object\footnote{See
														
 
															+the character to a character in a learning set. The SVM uses the concatenation
														
 
															+of the histograms of all cells in an image as a feature vector (in the case we
														
 
															+check the entire image no concatenation has to be done of course. The SVM can
														
 
															+be trained with a subset of the given dataset called the ``learning set''. Once
														
 
															+trained, the entire classifier can be saved as a Pickle object\footnote{See
														
 
															 \url{http://docs.python.org/library/pickle.html}} for later usage.
														
 
															 \section{Implementation}
														
 
															-In this section we will describe our implementations in more detail, explaining
														
 
															-choices we made.
														
 
															+In this section we will describe our implementation in more detail, explaining
														
 
															+the choices we made in the process.
														
 
															 \subsection{Character retrieval}