14 years ago · 5e66b6cb55
--- a/docs/report.tex
+++ b/docs/report.tex
@@ -19,7 +19,7 @@ Gijs van der Voort\\
 
				 Richard Torenvliet\\
			
 
				 Jayke Meijer\\
			
 
				 Tadde\"us Kroes\\
			
 
				-Fabi\'en Tesselaar
			
 
				+Fabi\"en Tesselaar
			
 
				 
			
 
				 \tableofcontents
			
 
				 \pagebreak
			
@@ -71,25 +71,6 @@ defining what problems we have and how we want to solve these.
 
				 \subsection{Extracting a letter and resizing it}
			
 
				 
			
 
				 Rewrite this section once we have implemented this properly.
			
 
				-%NO LONGER VALID!
			
 
				-%Because we are already given the locations of the characters, we only need to
			
 
				-%transform those locations using the same perspective transformation used to
			
 
				-%create a front facing license plate. The next step is to transform the
			
 
				-%characters to a normalized manner. The size of the letter W is used as a
			
 
				-%standard to normalize the width of all the characters, because W is the widest
			
 
				-%character of the alphabet. We plan to also normalize the height of characters,
			
 
				-%the best manner for this is still to be determined.
			
 
				-
			
 
				-%\begin{enumerate}
			
 
				-%    \item Crop the image in such a way that the character precisely fits the
			
 
				-%          image.
			
 
				-%    \item Scale the image to a standard height.
			
 
				-%    \item Extend the image on either the left or right side to a certain width.
			
 
				-%\end{enumerate}
			
 
				-
			
 
				-%The resulting image will always have the same size, the character contained
			
 
				-%will always be of the same height, and the character will always be positioned
			
 
				-%at either the left of right side of the image.
			
 
				 
			
 
				 \subsection{Transformation}
			
 
				 
			
@@ -224,9 +205,10 @@ reader will only get results from this version.
 
				 Now we are only interested in the individual characters so we can skip the
			
 
				 location of the entire license plate. Each character has 
			
 
				 a single character value, indicating what someone thought the letter or
			
 
				-digit was and four coordinates to create a bounding box. If less then four points have been set the character will not be saved. Else, to make things not to
			
 
				-complicated, a Character class is used. It acts as an associative list, but it gives some extra freedom when using the
			
 
				-data.
			
 
				+digit was, and four coordinates to create a bounding box. If fewer than four
			
 
				+points have been set, the character will not be saved. Otherwise, to make things not
			
 
				+too complicated, a Character class is used. It acts as an associative list, but
			
 
				+it gives some extra freedom when using the data.
			
 
				 
			
 
				 When four points have been gathered the data from the actual image is being
			
 
				 requested. For each corner a small margin is added (around 3 pixels) so that no
			
@@ -315,13 +297,38 @@ increasing our performance, so we only have one histogram to feed to the SVM.
 
				 
			
 
				 For the classification, we use a standard Python Support Vector Machine,
			
 
				 \texttt{libsvm}. This is an often-used SVM, and should allow us to simply feed
			
 
				-the data from the LBP and Feature Vector steps into the SVM and receive results.\\
			
 
				+the data from the LBP and Feature Vector steps into the SVM and receive
			
 
				+results.\\
			
 
				 \\
			
 
				 Using an SVM takes two steps. First you have to train the SVM, and then you can
			
 
				 use it to classify data. The training step takes a lot of time, so luckily
			
 
				 \texttt{libsvm} offers us an opportunity to save a trained SVM. This means that
			
 
				 you do not have to train the SVM every time.
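The train-once, reuse-later pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: libsvm's Python interface has its own model save and load functions (`svm_save_model`, `svm_load_model`), whereas this sketch swaps in `pickle` and a placeholder trainer so that it runs without libsvm installed. The names `load_or_train`, `slow_train` and `classifier.dat` are hypothetical.

```python
import os
import pickle
import tempfile


def load_or_train(model_path, train_fn):
    """Return a cached model if one exists, otherwise train and cache it."""
    if os.path.exists(model_path):
        with open(model_path, "rb") as handle:
            return pickle.load(handle)
    model = train_fn()  # the slow step; only runs when no cache exists
    with open(model_path, "wb") as handle:
        pickle.dump(model, handle)
    return model


calls = []


def slow_train():
    """Placeholder for the expensive training step (hypothetical)."""
    calls.append(1)
    return {"labels": ["A", "B"]}  # stand-in for a trained model


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "classifier.dat")
    first = load_or_train(path, slow_train)   # trains and writes the cache
    second = load_or_train(path, slow_train)  # reads the cache, no training
```

Only the first call pays the training cost; the second call reads the stored model from disk.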
			
 
				 
			
 
				+\subsection{Supporting Scripts}
			
 
				+
			
 
				+In order to work with the code, we wrote a number of scripts. Each of these
			
 
				+scripts is named here and a description is given of what the script does.
			
 
				+
			
 
				+\subsection*{\texttt{find\_svm\_params.py}}
			
 
				+
			
 
				+
			
 
				+
			
 
				+\subsection*{\texttt{LearningSetGenerator.py}}
			
 
				+
			
 
				+
			
 
				+
			
 
				+\subsection*{\texttt{load\_characters.py}}
			
 
				+
			
 
				+
			
 
				+
			
 
				+\subsection*{\texttt{load\_learning\_set.py}}
			
 
				+
			
 
				+
			
 
				+
			
 
				+\subsection*{\texttt{run\_classifier.py}}
			
 
				+
			
 
				+
			
 
				 \section{Finding parameters}
			
 
				 
			
 
				 Now that we have a functioning system, we need to tune it to work properly for
			
@@ -348,7 +355,14 @@ value, and what value we decided on.
 
				 
			
 
				 The first parameter to decide on, is the $\sigma$ used in the Gaussian blur. To
			
 
				 find this parameter, we tested a few values, by trying them and checking the
			
 
				-results. It turned out that the best value was $\sigma = 1.4$.
			
 
				+results. It turned out that the best value was $\sigma = 1.4$.\\
			
 
				+\\
			
 
				+Theoretically, this can be explained as follows. The filter has a width of
			
 
				+$6 \cdot \sigma = 6 \cdot 1.4 = 8.4$ pixels. The width of a `stroke' in a character is,
			
 
				+after our resize operations, around 8 pixels. This means that our filter `matches'
			
 
				+the smallest detail size we want to be able to see, so everything that is
			
 
				+smaller is properly suppressed, yet it retains the details we do want to keep,
			
 
				+being everything that is part of the character.
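The width argument above can be checked with a short sketch. It assumes the filter is cut off at three standard deviations on either side, which is what a width of six times sigma implies; how the project's own blur truncates its kernel is not stated in the text, and the function name `gaussian_kernel_1d` is hypothetical.

```python
import math


def gaussian_kernel_1d(sigma, truncate=3.0):
    """Return a normalized 1D Gaussian kernel covering +/- truncate * sigma."""
    radius = int(math.ceil(truncate * sigma))  # whole pixels on each side
    weights = [
        math.exp(-(x * x) / (2.0 * sigma * sigma))
        for x in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


sigma = 1.4
support = 6 * sigma                 # continuous filter width in pixels (about 8.4)
kernel = gaussian_kernel_1d(sigma)  # discrete kernel, rounded up to whole pixels
stroke_width = 8                    # approximate stroke width from the text
```

On a pixel grid the kernel is rounded up to an odd number of taps, so the discrete filter is slightly wider than the continuous 8.4 pixels while remaining close to the stroke width.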
			
 
				 
			
 
				 \subsection{Parameter \emph{cell size}}
			
 
				 
			
@@ -456,8 +470,8 @@ there.\\
 
				 \\
			
 
				 The speed of a classification turned out to be reasonably good. We measure from
			
 
				 the moment a character has been `cut out' of the image, so we have an exact
			
 
				-image of a character, to the moment where the SVM tells us what character it is.
			
 
				-This time is on average $65$ ms. That means that this
			
 
				+image of a character, to the moment where the SVM tells us what character it
			
 
				+is. This time is on average $65$ ms. That means that this
			
 
				 technique (tested on an AMD Phenom II X4 955 Quad core CPU running at 3.2 GHz)
			
 
				 can identify 15 characters per second.\\
			
 
				 \\
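The step from an average latency to a throughput figure can be sketched as below. This is an illustration only: `classify` stands for whatever function turns a cut-out character image into a label, and both helper names are hypothetical rather than taken from the project.

```python
import time


def average_latency_ms(classify, samples):
    """Average wall-clock time per call to classify, in milliseconds."""
    start = time.perf_counter()
    for sample in samples:
        classify(sample)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(samples)


def throughput_per_second(latency_ms):
    """Number of classifications per second at the given average latency."""
    return 1000.0 / latency_ms


# With the 65 ms average reported in the text:
rate = throughput_per_second(65)  # about 15.4, i.e. 15 whole characters per second
```

Timing many samples and averaging, rather than timing a single call, reduces the influence of one-off effects such as caching or interpreter warm-up.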