View source code

Merge branch 'master' of github.com:taddeus/licenseplates

Richard Torenvliet, 14 years ago
Parent commit ad483f62dc
1 changed file with 40 additions and 38 deletions

docs/report.tex (+40 −38)

@@ -162,20 +162,19 @@ The outcome of this operations will be a binary pattern. Note that the
 mathematical expression has the same effect as the bit shifting operation that
 we defined earlier.
 
-\item Given this pattern, the next step is to divide the pattern in cells. The
-amount of cells depends on the quality of the result, so trial and error is in
-order. Starting with dividing the pattern in to cells of size 16.
+\item Given this pattern for each pixel, the next step is to divide the image
+into cells.
 
 \item Compute a histogram for each cell.
 
 \begin{figure}[H]
     \center
     \includegraphics[scale=0.7]{cells.png}
-    \caption{Divide in cells(Pietik\"ainen et all (2011))}
+    \caption{Divide into cells (Pietik\"ainen et al. (2011))}
 \end{figure}
 
-\item Consider every histogram as a vector element and concatenate these. The
-result is a feature vector of the image.
+\item Consider every histogram a vector element and concatenate all histograms.
+The concatenation is the feature vector of the image.
 
 \item Feed these vectors to a support vector machine. The SVM will ``learn''
 which vectors to associate with a character.
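The LBP pipeline in the list above (threshold the eight neighbours into a bit pattern, divide the image into cells, histogram each cell, concatenate) can be sketched in plain Python. This is an illustrative sketch, not the repository's actual code; the function names are made up here, and the default cell size of 16 follows the paragraph removed in this hunk.

```python
def lbp_image(img):
    """Compute the 8-neighbour local binary pattern for each interior
    pixel of a grey-value image (list of lists). Border pixels are
    skipped for simplicity and keep pattern 0."""
    h, w = len(img), len(img[0])
    # Clockwise offsets of the 8 neighbours, starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    patterns = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright.
                if img[y + dy][x + dx] >= center:
                    code |= 1 << bit
            patterns[y][x] = code
    return patterns


def feature_vector(patterns, cell_size=16):
    """Divide the pattern image into cells, build a 256-bin histogram
    per cell, and concatenate the histograms into one feature vector."""
    h, w = len(patterns), len(patterns[0])
    vector = []
    for cy in range(0, h, cell_size):
        for cx in range(0, w, cell_size):
            hist = [0] * 256
            for y in range(cy, min(cy + cell_size, h)):
                for x in range(cx, min(cx + cell_size, w)):
                    hist[patterns[y][x]] += 1
            vector.extend(hist)
    return vector
```

The concatenated vector is what the next step would feed into the SVM.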
@@ -329,25 +328,28 @@ For the classification, we use a standard Python Support Vector Machine,
 \texttt{libsvm}. This is an often used SVM, and should allow us to simply feed
 data from the LBP and Feature Vector steps into the SVM and receive results.
 
-Using a SVM has two steps. First, the SVM has to be trained, and then it can be
-used to classify data. The training step takes a lot of time, but luckily
-\texttt{libsvm} offers us an opportunity to save a trained SVM. This means that
-the SVM only has to be changed once.
-
-We have decided to only include a character in the system if the SVM can be
-trained with at least 70 examples. This is done automatically, by splitting the
-data set in a learning set and a test set, where the first 70 examples of a
-character are added to the learning set, and all the following examples are
-added to the test set. Therefore, if there are not enough examples, all
-available examples end up in the learning set, and non of these characters end
-up in the test set, thus they do not decrease our score. However, if this
-character later does get offered to the system, the training is as good as
-possible, since it is trained with all available characters.
+Using an SVM can be divided into two steps. First, the SVM has to be trained
+before it can be used to classify data. The training step takes a lot of time,
+but luckily \texttt{libsvm} offers us an opportunity to save a trained SVM.
+This means that the SVM only has to be created once, and can be saved for
+later usage.
+
+We have decided only to include a character in the system if the SVM can be
+trained with at least 70 examples. This is done automatically, by splitting
+the data set into a learning set and a test set, where the first 70
+occurrences of a character are added to the learning set, and all the
+following are added to the test set. Therefore, if there are not enough
+examples, all available occurrences end up in the learning set, and none of
+these characters end up in the test set. Thus, they do not decrease our
+score. If such a character were offered to the system (which it will not be
+in our own test program), the SVM will recognize it as well as possible
+because all occurrences are in the learning set.
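The learning/test split described in the paragraph above can be illustrated with a short Python sketch; the helper name and data layout are hypothetical, not taken from the repository.

```python
def split_data(examples, limit=70):
    """Split (character, feature_vector) pairs into a learning set and
    a test set: the first `limit` occurrences of each character go to
    the learning set, all following occurrences go to the test set."""
    counts = {}
    learning, test = [], []
    for char, vec in examples:
        counts[char] = counts.get(char, 0) + 1
        if counts[char] <= limit:
            # Characters with fewer than `limit` occurrences end up
            # entirely in the learning set and never in the test set.
            learning.append((char, vec))
        else:
            test.append((char, vec))
    return learning, test
```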
 
 \subsection{Supporting Scripts}
 
-In order to work with the code, we wrote a number of scripts. Each of these
-scripts is named here and a description is given on what the script does.
+To be able to use the code efficiently, we wrote a number of scripts. This
+section describes the purpose and usage of each script.
 
 \subsection*{\texttt{create\_characters.py}}
 
@@ -378,18 +380,18 @@ scripts is named here and a description is given on what the script does.
 Now that we have a functioning system, we need to tune it to work properly for
 license plates. This means we need to find the parameters. Throughout the
 program we have a number of parameters for which no standard choice is
-available. These parameters are:\\
-\\
+available. These parameters are:
+
 \begin{tabular}{l|l}
-	Parameter 			& Description\\
+	Parameter 			& Description \\
 	\hline
-	$\sigma$  			& The size of the Gaussian blur.\\
+	$\sigma$  			& The size of the Gaussian blur. \\
 	\emph{cell size}	& The size of a cell for which a histogram of LBP's
-	                      will be generated.\\
-	\emph{Neighbourhood}& The neighbourhood to use for creating the LBP.\\
-	$\gamma$			& Parameter for the Radial kernel used in the SVM.\\
+	                      will be generated. \\
+	\emph{Neighbourhood}& The neighbourhood to use for creating the LBP. \\
+	$\gamma$			& Parameter for the Radial kernel used in the SVM. \\
 	$c$					& The soft margin of the SVM. Allows how much training
-						  errors are accepted.\\
+						  errors are accepted. \\
 \end{tabular}
 
 For each of these parameters, we will describe how we searched for a good
@@ -446,7 +448,7 @@ reached with the following neighbourhood, which we will call the
 \subsection{Parameters $\gamma$ \& $c$}
 
 The parameters $\gamma$ and $c$ are used for the SVM. $c$ is a standard
-parameter for each type of SVM, called the 'soft margin'. This indicates how
+parameter for each type of SVM, called the `soft margin'. This indicates how
 exact each element in the learning set should be taken. A large soft margin
 means that an element in the learning set that accidentally has a completely
 different feature vector than expected, due to noise for example, is not taken
@@ -463,7 +465,7 @@ the highest score is then used as our parameters, and the entire SVM will be
 trained using those parameters.
 
 The results of this grid-search are shown in the following table. The values
-in the table are rounded percentages, for easy displaying.
+in the table are rounded percentages, for better readability.
 
 \begin{tabular}{|r|r r r r r r r r r r|}
 \hline
@@ -493,10 +495,10 @@ $2^{13}$ &       90 &       92 &       92 &       92 &       92 &
 $2^{15}$ &       92 &       92 &       92 &       92 &       92 &
        92 &       93 &       93 &       86 &       45\\
 \hline
-\end{tabular}
+\end{tabular} \\
 
-We found that the best values for these parameters are $c = 32$ and
-$\gamma = 0.125$.
+The grid-search shows that the best values for these parameters are $c = 2^5 =
+32$ and $\gamma = 2^{-3} = 0.125$.
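The grid-search over exponentially spaced values of $c$ and $\gamma$ can be sketched as follows. `evaluate` is a hypothetical stand-in for training the SVM with the given parameters and scoring it on the test set, and the exponent ranges are assumptions chosen to cover the table above; neither is taken from the repository's code.

```python
def grid_search(evaluate, c_exps=range(-5, 16, 2), gamma_exps=range(-15, 4, 2)):
    """Try every (c, gamma) pair on an exponential grid and return the
    best-scoring pair together with its score.

    `evaluate(c, gamma)` must return a comparable score, e.g. the
    recognition percentage on the test set."""
    best_score, best_params = float('-inf'), None
    for c_exp in c_exps:
        for gamma_exp in gamma_exps:
            c, gamma = 2.0 ** c_exp, 2.0 ** gamma_exp
            score = evaluate(c, gamma)
            if score > best_score:
                best_score, best_params = score, (c, gamma)
    return best_params, best_score
```

With the scores from the table, this search would settle on $c = 2^5 = 32$ and $\gamma = 2^{-3} = 0.125$.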
 
 \section{Results}
 
@@ -535,9 +537,9 @@ there.
 The speed of a classification turned out to be reasonably good. We time between
 the moment a character has been 'cut out' of the image, so we have an exact
 image of a character, to the moment where the SVM tells us what character it
-is. This time is on average $65$ ms. That means that this
-technique (tested on an AMD Phenom II X4 955 CPU running at 3.2 GHz)
-can identify 15 characters per second.
+is. This time is on average $65$~ms. That means that this technique (tested on
+an AMD Phenom II X4 955 CPU running at 3.2 GHz) can identify 15 characters per
+second.
 
 This is not spectacular considering the amount of calculating power this CPU
 can offer, but it is still fairly reasonable. Of course, this program is