Worked on Classifier section in report.

Taddeus Kroes 14 years ago
parent
commit bda02c1dea
1 changed file with 18 additions and 19 deletions

+ 18 - 19
docs/report.tex

@@ -175,8 +175,7 @@ working with just one cell) gives us the best results.
 
 Given the LBP of a character, a Support Vector Machine can be used to classify
 the character to a character in a learning set. The SVM uses the concatenation
-of the histograms of all cells in an image as a feature vector (in the case we
-check the entire image no concatenation has to be done of course. The SVM can
+of the histograms of all cells in an image as a feature vector. The SVM can
 be trained with a subset of the given dataset called the ``learning set''. Once
 trained, the entire classifier can be saved as a Pickle object\footnote{See
 \url{http://docs.python.org/library/pickle.html}} for later usage.
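
Editorial illustration (not part of the commit): the hunk above says the trained classifier can be saved as a Pickle object for later use. A minimal sketch of that round trip might look like the following. The class `TrainedClassifier`, the helper names, and the file name are hypothetical stand-ins, not the report's actual code; only `pickle.dump` and `pickle.load` come from the standard library.

```python
import os
import pickle
import tempfile


class TrainedClassifier:
    """Hypothetical stand-in for a trained classifier (not the report's class)."""

    def __init__(self, labels):
        self.labels = labels


def save_classifier(classifier, path):
    # Serialize the whole object to disk; pickle needs a binary file handle.
    with open(path, "wb") as handle:
        pickle.dump(classifier, handle)


def load_classifier(path):
    # Only unpickle files you created yourself: loading can execute code.
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Assumed file name, written to a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "classifier.pickle")
save_classifier(TrainedClassifier(["A", "B", "7"]), path)
restored = load_classifier(path)
```

Saving the object once and reloading it is what lets a later run skip the slow training step.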
@@ -195,7 +194,7 @@ stored in XML files. So, the first step is to read these XML files.
 
 \paragraph*{XML reader}
 
-The XML reader will return a 'license plate' object when given an XML file. The
+The XML reader will return a `license plate' object when given an XML file. The
 licence plate holds a list of, up to six, NormalizedImage characters and from
 which country the plate is from. The reader is currently assuming the XML file
 and image name are corresponding, since this was the case for the given
@@ -305,22 +304,21 @@ increasing our performance, so we only have one histogram to feed to the SVM.
 \subsection{Classification}
 
 For the classification, we use a standard Python Support Vector Machine,
-\texttt{libsvm}. This is a often used SVM, and should allow us to simply feed
-the data from the LBP and Feature Vector steps into the SVM and receive
-results.\\
-\\
-Using a SVM has two steps. First you have to train the SVM, and then you can
-use it to classify data. The training step takes a lot of time, so luckily
-\texttt{libsvm} offers us an opportunity to save a trained SVM. This means,
-you do not have to train the SVM every time.\\
-\\
+\texttt{libsvm}. This is an often used SVM, and should allow us to simply feed
+data from the LBP and Feature Vector steps into the SVM and receive results.
+
+Using an SVM has two steps. First, the SVM has to be trained, and then it can be
+used to classify data. The training step takes a lot of time, but luckily
+\texttt{libsvm} offers us an opportunity to save a trained SVM. This means that
+the SVM only has to be trained once.
+
 We have decided to only include a character in the system if the SVM can be
-trained with at least 70 examples. This is done automatically, by splitting
-the data set in a trainingset and a testset, where the first 70 examples of
-a character are added to the trainingset, and all the following examples are
-added to the testset. Therefore, if there are not enough examples, all
-available examples end up in the trainingset, and non of these characters
-end up in the testset, thus they do not decrease our score. However, if this
+trained with at least 70 examples. This is done automatically, by splitting the
+data set into a learning set and a test set, where the first 70 examples of a
+character are added to the learning set, and all the following examples are
+added to the test set. Therefore, if there are not enough examples, all
+available examples end up in the learning set, and none of these characters end
+up in the test set, thus they do not decrease our score. However, if this
 character later does get offered to the system, the training is as good as
 possible, since it is trained with all available characters.
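
Editorial illustration (not part of the commit): the splitting rule described in this hunk can be sketched as below, assuming the data set is a list of `(label, features)` pairs. The function name `split_learning_and_test`, the `limit` parameter, and the sample data are hypothetical; the report's own script may be organised differently.

```python
from collections import defaultdict


def split_learning_and_test(samples, limit=70):
    """Put the first `limit` examples of each character in the learning set."""
    seen = defaultdict(int)  # number of examples seen so far, per character
    learning_set, test_set = [], []
    for label, features in samples:
        if seen[label] < limit:
            learning_set.append((label, features))
        else:
            test_set.append((label, features))
        seen[label] += 1
    return learning_set, test_set


# Made-up data: 75 examples of "A" and only 10 examples of "Q".
samples = [("A", [i]) for i in range(75)] + [("Q", [i]) for i in range(10)]
learning_set, test_set = split_learning_and_test(samples)
```

With this input, every "Q" example lands in the learning set because there are fewer than 70 of them, so "Q" never appears in the test set and cannot lower the score.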
 
 
@@ -333,7 +331,7 @@ scripts is named here and a description is given on what the script does.
 
 
 
-\subsection*{\texttt{LearningSetGenerator.py}}
+\subsection*{\texttt{generate\_learning\_set.py}}
 
 
 
@@ -348,6 +346,7 @@ scripts is named here and a description is given on what the script does.
 \subsection*{\texttt{run\_classifier.py}}
 
 
+
 \section{Finding parameters}
 
 Now that we have a functioning system, we need to tune it to work properly for