Commit bda02c1d authored by Taddeus Kroes

Worked on Classifier section in report.

parent 1848170e
@@ -175,8 +175,7 @@ working with just one cell) gives us the best results.
 Given the LBP of a character, a Support Vector Machine can be used to classify
 the character to a character in a learning set. The SVM uses the concatenation
-of the histograms of all cells in an image as a feature vector (in the case we
-check the entire image no concatenation has to be done of course. The SVM can
+of the histograms of all cells in an image as a feature vector. The SVM can
 be trained with a subset of the given dataset called the ``learning set''. Once
 trained, the entire classifier can be saved as a Pickle object\footnote{See
 \url{http://docs.python.org/library/pickle.html}} for later usage.
@@ -195,7 +194,7 @@ stored in XML files. So, the first step is to read these XML files.
 \paragraph*{XML reader}
-The XML reader will return a 'license plate' object when given an XML file. The
+The XML reader will return a `license plate' object when given an XML file. The
 licence plate holds a list of, up to six, NormalizedImage characters and from
 which country the plate is from. The reader is currently assuming the XML file
 and image name are corresponding, since this was the case for the given
@@ -305,22 +304,21 @@ increasing our performance, so we only have one histogram to feed to the SVM.
 \subsection{Classification}
 For the classification, we use a standard Python Support Vector Machine,
-\texttt{libsvm}. This is a often used SVM, and should allow us to simply feed
-the data from the LBP and Feature Vector steps into the SVM and receive
-results.\\
-\\
-Using a SVM has two steps. First you have to train the SVM, and then you can
-use it to classify data. The training step takes a lot of time, so luckily
-\texttt{libsvm} offers us an opportunity to save a trained SVM. This means,
-you do not have to train the SVM every time.\\
-\\
+\texttt{libsvm}. This is an often used SVM, and should allow us to simply feed
+data from the LBP and Feature Vector steps into the SVM and receive results.
+
+Using a SVM has two steps. First, the SVM has to be trained, and then it can be
+used to classify data. The training step takes a lot of time, but luckily
+\texttt{libsvm} offers us an opportunity to save a trained SVM. This means that
+the SVM only has to be changed once.
+
 We have decided to only include a character in the system if the SVM can be
-trained with at least 70 examples. This is done automatically, by splitting
-the data set in a trainingset and a testset, where the first 70 examples of
-a character are added to the trainingset, and all the following examples are
-added to the testset. Therefore, if there are not enough examples, all
-available examples end up in the trainingset, and non of these characters
-end up in the testset, thus they do not decrease our score. However, if this
+trained with at least 70 examples. This is done automatically, by splitting the
+data set in a learning set and a test set, where the first 70 examples of a
+character are added to the learning set, and all the following examples are
+added to the test set. Therefore, if there are not enough examples, all
+available examples end up in the learning set, and non of these characters end
+up in the test set, thus they do not decrease our score. However, if this
 character later does get offered to the system, the training is as good as
 possible, since it is trained with all available characters.
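
Note on the hunk above: the learning set / test set split described in the report could be sketched roughly as follows. This is only an illustration, not the project's actual code. The function name `split_learning_and_test_set`, the `(label, features)` tuple layout and the toy data are assumptions made for this example; only the threshold of 70 examples per character comes from the report text.

```python
from collections import defaultdict


def split_learning_and_test_set(examples, per_class=70):
    """Split (label, features) pairs into a learning set and a test set.

    The first `per_class` examples of each character go to the learning
    set; any further examples of that character go to the test set.
    (Hypothetical helper; the data layout is assumed for illustration.)
    """
    seen = defaultdict(int)  # number of examples seen so far per character
    learning_set, test_set = [], []

    for label, features in examples:
        if seen[label] < per_class:
            learning_set.append((label, features))
        else:
            test_set.append((label, features))
        seen[label] += 1

    return learning_set, test_set


# Toy data: 100 examples of 'A', but only 10 examples of 'B'.
examples = [("A", i) for i in range(100)] + [("B", i) for i in range(10)]
learning_set, test_set = split_learning_and_test_set(examples)
```

With this toy data, every example of the rare character `B` lands in the learning set and none in the test set, which matches the behaviour the paragraph describes for characters with too few examples.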
@@ -333,7 +331,7 @@ scripts is named here and a description is given on what the script does.
-\subsection*{\texttt{LearningSetGenerator.py}}
+\subsection*{\texttt{generate\_learning\_set.py}}
@@ -348,6 +346,7 @@ scripts is named here and a description is given on what the script does.
 \subsection*{\texttt{run\_classifier.py}}
 \section{Finding parameters}
 Now that we have a functioning system, we need to tune it to work properly for