Browse source

Merged conflicts.

Taddeus Kroes 14 years ago
parent
commit
3228ef587b

+ 79 - 64
docs/report.tex

@@ -45,39 +45,38 @@ in classifying characters on a license plate.
 In short our program must be able to do the following:
 
 \begin{enumerate}
-    \item Extract characters using the location points in the xml file.
+    \item Extract characters using the location points in the XML file.
     \item Reduce noise where possible to ensure maximum readability.
-    \item Transform a character to a normal form.
-    \item Create a local binary pattern histogram vector.
-    \item Recognize the character value of a vector using a classifier.
-    \item Determine the performance of the classifier with a given test set.
+    \item Transform a character to a normal form.
+    \item Create a local binary pattern histogram vector.
+    \item Match the resulting vector against a learning set.
+    \item Finally, check the results against a real data set.
 \end{enumerate}
 
 \section{Language of choice}
 
 The actual purpose of this project is to check if LBP is capable of recognizing
-license plate characters. Since the LBP algorithm is fairly simple to
-implement, it should have a good performance in comparison to other license
-plate recognition implementations if implemented in C. However, we decided to
-focus on functionality rather than speed. Therefore, we picked Python. We felt
-Python would not restrict us as much in assigning tasks to each member of the
-group. In addition, when using the correct modules to handle images, Python can
-be decent in speed.
+license plate characters. We knew the LBP implementation itself would be fairly
+simple, so its main advantage over other license plate recognition
+implementations would have to be speed. However, since it was uncertain whether
+we would get useful results at all, we picked Python. We felt Python would not
+restrict us as much in assigning tasks to each member of the group. In
+addition, when using the correct modules to handle images, Python can be decent
+in speed.
 
 \section{Theory}
 
 Now that we know what our program has to be capable of, we can start by
-defining the problems we have and how we are planning to solve these.
+defining the problems we have and how we want to solve them.
 
 \subsection{Extracting a letter and resizing it}
 
-% TODO: Rewrite this section once we have implemented this properly.
+% TODO: Rewrite this section once we have implemented this properly.
 
 \subsection{Transformation}
 
 A simple perspective transformation will be sufficient to transform and resize
 the characters to a normalized format. The corner positions of characters in
-the dataset are provided together with the dataset.
+the dataset are supplied in the accompanying XML files.
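A minimal sketch of how this normalization could look in Python, assuming the
image is a NumPy grayscale array and the four annotated corners are (x, y)
tuples ordered top-left, top-right, bottom-right, bottom-left (the order that
get_corners in src/xml_helper_functions.py suggests); the function names and
the target size are illustrative, not the project's actual API:

    import numpy as np

    def fit_homography(src, dst):
        # Solve for the 3x3 matrix H that maps each src (x, y) to its dst
        # (u, v), using the direct linear transform for four point pairs.
        A, b = [], []
        for (x, y), (u, v) in zip(src, dst):
            A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
            A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
            b.extend([u, v])
        h = np.linalg.solve(np.array(A, float), np.array(b, float))
        return np.append(h, 1.0).reshape(3, 3)

    def normalize_character(image, corners, width=60, height=100):
        # Map output coordinates back to source coordinates and sample the
        # source image (nearest neighbour, clamped to the image borders).
        target = [(0, 0), (width - 1, 0), (width - 1, height - 1),
                  (0, height - 1)]
        H = fit_homography(target, corners)
        out = np.zeros((height, width), dtype=image.dtype)
        for v in range(height):
            for u in range(width):
                x, y, w = H.dot([u, v, 1.0])
                ys = min(max(int(round(y / w)), 0), image.shape[0] - 1)
                xs = min(max(int(round(x / w)), 0), image.shape[1] - 1)
                out[v, u] = image[ys, xs]
        return out

Nearest-neighbour sampling keeps the sketch short; bilinear interpolation
would give smoother normalized characters.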
 
 \subsection{Reducing noise}
 
@@ -93,80 +92,76 @@ part of the license plate remains readable.
 
 \subsection{Local binary patterns}
 Once we have separate digits and characters, we intend to use Local Binary
-Patterns (Ojala, Pietikäinen \& Harwood, 1994) to determine what character or
-digit we are dealing with. Local Binary Patterns are a way to classify a
-texture based on the distribution of edge directions in the image. Since
-letters on a license plate consist mainly of straight lines and simple curves,
-LBP should be suited to identify these.
+Patterns (Ojala, Pietikäinen \& Harwood, 1994) to determine what character
+or digit we are dealing with. Local Binary
+Patterns are a way to classify a texture based on the distribution of edge
+directions in the image. Since letters on a license plate consist mainly of
+straight lines and simple curves, LBP should be suited to identify these.
 
 \subsubsection{LBP Algorithm}
 The LBP algorithm that we implemented can use a variety of neighbourhoods,
-including the same square pattern that is introduced by Ojala et al (1994), and
-a circular form as presented by Wikipedia.
-
-\begin{enumerate}
-
+including the same square pattern that is introduced by Ojala et al. (1994),
+and a circular form as presented by Wikipedia.
+\begin{itemize}
 \item Determine the size of the square in which the local patterns are
 registered. For explanatory purposes, let the square be $3 \times 3$.
-
-\item The grayscale value of the center pixel is used as threshold. Every value
-of the pixel around the center pixel is evaluated. If it's value is greater
-than the threshold it will be become a one, otherwise it will be a zero.
+\item The grayscale value of the center pixel is used as a threshold. The
+value of every pixel around the center pixel is evaluated: if it is greater
+than or equal to the threshold it becomes a one, otherwise a zero.
 
 \begin{figure}[H]
-    \center
-    \includegraphics[scale=0.5]{lbp.png}
-    \caption{LBP 3 x 3 (Pietik\"ainen, Hadid, Zhao \& Ahonen (2011))}
+\center
+\includegraphics[scale=0.5]{lbp.png}
+\caption{LBP 3 x 3 (Pietik\"ainen, Hadid, Zhao \& Ahonen (2011))}
 \end{figure}
 
-The pattern will be an 8-bit integer. This is accomplished by shifting the
-boolean value of each comparison one to seven places to the left.
+Notice that the pattern will be an 8-bit integer of a form like 01001110. It
+is built by shifting the boolean outcome of each comparison $i$ places to the
+left, with $i$ the index of the evaluated pixel, starting at $i = 0$ (a short
+Python sketch of this computation follows the list).
 
 This results in a mathematical expression:
 
-Let I($x_i, y_i$) be a grayscale Image and $g_n$ the value of the pixel $(x_i,
-y_i)$. Also let $s(g_i, g_c)$ (see below) with $g_c$ being the value of the
-center pixel and $g_i$ the grayscale value of the pixel to be evaluated.
+Let $I(x_i, y_i)$ be an image with grayscale values and $g_i$ the grayscale
+value of the pixel $(x_i, y_i)$. Also let $s(g_i, g_c)$ be defined as below,
+with $g_c$ the grayscale value of the center pixel and $g_i$ the grayscale
+value of the pixel being evaluated.
 
 $$
-    s(g_i, g_c) = \left \{
-    \begin{array}{l l}
-        1 & \quad \text{if $g_i$ $\geq$ $g_c$}\\
-        0 & \quad \text{if $g_i$ $<$ $g_c$}\\
-    \end{array} \right.
+  s(g_i, g_c) = \left\{
+  \begin{array}{l l}
+    1 & \quad \text{if $g_i$ $\geq$ $g_c$}\\
+    0 & \quad \text{if $g_i$ $<$ $g_c$}\\
+  \end{array} \right.
 $$
 
-$$LBP_{n, g_c = (x_c, y_c)} = \sum\limits_{i=0}^{n-1} s(g_i, g_c) \cdot 2^i$$
+$$LBP_{n, g_c = (x_c, y_c)} = \sum\limits_{i=0}^{n-1} s(g_i, g_c) \cdot 2^i$$
 
-The outcome of this operations will be a binary pattern. Note that the
-mathematical expression has the same effect as the bit shifting operation that
-we defined earlier.
+The outcome of these operations is a binary pattern. Note that the sum has the
+same effect as the bit-shifting operation described earlier.
 
-\item Given this pattern, the next step is to divide the pattern into cells.
-The amount of cells depends on the quality of the result, which we plan to
-determine by trial and error. We will start by dividing the pattern into cells
-of size 16, which is a common value according to Wikipedia.
+\item Given this pattern, the next step is to divide the pattern into cells.
+The number of cells affects the quality of the result, so some trial and error
+is in order; we start by dividing the pattern into cells of size 16.
 
 \item Compute a histogram for each cell.
 
 \begin{figure}[H]
-    \center
-    \includegraphics[scale=0.7]{cells.png}
-    \caption{Divide into cells (Pietik\"ainen et all (2011))}
+\center
+\includegraphics[scale=0.7]{cells.png}
+\caption{Divide into cells (Pietik\"ainen et al. (2011))}
 \end{figure}
 
 \item Consider every histogram as a vector element and concatenate these. The
 result is a feature vector of the image.
 
-\item Feed these vectors to a support vector machine. The SVM will ``learn''
-which vectors to associate with a character.
+\item Feed these vectors to a support vector machine. The SVM will ``learn''
+which vectors correspond to which character.
 
-\end{enumerate}
+\end{itemize}
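To make the thresholding and shifting step concrete, here is a minimal Python
sketch of the 3 x 3 (8-neighbour) square variant on a NumPy grayscale array;
it illustrates the formula above and is not the project's own LBP class:

    import numpy as np

    # Offsets of the 8 neighbours in a 3 x 3 square, in a fixed order;
    # neighbour i contributes 2**i to the pattern.
    NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                  (1, 1), (1, 0), (1, -1), (0, -1)]

    def lbp_3x3(image):
        """Return an array of 8-bit local binary patterns, one per inner
        pixel of the image."""
        height, width = image.shape
        patterns = np.zeros((height - 2, width - 2), dtype=np.uint8)
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                center = image[y, x]
                pattern = 0
                for i, (dy, dx) in enumerate(NEIGHBOURS):
                    # s(g_i, g_c): 1 if the neighbour is at least as bright
                    # as the center pixel, shifted i places to the left.
                    if image[y + dy, x + dx] >= center:
                        pattern |= 1 << i
                patterns[y - 1, x - 1] = pattern
        return patterns

The circular variant only changes the neighbour offsets, interpolating values
for offsets that do not fall exactly on a pixel.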
 
 To our knowledge, LBP has not yet been used in this manner before. Therefore,
 it will be the first thing to implement, to see if it lives up to our
-expectations. When the proof of concept is there, it can be used in a final,
-more efficient program.
+expectations. When the proof of concept is there, it can be used in a final
+program.
 
 Later we will show that taking a histogram over the entire image (basically
 working with just one cell) gives us the best results.
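A sketch of the cell-and-histogram step on top of such a pattern image; the
cell size of 16 matches the starting value mentioned in the list above, and
the single-cell case from the previous paragraph is simply one histogram over
the whole pattern image (function names are again illustrative):

    import numpy as np

    def feature_vector(patterns, cell_size=16):
        """Concatenate per-cell histograms of an LBP pattern image."""
        height, width = patterns.shape
        histograms = []
        for y in range(0, height, cell_size):
            for x in range(0, width, cell_size):
                cell = patterns[y:y + cell_size, x:x + cell_size]
                # 256 bins: one for each possible 8-bit pattern.
                hist, _ = np.histogram(cell, bins=256, range=(0, 256))
                histograms.append(hist)
        return np.concatenate(histograms)

    def single_cell_feature_vector(patterns):
        """The one-cell variant: a single histogram over the whole image."""
        hist, _ = np.histogram(patterns, bins=256, range=(0, 256))
        return hist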
@@ -174,16 +169,19 @@ working with just one cell) gives us the best results.
 \subsection{Matching the database}
 
 Given the LBP of a character, a Support Vector Machine can be used to classify
-the character to a character in a learning set. The SVM uses the concatenation
-of the histograms of all cells in an image as a feature vector. The SVM can
-be trained with a subset of the given dataset called the ``learning set''. Once
-trained, the entire classifier can be saved as a Pickle object\footnote{See
+the character as one of the characters in the learning set. The SVM uses the
+concatenation of the histograms of all cells in an image as a feature vector
+(when the entire image is treated as a single cell, no concatenation is
+needed, of course). The SVM can be trained with a subset of the given dataset
+called the ``learning set''. Once trained, the entire classifier can be saved
+as a Pickle object\footnote{See
 \url{http://docs.python.org/library/pickle.html}} for later usage.
+In our case the support vector machine uses a radial basis (Gaussian) kernel
+function. The SVM finds a separating hyperplane with maximal margin.
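As a rough illustration of this step, the sketch below uses libsvm's Python
wrapper (the svmutil module that src/Classifier.py imports); the c and gamma
values are placeholders that would have to be tuned, for example with a grid
search:

    from svmutil import svm_problem, svm_parameter, svm_train, svm_predict

    def train_classifier(features, labels, c=32.0, gamma=0.05):
        """Train an RBF-kernel SVM on a list of feature vectors.

        The character labels are mapped to integer class indices, since
        libsvm expects numeric labels.
        """
        classes = sorted(set(labels))
        y = [classes.index(label) for label in labels]
        x = [[float(value) for value in vector] for vector in features]
        problem = svm_problem(y, x)
        param = svm_parameter('-t 2 -c %f -g %f -q' % (c, gamma))  # -t 2: RBF
        return svm_train(problem, param), classes

    def classify(model, classes, vector):
        """Predict the character for a single feature vector."""
        predicted, _, _ = svm_predict([0], [[float(v) for v in vector]], model)
        return classes[int(predicted[0])]

Note that the libsvm model itself is a ctypes structure that cannot be pickled
directly, which is presumably why Classifier.py also imports svm_save_model
and svm_load_model.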
 
 \section{Implementation}
 
-In this section we will describe our implementation in more detail, explaining
-the choices we made in the process.
+In this section we will describe our implementation in more detail, explaining
+the choices we made.
 
 \subsection{Character retrieval}
 
@@ -602,6 +600,23 @@ not a big problem as no one was afraid of staying at Science Park a bit longer
 to help out. Further communication usually went through e-mails and replies
 were instantaneous! A crew to remember.
 
+\section{Discussion}
+
+
+\begin{thebibliography}{9}
+\bibitem{lbp1}
+  Matti Pietik\"ainen, Abdenour Hadid, Guoying Zhao, Timo Ahonen.
+  \emph{Computer Vision Using Local Binary Patterns}.
+  Computational Imaging and Vision series.
+  Springer-Verlag, London,
+  1st Edition,
+  2011.
+\bibitem{wikiplate}
+  \emph{Automatic number-plate recognition}. (2011, December 17).\\
+  Wikipedia.
+  Retrieved from http://en.wikipedia.org/wiki/Automatic\_number\_plate\_recognition
+\end{thebibliography}
+
 
 \appendix
 \section{Faulty Classifications}

+ 3 - 0
src/Character.py

@@ -8,6 +8,8 @@ class Character:
         self.filename = filename
 
     def get_single_cell_feature_vector(self, neighbours=5):
+        """Get the histogram of Local Binary Patterns over this entire
+        image."""
         if hasattr(self, 'feature'):
             return
 
@@ -15,6 +17,7 @@ class Character:
         self.feature = pattern.single_cell_features_vector()
 
     def get_feature_vector(self, cell_size=None):
+        """Get the concatenated histograms of Local Binary Patterns. """
         pattern = LBP(self.image) if cell_size == None \
                   else LBP(self.image, cell_size)
 

+ 0 - 1
src/Classifier.py

@@ -1,7 +1,6 @@
 from svmutil import svm_train, svm_problem, svm_parameter, svm_predict, \
         svm_save_model, svm_load_model, RBF
 
-
 class Classifier:
     def __init__(self, c=None, gamma=None, filename=None, neighbours=3, \
             verbose=0):

+ 0 - 14
src/GrayscaleImage.py

@@ -22,20 +22,6 @@ class GrayscaleImage:
             for x in xrange(self.data.shape[1]):
                 yield y, x, self.data[y, x]
 
-        #self.__i_x = -1
-        #self.__i_y = 0
-        #return self
-
-    #def next(self):
-    #    self.__i_x += 1
-    #    if self.__i_x  == self.width:
-    #        self.__i_x = 0
-    #        self.__i_y += 1
-    #    if self.__i_y == self.height:
-    #        raise StopIteration
-
-    #    return  self.__i_y, self.__i_x, self[self.__i_y, self.__i_x]
-
     def __getitem__(self, position):
         return self.data[position]
 

+ 0 - 4
src/Histogram.py

@@ -6,13 +6,9 @@ class Histogram:
         self.max = max
 
     def add(self, number):
-        #bin_index = self.get_bin_index(number)
-        #self.bins[bin_index] += 1
         self.bins[number] += 1
 
     def remove(self, number):
-        #bin_index = self.get_bin_index(number)
-        #self.bins[bin_index] -= 1
         self.bins[number] -= 1
 
     def get_bin_index(self, number):

+ 6 - 4
src/NormalizedCharacterImage.py

@@ -13,14 +13,16 @@ class NormalizedCharacterImage(GrayscaleImage):
         self.blur = blur
         self.gaussian_filter()
 
-        self.increase_contrast()
+        #self.increase_contrast()
 
         self.height = height
         self.resize()
 
-    def increase_contrast(self):
-        self.data -= self.data.min()
-        self.data = self.data.astype(float) / self.data.max()
+#    def increase_contrast(self):
+#        """Increase the contrast by performing a grayscale mapping from the 
+#        current maximum and minimum to a range between 0 and 1."""
+#        self.data -= self.data.min()
+#        self.data = self.data.astype(float) / self.data.max()
 
     def gaussian_filter(self):
         GaussianFilter(self.blur).filter(self)

+ 1 - 0
src/create_characters.py

@@ -80,6 +80,7 @@ def load_test_set(neighbours, blur_scale, verbose=0):
 
 
 def generate_sets(neighbours, blur_scale, verbose=0):
+    """Split the entire dataset into a trainingset and a testset."""
     suffix = '_%s_%s' % (blur_scale, neighbours)
     learning_set_file = 'learning_set%s.dat' % suffix
     test_set_file = 'test_set%s.dat' % suffix

+ 14 - 14
src/xml_helper_functions.py

@@ -125,33 +125,33 @@ def xml_to_LicensePlate(filename, save_character=None):
     return LicensePlate(country, result_characters)
 
 def get_corners(dom):
-  nodes = dom.getElementsByTagName("point")
-  corners = []
+    nodes = dom.getElementsByTagName("point")
+    corners = []
 
-  margin_y = 3
-  margin_x = 2
+    margin_y = 3
+    margin_x = 2
 
-  corners.append(
+    corners.append(
     Point(get_coord(nodes[0], "x") - margin_x,
           get_coord(nodes[0], "y") - margin_y)
-  )
+    )
 
-  corners.append(
+    corners.append(
     Point(get_coord(nodes[1], "x") + margin_x,
           get_coord(nodes[1], "y") - margin_y)
-  )
+    )
 
-  corners.append(
+    corners.append(
     Point(get_coord(nodes[2], "x") + margin_x,
           get_coord(nodes[2], "y") + margin_y)
-  )
+    )
 
-  corners.append(
+    corners.append(
     Point(get_coord(nodes[3], "x") - margin_x,
           get_coord(nodes[3], "y") + margin_y)
-  )
+    )
 
-  return corners
+    return corners
 
 def get_coord(node, attribute):
-  return int(node.getAttribute(attribute))
+    return int(node.getAttribute(attribute))