Browse Source

Merge branch 'master' of github.com:taddeus/licenseplates

Taddeus Kroes 14 years ago
parent
commit
df95bf08e7
6 changed files with 292 additions and 323 deletions
  1. docs/verslag.tex (+119 -42)
  2. src/ClassifierTest.py (+5 -3)
  3. src/LearningSetGenerator.py (+4 -142)
  4. src/LicensePlate.py (+3 -129)
  5. src/Point.py (+3 -7)
  6. src/xml_helper_functions.py (+158 -0)

+ 119 - 42
docs/verslag.tex

@@ -39,8 +39,8 @@ Microsoft recently published a new and effective method to find the location of
 text in an image.
 
 Determining what character we are looking at will be done by using Local Binary
-Patterns. The main goal of our research is finding out how effective LBPs are in
-classifying characters on a licenseplate.
+Patterns. The main goal of our research is finding out how effective LBP's are
+in classifying characters on a license plate.
 
 In short our program must be able to do the following:
 
@@ -56,8 +56,8 @@ In short our program must be able to do the following:
 
 \section{Solutions}
 
-Now that the problem is defined, the next step is stating our basic solutions. This will
-come in a few steps as well.
+Now that the problem is defined, the next step is stating our basic solutions.
+This will come in a few steps as well.
 
 \subsection{Transformation}
 
@@ -133,81 +133,158 @@ entire classifier can be saved as a Pickle object\footnote{See
 In this section we will describe our implementations in more detail, explaining
 choices we made.
 
-\subsection*{Licenseplate retrieval}
+\subsection{License plate retrieval}
 
-In order to retrieve the license plate from the entire image, we need to perform
-a perspective transformation. However, to do this, we need to know the 
+In order to retrieve the license plate from the entire image, we need to
+perform a perspective transformation. However, to do this, we need to know the 
 coordinates of the four corners of the licenseplate. For our dataset, this is
-stored in XML files. So, the first step is to read these XML files.
-
+stored in XML files. So, the first step is to read these XML files.\\
+\\
 \paragraph*{XML reader}
 
 
 
 \paragraph*{Perspective transformation}
-
-Once we retrieved the cornerpoints of the licenseplate, we feed those to a
-module that extracts the (warped) licenseplate from the original image, and
-creates a new image where the licenseplate is cut out, and is transformed to a
+Once we retrieved the cornerpoints of the license plate, we feed those to a
+module that extracts the (warped) license plate from the original image, and
+creates a new image where the license plate is cut out, and is transformed to a
 rectangle.
 
-\subsection*{Noise reduction}
+\subsection{Noise reduction}
 
-The image contains a lot of noise, both from camera errors due to dark noise etc.,
-as from dirt on the license plate. In this case, noise therefor means any unwanted
-difference in color from the surrounding pixels.
+The image contains a lot of noise, both from camera errors such as dark noise,
+and from dirt on the license plate. In this case, noise therefore means
+any unwanted difference in color from the surrounding pixels.
 
 \paragraph*{Camera noise and small amounts of dirt}
-
-The dirt on the licenseplate can be of different sizes. We can reduce the smaller
-amounts of dirt in the same way as we reduce normal noise, by applying a gaussian
-blur to the image. This is the next step in our program.\\
+The dirt on the license plate can be of different sizes. We can reduce the 
+smaller amounts of dirt in the same way as we reduce normal noise, by applying
+a Gaussian blur to the image. This is the next step in our program.\\
 \\
-The gaussian filter we use comes from the \texttt{scipy.ndimage} module. We use
+The Gaussian filter we use comes from the \texttt{scipy.ndimage} module. We use
 this function instead of our own function, because the standard functions are
-most likely more optimized then our own implementation, and speed is an important
-factor in this application.
+most likely more optimized than our own implementation, and speed is an
+important factor in this application.
 
 \paragraph*{Larger amounts of dirt}
-
 Larger amounts of dirt are not going to be resolved by using a Gaussian filter.
-We rely on one of the characteristics of the Local Binary Pattern, only looking at
-the difference between two pixels, to take care of these problems.\\
-Because there will probably always be a difference between the characters and the 
-dirt, and the fact that the characters are very black, the shape of the characters
-will still be conserved in the LBP, even if there is dirt surrounding the character.
+We rely on one of the characteristics of the Local Binary Pattern, only looking
+at the difference between two pixels, to take care of these problems.\\
+Because there will probably always be a difference between the characters and
+the dirt, and the fact that the characters are very black, the shape of the
+characters will still be conserved in the LBP, even if there is dirt
+surrounding the character.
 
-\subsection*{Character retrieval}
+\subsection{Character retrieval}
 
 The retrieval of the character is done the same as the retrieval of the license
-plate, by using a perspective transformation. The location of the characters on the
-licenseplate is also available in de XML file, so this is parsed from that as well.
+plate, by using a perspective transformation. The location of the characters on
+the license plate is also available in the XML file, so this is parsed from that
+as well.
 
-\subsection*{Creating Local Binary Patterns and feature vector}
+\subsection{Creating Local Binary Patterns and feature vector}
 
 
 
-\subsection*{Classification}
+\subsection{Classification}
 
 
 
 \section{Finding parameters}
 
 Now that we have a functioning system, we need to tune it to work properly for
-license plates. This means we need to find the parameters. Throughout the program
-we have a number of parameters for which no standard choice is available. These
-parameters are:\\
+license plates. This means we need to find the parameters. Throughout the 
+program we have a number of parameters for which no standard choice is
+available. These parameters are:\\
 \\
 \begin{tabular}{l|l}
 	Parameter 			& Description\\
 	\hline
-	$\sigma$  			& The size of the gaussian blur.\\
-	\emph{cell size}	& The size of a cell for which a histogram of LBPs will be generated.
+	$\sigma$  			& The size of the Gaussian blur.\\
+	\emph{cell size}	& The size of a cell for which a histogram of LBPs will
+	                      be generated.\\
+	$\gamma$			& Parameter for the Radial kernel used in the SVM.\\
+	$c$					& The soft margin of the SVM. Determines how many
+						  training errors are accepted.
+\end{tabular}\\
+\\
+For each of these parameters, we will describe how we searched for a good
+value, and what value we decided on.
+
+\subsection{Parameter $\sigma$}
+
+The first parameter to decide on, is the $\sigma$ used in the Gaussian blur. To
+find this parameter, we tested a few values, by checking visually what value
+removed most noise out of the image, while keeping the edges sharp enough to
+work with. By checking in the neighbourhood of the value that performed best,
+we were able to 'zoom in' on what we thought was the best value. It turned out
+that this was $\sigma = ?$.
+
+\subsection{Parameter \emph{cell size}}
+
+The cell size of the Local Binary Patterns determines over what region a
+histogram is made. The trade-off here is that a bigger cell size makes the
+classification less affected by relative movement of a character compared to
+those in the learning set, since the important structure will be more likely to
+remain in the same cell. However, if the cell size is too big, there will not
+be enough cells to properly describe the different areas of the character, and
+the feature vectors will not have enough elements.\\
+\\
+In order to find this parameter, we used a trial-and-error technique on a few
+basic cell sizes, being ?, 16, ?. We found that the best result was reached by
+using ??.
+
+\subsection{Parameters $\gamma$ \& $c$}
+
+The parameters $\gamma$ and $c$ are used for the SVM. $c$ is a standard
+parameter for each type of SVM, called the 'soft margin'. This indicates how
+strictly each element in the learning set should be fitted. A large soft
+margin means that an element in the learning set that accidentally has a
+completely different feature vector than expected, due to noise for example,
+is not taken into account. If the soft margin is very small, then almost all
+vectors will be taken into account, unless they differ by extreme amounts.\\
+$\gamma$ is a variable that determines the size of the radial kernel, and as
+such blablabla.\\
+\\
+Since these parameters both influence the SVM, we need to find the best
+combination of values. To do this, we perform a so-called grid-search. A
+grid-search takes exponentially growing sequences for each parameter, and
+checks for each combination of values what the score is. The combination with
+the highest score is then used as our parameters, and the entire SVM will be
+trained using those parameters.\\
+\\
+We found that the best values for these parameters are $c=?$ and $\gamma =?$.
+
+\section{Results}
+
+The goal was to find out two things with this research: the speed of the
+classification and the accuracy. In this section we will show our findings.
+
+\subsection{Speed}
+
+Recognizing license plates is something that has to be done fast, since there
+can be a lot of cars passing a camera in a short time, especially on a highway.
+Therefore, we measured how well our program performed in terms of speed. We
+measured the time used to classify a license plate, not the training of the
+dataset, since that can be done offline, and speed is not a primary necessity
+there.\\
+\\
+The speed of a classification turned out to be blablabla.
+
+\subsection{Accuracy}
 
-\end{tabular}
+Of course, it is vital that the recognition of a license plate is correct;
+almost correct is not good enough here. Therefore, we have to get the highest
+accuracy score we possibly can.\\
+\\
+According to Wikipedia\footnote{
+\url{http://en.wikipedia.org/wiki/Automatic_number_plate_recognition}},
+commercial license plate recognition software scores about $90\%$ to $94\%$,
+under optimal conditions and with modern equipment. Our program scores an
+average of blablabla.
 
 \section{Conclusion}
 
 
 
-\end{document}
+\end{document}
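The grid search described in the report's parameter section can be sketched in plain Python. This is an illustrative sketch, not the project's actual code: the `score` callback and the exponent ranges are assumptions (in the real project, `score` would train the SVM with the given parameters and return a cross-validation accuracy).

```python
from itertools import product

def grid_search(score, c_values, gamma_values):
    """Evaluate score(c, gamma) for every combination of values and
    return the best-scoring (c, gamma) pair, as the report describes."""
    best_score, best_params = float('-inf'), None
    for c, gamma in product(c_values, gamma_values):
        s = score(c, gamma)
        if s > best_score:
            best_score, best_params = s, (c, gamma)
    return best_params

# Exponentially growing sequences for each parameter (assumed ranges).
c_range     = [2.0 ** e for e in range(-5, 16, 2)]
gamma_range = [2.0 ** e for e in range(-15, 4, 2)]
```

The exhaustive scan is affordable here because only two parameters are searched; each extra parameter would multiply the number of SVM trainings.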

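The cell-based LBP features the report discusses can also be sketched. This is a minimal illustration of the basic 3x3 Local Binary Pattern, not the project's implementation; the neighbour sampling order and the 256-bin histogram layout are assumptions.

```python
def lbp_value(img, x, y):
    """Basic 3x3 LBP: compare the 8 neighbours of (x, y) with the
    centre pixel and pack the comparison bits into one byte."""
    center = img[y][x]
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                  (1, 1), (1, 0), (1, -1), (0, -1)]  # (dy, dx), assumed order
    code = 0
    for bit, (dy, dx) in enumerate(neighbours):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code

def cell_histogram(img, x0, y0, cell_size):
    """Histogram of LBP codes over one cell; the feature vector is the
    concatenation of these histograms over all cells of the character.
    Assumes the cell lies in the interior of the image (no border check)."""
    hist = [0] * 256
    for y in range(y0, y0 + cell_size):
        for x in range(x0, x0 + cell_size):
            hist[lbp_value(img, x, y)] += 1
    return hist
```

Because each code only encodes signs of differences, a constant brightness offset from dirt or shadow leaves the pattern unchanged, which is the robustness property the report relies on.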
+ 5 - 3
src/ClassifierTest.py

@@ -1,5 +1,5 @@
 #!/usr/bin/python
-from LicensePlate import LicensePlate
+from xml_helper_functions import xml_to_LicensePlate
 from Classifier import Classifier
 from cPickle import dump, load
 
@@ -8,9 +8,11 @@ chars = []
 for i in range(9):
     for j in range(100):
         try:
-            filename = '%04d/00991_%04d%02d.info' % (i, i, j)
+            filename = '%04d/00991_%04d%02d' % (i, i, j)
             print 'loading file "%s"' % filename
-            plate = LicensePlate(i, j)
+
+            # still a LicensePlate object, but it is now very different :P
+            plate = xml_to_LicensePlate(filename) 
 
             if hasattr(plate, 'characters'):
                 chars.extend(plate.characters)

+ 4 - 142
src/LearningSetGenerator.py

@@ -1,148 +1,10 @@
-from os import mkdir
-from os.path import exists
-from math import acos
-from pylab import imsave, array, zeros, inv, dot, norm, svd, floor
-from xml.dom.minidom import parse
-from Point import Point
-from GrayscaleImage import GrayscaleImage
-
-class LearningSetGenerator:
-
-    def __init__(self, folder_nr, file_nr):
-        filename = '%04d/00991_%04d%02d' % (folder_nr, folder_nr, file_nr)
-
-        self.image = GrayscaleImage('../images/Images/%s.jpg' % filename)
-        self.read_xml(filename)
-
-    # sets the entire license plate of an image
-    def retrieve_data(self, corners):
-        x0, y0 = corners[0].to_tuple()
-        x1, y1 = corners[1].to_tuple()
-        x2, y2 = corners[2].to_tuple()
-        x3, y3 = corners[3].to_tuple()
-
-        M = int(1.2 * (max(x0, x1, x2, x3) - min(x0, x1, x2, x3)))
-        N = max(y0, y1, y2, y3) - min(y0, y1, y2, y3)
-
-        matrix = array([
-          [x0, y0, 1,  0,  0, 0,       0,       0,  0],
-          [ 0,  0, 0, x0, y0, 1,       0,       0,  0],
-          [x1, y1, 1,  0,  0, 0, -M * x0, -M * y1, -M],
-          [ 0,  0, 0, x1, y1, 1,       0,       0,  0],
-          [x2, y2, 1,  0,  0, 0, -M * x2, -M * y2, -M],
-          [ 0,  0, 0, x2, y2, 1, -N * x2, -N * y2, -N],
-          [x3, y3, 1,  0,  0, 0,       0,       0,  0],
-          [ 0,  0, 0, x3, y3, 1, -N * x3, -N * y3, -N]
-        ])
-
-        P = inv(self.get_transformation_matrix(matrix))
-        data = array([zeros(M, float)] * N)
-
-        for i in range(0, M):
-            for j in range(0, N):
-                or_coor   = dot(P, ([[i],[j],[1]]))
-                or_coor_h = (or_coor[1][0] / or_coor[2][0],
-                             or_coor[0][0] / or_coor[2][0])
-
-                data[j][i] = self.pV(or_coor_h[0], or_coor_h[1])
-
-        return data
-
-    def get_transformation_matrix(self, matrix):
-        # Get the vector p and the values that are in there by taking the SVD.
-        # Since D is diagonal with the eigenvalues sorted from large to small
-        # on the diagonal, the optimal q in min ||Dq|| is q = [[0]..[1]].
-        # Therefore, p = Vq means p is the last column in V.
-        U, D, V = svd(matrix)
-        p = V[8][:]
-
-        return array([
-            [ p[0], p[1], p[2] ],
-            [ p[3], p[4], p[5] ],
-            [ p[6], p[7], p[8] ]
-        ])
-
-    def pV(self, x, y):
-        image = self.image
-
-        #Get the value of a point (interpolated x, y) in the given image
-        if image.in_bounds(x, y):
-            x_low  = floor(x)
-            x_high = floor(x + 1)
-            y_low  = floor(y)
-            y_high = floor(y + 1)
-            x_y    = (x_high - x_low) * (y_high - y_low)
-
-            a = x_high - x
-            b = y_high - y
-            c = x - x_low
-            d = y - y_low
-
-            return image[x_low,  y_low] / x_y * a * b \
-                + image[x_high,  y_low] / x_y * c * b \
-                + image[x_low , y_high] / x_y * a * d \
-                + image[x_high, y_high] / x_y * c * d
-
-        return 0
-
-    def read_xml(self, filename):
-        dom = parse('../images/Infos/%s.info' % filename)
-        self.characters = []
-
-        version = dom.getElementsByTagName("current-version")[0].firstChild.data
-        info    = dom.getElementsByTagName("info")
-
-        for i in info:
-            if version == i.getElementsByTagName("version")[0].firstChild.data:
-
-                self.country = i.getElementsByTagName("identification-letters")[0].firstChild.data
-                temp = i.getElementsByTagName("characters")
-
-                if len(temp):
-                  characters = temp[0].childNodes
-                else:
-                  self.characters = []
-                  break
-
-                for i, character in enumerate(characters):
-                    if character.nodeName == "character":
-                        value   = character.getElementsByTagName("char")[0].firstChild.data
-                        corners = self.get_corners(character)
-
-                        if not len(corners) == 4:
-                          break
-
-                        image = GrayscaleImage(data = self.retrieve_data(corners))
-
-                        print value
-
-                        path = "../images/LearningSet/%s" % value
-                        image_path = "%s/%d_%s.jpg" % (path, i, filename.split('/')[-1])
-
-                        if not exists(path):
-                          mkdir(path)
-
-                        if not exists(image_path):
-                          image.save(image_path)
-
-                break
-
-    def get_corners(self, dom):
-      nodes = dom.getElementsByTagName("point")
-
-      corners = []
-
-      for node in nodes:
-          corners.append(Point(node))
-
-      return corners
-
+from xml_helper_functions import xml_to_LicensePlate
 
 for i in range(9):
     for j in range(100):
         try:
-            filename = '%04d/00991_%04d%02d.info' % (i, i, j)
+            filename = '%04d/00991_%04d%02d' % (i, i, j)
             print 'loading file "%s"' % filename
-            plate = LearningSetGenerator(i, j)
+            plate = xml_to_LicensePlate(filename, save_character=1)
         except:
-            print "failure"
+            print 'epic fail'

+ 3 - 129
src/LicensePlate.py

@@ -1,131 +1,5 @@
-from pylab import array, zeros, inv, dot, svd, floor
-from xml.dom.minidom import parse
-from Point import Point
-from Character import Character
-from GrayscaleImage import GrayscaleImage
-from NormalizedCharacterImage import NormalizedCharacterImage
-
 class LicensePlate:
 
-    def __init__(self, folder_nr, file_nr):
-        filename = '%04d/00991_%04d%02d' % (folder_nr, folder_nr, file_nr)
-
-        self.image = GrayscaleImage('../images/Images/%s.jpg' % filename)
-        self.read_xml(filename)
-
-    # sets the entire license plate of an image
-    def retrieve_data(self, corners):
-        x0, y0 = corners[0].to_tuple()
-        x1, y1 = corners[1].to_tuple()
-        x2, y2 = corners[2].to_tuple()
-        x3, y3 = corners[3].to_tuple()
-
-        M = max(x0, x1, x2, x3) - min(x0, x1, x2, x3)
-        N = max(y0, y1, y2, y3) - min(y0, y1, y2, y3)
-
-        matrix = array([
-          [x0, y0, 1,  0,  0, 0,       0,       0,  0],
-          [ 0,  0, 0, x0, y0, 1,       0,       0,  0],
-          [x1, y1, 1,  0,  0, 0, -M * x0, -M * y1, -M],
-          [ 0,  0, 0, x1, y1, 1,       0,       0,  0],
-          [x2, y2, 1,  0,  0, 0, -M * x2, -M * y2, -M],
-          [ 0,  0, 0, x2, y2, 1, -N * x2, -N * y2, -N],
-          [x3, y3, 1,  0,  0, 0,       0,       0,  0],
-          [ 0,  0, 0, x3, y3, 1, -N * x3, -N * y3, -N]
-        ])
-
-        P = inv(self.get_transformation_matrix(matrix))
-        data = array([zeros(M, float)] * N)
-
-        for i in range(0, M):
-            for j in range(0, N):
-                or_coor   = dot(P, ([[i],[j],[1]]))
-                or_coor_h = (or_coor[1][0] / or_coor[2][0],
-                             or_coor[0][0] / or_coor[2][0])
-
-                data[j][i] = self.pV(or_coor_h[0], or_coor_h[1])
-
-        return data
-
-    def get_transformation_matrix(self, matrix):
-        # Get the vector p and the values that are in there by taking the SVD.
-        # Since D is diagonal with the eigenvalues sorted from large to small
-        # on the diagonal, the optimal q in min ||Dq|| is q = [[0]..[1]].
-        # Therefore, p = Vq means p is the last column in V.
-        U, D, V = svd(matrix)
-        p = V[8][:]
-
-        return array([
-            [ p[0], p[1], p[2] ],
-            [ p[3], p[4], p[5] ],
-            [ p[6], p[7], p[8] ]
-        ])
-
-    def pV(self, x, y):
-        image = self.image
-
-        #Get the value of a point (interpolated x, y) in the given image
-        if image.in_bounds(x, y):
-            x_low  = floor(x)
-            x_high = floor(x + 1)
-            y_low  = floor(y)
-            y_high = floor(y + 1)
-            x_y    = (x_high - x_low) * (y_high - y_low)
-
-            a = x_high - x
-            b = y_high - y
-            c = x - x_low
-            d = y - y_low
-
-            return image[x_low,  y_low] / x_y * a * b \
-                + image[x_high,  y_low] / x_y * c * b \
-                + image[x_low , y_high] / x_y * a * d \
-                + image[x_high, y_high] / x_y * c * d
-
-        return 0
-
-    def read_xml(self, filename):
-        dom = parse('../images/Infos/%s.info' % filename)
-        self.characters = []
-        
-        version = dom.getElementsByTagName("current-version")[0].firstChild.data
-        info    = dom.getElementsByTagName("info")
-        
-        for i in info:
-            if version == i.getElementsByTagName("version")[0].firstChild.data:
-
-                self.country = i.getElementsByTagName("identification-letters")[0].firstChild.data
-                
-                
-                temp = i.getElementsByTagName("characters")
-                
-                if len(temp):
-                  characters = temp[0].childNodes
-                else:
-                  self.characters = []
-                  break
-                
-                for character in characters:
-                    if character.nodeName == "character":
-                        value   = character.getElementsByTagName("char")[0].firstChild.data
-                        corners = self.get_corners(character)
-                        
-                        if not len(corners) == 4:
-                          break
-                        
-                        data    = self.retrieve_data(corners)
-                        image   = NormalizedCharacterImage(data=data)
-
-                        self.characters.append(Character(value, corners, image, filename))
-                
-                break
-
-    def get_corners(self, dom):
-      nodes = dom.getElementsByTagName("point")
-
-      corners = []
-
-      for node in nodes:
-          corners.append(Point(node))
-
-      return corners
+    def __init__(self, country=None, characters=None):
+        self.country = country
+        self.characters = characters

+ 3 - 7
src/Point.py

@@ -1,11 +1,7 @@
 class Point:
-    def __init__(self, x_or_corner=None, y=None):
-        if y != None:
-            self.x = x_or_corner
-            self.y = y
-        else:
-            self.x = int(x_or_corner.getAttribute("x"))
-            self.y = int(x_or_corner.getAttribute("y"))
+    def __init__(self, x, y):
+        self.x = x
+        self.y = y
 
     def to_tuple(self):
         return self.x, self.y

+ 158 - 0
src/xml_helper_functions.py

@@ -0,0 +1,158 @@
+from os import mkdir
+from os.path import exists
+from math import acos
+from pylab import imsave, array, zeros, inv, dot, norm, svd, floor
+from xml.dom.minidom import parse
+from Point import Point
+from Character import Character
+from GrayscaleImage import GrayscaleImage
+from NormalizedCharacterImage import NormalizedCharacterImage
+from LicensePlate import LicensePlate
+
+# sets the entire license plate of an image
+def retrieve_data(image, corners):
+    x0, y0 = corners[0].to_tuple()
+    x1, y1 = corners[1].to_tuple()
+    x2, y2 = corners[2].to_tuple()
+    x3, y3 = corners[3].to_tuple()
+
+    M = int(1.2 * (max(x0, x1, x2, x3) - min(x0, x1, x2, x3)))
+    N = max(y0, y1, y2, y3) - min(y0, y1, y2, y3)
+
+    matrix = array([
+      [x0, y0, 1,  0,  0, 0,       0,       0,  0],
+      [ 0,  0, 0, x0, y0, 1,       0,       0,  0],
+      [x1, y1, 1,  0,  0, 0, -M * x0, -M * y1, -M],
+      [ 0,  0, 0, x1, y1, 1,       0,       0,  0],
+      [x2, y2, 1,  0,  0, 0, -M * x2, -M * y2, -M],
+      [ 0,  0, 0, x2, y2, 1, -N * x2, -N * y2, -N],
+      [x3, y3, 1,  0,  0, 0,       0,       0,  0],
+      [ 0,  0, 0, x3, y3, 1, -N * x3, -N * y3, -N]
+    ])
+
+    P = inv(get_transformation_matrix(matrix))
+    data = array([zeros(M, float)] * N)
+
+    for i in range(M):
+        for j in range(N):
+            or_coor   = dot(P, ([[i],[j],[1]]))
+            or_coor_h = (or_coor[1][0] / or_coor[2][0],
+                         or_coor[0][0] / or_coor[2][0])
+
+            data[j][i] = pV(image, or_coor_h[0], or_coor_h[1])
+
+    return data
+
+def get_transformation_matrix(matrix):
+    # Get the vector p and the values that are in there by taking the SVD.
+    # Since D is diagonal with the eigenvalues sorted from large to small
+    # on the diagonal, the optimal q in min ||Dq|| is q = [[0]..[1]].
+    # Therefore, p = Vq means p is the last column in V.
+    U, D, V = svd(matrix)
+    p = V[8][:]
+
+    return array([
+        [ p[0], p[1], p[2] ],
+        [ p[3], p[4], p[5] ],
+        [ p[6], p[7], p[8] ]
+    ])
+
+def pV(image, x, y):
+    #Get the value of a point (interpolated x, y) in the given image
+    if image.in_bounds(x, y):
+        x_low  = floor(x)
+        x_high = floor(x + 1)
+        y_low  = floor(y)
+        y_high = floor(y + 1)
+        x_y    = (x_high - x_low) * (y_high - y_low)
+
+        a = x_high - x
+        b = y_high - y
+        c = x - x_low
+        d = y - y_low
+
+        return image[x_low,  y_low] / x_y * a * b \
+            + image[x_high,  y_low] / x_y * c * b \
+            + image[x_low , y_high] / x_y * a * d \
+            + image[x_high, y_high] / x_y * c * d
+
+    return 0
+
+def xml_to_LicensePlate(filename, save_character=None):
+    image = GrayscaleImage('../images/Images/%s.jpg' % filename)
+    dom   = parse('../images/Infos/%s.info' % filename)
+    result_characters = []
+
+    version = dom.getElementsByTagName("current-version")[0].firstChild.data
+    info    = dom.getElementsByTagName("info")
+
+    for i in info:
+        if version == i.getElementsByTagName("version")[0].firstChild.data:
+
+            country = i.getElementsByTagName("identification-letters")[0].firstChild.data
+            temp = i.getElementsByTagName("characters")
+
+            if len(temp):
+              characters = temp[0].childNodes
+            else:
+              characters = []
+              break
+
+            for i, character in enumerate(characters):
+                if character.nodeName == "character":
+                    value   = character.getElementsByTagName("char")[0].firstChild.data
+                    corners = get_corners(character)
+
+                    if not len(corners) == 4:
+                      break
+
+                    character_data  = retrieve_data(image, corners)
+                    character_image = NormalizedCharacterImage(data=character_data)
+
+                    result_characters.append(Character(value, corners, character_image, filename))
+                
+                    if save_character:
+                        single_character = GrayscaleImage(data=character_data)
+
+                        path = "../images/LearningSet/%s" % value
+                        image_path = "%s/%d_%s.jpg" % (path, i, filename.split('/')[-1])
+
+                        if not exists(path):
+                          mkdir(path)
+
+                        if not exists(image_path):
+                          single_character.save(image_path)
+
+    return LicensePlate(country, result_characters)
+
+def get_corners(dom):
+  nodes = dom.getElementsByTagName("point")
+  corners = []
+
+  margin_y = 3
+  margin_x = 2
+
+  corners.append(
+    Point(get_coord(nodes[0], "x") - margin_x, 
+          get_coord(nodes[0], "y") - margin_y)
+  )
+
+  corners.append(
+    Point(get_coord(nodes[1], "x") + margin_x, 
+          get_coord(nodes[1], "y") - margin_y)
+  )
+
+  corners.append(
+    Point(get_coord(nodes[2], "x") + margin_x, 
+          get_coord(nodes[2], "y") + margin_y)
+  )
+
+  corners.append(
+    Point(get_coord(nodes[3], "x") - margin_x, 
+          get_coord(nodes[3], "y") + margin_y)
+  )
+
+  return corners
+
+def get_coord(node, attribute):
+  return int(node.getAttribute(attribute))
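The bilinear interpolation performed by `pV()` above can be illustrated with a standalone sketch. It mirrors the same weighting scheme but uses plain nested lists with `[row][column]` indexing instead of the project's `GrayscaleImage`, and omits the bounds check, so it is an assumption-laden simplification rather than a drop-in replacement.

```python
from math import floor

def bilinear(image, x, y):
    """Value at fractional (x, y): weight the four surrounding pixels
    by their distance to the point (closer pixels weigh more)."""
    x_low, y_low = int(floor(x)), int(floor(y))
    x_high, y_high = x_low + 1, y_low + 1

    a = x_high - x  # weight toward the low x corner
    b = y_high - y  # weight toward the low y corner
    c = x - x_low   # weight toward the high x corner
    d = y - y_low   # weight toward the high y corner

    return (image[y_low][x_low] * a * b
          + image[y_low][x_high] * c * b
          + image[y_high][x_low] * a * d
          + image[y_high][x_high] * c * d)
```

At integer coordinates the weights collapse to a single pixel, and at a cell centre the result is the average of the four neighbours, which makes the scheme easy to sanity-check.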