Browse files

Merge branch 'master' of github.com:taddeus/licenseplates

Richard Torenvliet 14 years ago
parent
commit
e976d336d5

BIN
docs/12-5neighbourhood.png


+ 76 - 13
docs/report.tex

@@ -226,10 +226,9 @@ reader will only get results from this version.
 Now we are only interested in the individual characters so we can skip the
 location of the entire license plate. Each character has 
 a single character value, indicating what someone thought the letter or
-digit was and four coordinates to create a bounding box. To make things not to
-complicated a Character class and Point class are used. They
-act pretty much as associative lists, but it gives extra freedom on using the
-data. If less then four points have been set the character will not be saved.
+digit was and four coordinates to create a bounding box. If fewer than four
+points have been set, the character will not be saved. Otherwise, to keep
+things simple, a Character class is used. It acts as an associative list,
+but gives some extra freedom when using the data.
 
 When four points have been gathered the data from the actual image is being
 requested. For each corner a small margin is added (around 3 pixels) so that no
@@ -351,7 +350,7 @@ value, and what value we decided on.
 
 The first parameter to decide on, is the $\sigma$ used in the Gaussian blur. To
 find this parameter, we tested a few values, by trying them and checking the
-results. It turned out that the best value was $\sigma = 1.1$.
+results. It turned out that the best value was $\sigma = 1.4$.
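
For illustration, this is how such a blur could be applied with SciPy (an assumption for the sketch; the repository applies its blur inside NormalizedCharacterImage), using a hypothetical image array:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    # Hypothetical grayscale character image (2D array of intensities).
    image_data = np.random.rand(42, 24)

    # sigma = 1.4 is the blur scale settled on above.
    blurred = gaussian_filter(image_data, sigma=1.4)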
 
 \subsection{Parameter \emph{cell size}}
 
@@ -380,7 +379,13 @@ are not significant enough to allow for reliable classification.
 The neighbourhood to use can only be determined through testing. We did a test
 with each of these neighbourhoods, and we found that the best results were
 reached with the following neighbourhood, which we will call the
-()-neighbourhood.
+(12, 5)-neighbourhood, since it has 12 points in an area with a diameter of
+5 pixels.
+
+\begin{figure}[H]
+\centering
+\includegraphics[scale=0.5]{12-5neighbourhood.png}
+\caption{(12,5)-neighbourhood}
+\end{figure}
 
 \subsection{Parameters $\gamma$ \& $c$}
 
@@ -401,8 +406,41 @@ checks for each combination of values what the score is. The combination with
 the highest score is then used as our parameters, and the entire SVM will be
 trained using those parameters.\\
 \\
-We found that the best values for these parameters are $c = ?$ and
-$\gamma = ?$.
+The results of this grid-search are shown in the following table. The values
+in the table are percentages, rounded for readability.
+
+\begin{tabular}{|r|r r r r r r r r r r|}
+\hline
+$c \backslash \gamma$ & $2^{-15}$ & $2^{-13}$ & $2^{-11}$ & $2^{-9}$ & $2^{-7}$ &
+	$2^{-5}$ & $2^{-3}$ & $2^{-1}$ & $2^{1}$ & $2^{3}$\\
+\hline
+$2^{-5}$ &       61 &       61 &       61 &       61 &       62 &
+       63 &       67 &       74 &       59 &       24\\
+$2^{-3}$ &       61 &       61 &       61 &       61 &       62 &
+       63 &       70 &       78 &       60 &       24\\
+$2^{-1}$ &       61 &       61 &       61 &       61 &       62 &
+       70 &       83 &       88 &       78 &       27\\
+ $2^{1}$ &       61 &       61 &       61 &       61 &       70 &
+        84 &       90 &       92 &       86 &       45\\
+ $2^{3}$ &       61 &       61 &       61 &       70 &       84 &
+        90 &       93 &       93 &       86 &       45\\
+ $2^{5}$ &       61 &       61 &       70 &       84 &       90 &
+        92 &       93 &       93 &       86 &       45\\
+ $2^{7}$ &       61 &       70 &       84 &       90 &       92 &
+        93 &       93 &       93 &       86 &       45\\
+ $2^{9}$ &       70 &       84 &       90 &       92 &       92 & 
+       93 &       93 &       93 &       86 &       45\\
+$2^{11}$ &       84 &       90 &       92 &       92 &       92 &
+       92 &       93 &       93 &       86 &       45\\
+$2^{13}$ &       90 &       92 &       92 &       92 &       92 &
+       92 &       93 &       93 &       86 &       45\\
+$2^{15}$ &       92 &       92 &       92 &       92 &       92 &
+       92 &       93 &       93 &       86 &       45\\
+\hline
+\end{tabular}
+
+We found that the best values for these parameters are $c = 32$ and
+$\gamma = 0.125$.
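
As a sketch of how such a grid-search can be reproduced (using scikit-learn's GridSearchCV as a stand-in for the project's own SVM wrapper, over hypothetical feature vectors and labels):

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    # Hypothetical LBP feature vectors and labels; the real ones come from
    # the learning set of character images.
    X = np.random.rand(100, 512)
    y = np.repeat(np.arange(10), 10)

    # The same exponential grid as in the table above.
    param_grid = {
        'C': [2.0 ** e for e in range(-5, 17, 2)],
        'gamma': [2.0 ** e for e in range(-15, 5, 2)],
    }

    search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_)  # e.g. {'C': 32.0, 'gamma': 0.125}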
 
 \section{Results}
 
@@ -418,7 +456,17 @@ measure the time used to classify a license plate, not the training of the
 dataset, since that can be done offline, and speed is not a primary necessity
 there.\\
 \\
-The speed of a classification turned out to be ???.
+The speed of a classification turned out to be reasonably good. We measure
+the time from the moment a character has been 'cut out' of the image, so
+that we have an exact image of a character, to the moment the SVM tells us
+which character it is. This time is on average $65$ ms. That means that this
+technique (tested on an AMD Phenom II X4 955 quad-core CPU running at
+3.2 GHz) can identify 15 characters per second.\\
+\\
+This is not spectacular considering the amount of computing power this CPU
+can offer, but it is still fairly reasonable. Of course, this program is
+written in Python, and is therefore not nearly as optimized as would be
+possible when written in a low-level language.
 
 \subsection{Accuracy}
 
@@ -429,16 +477,31 @@ accuracy score we possibly can.\\
 \footnote{
 \url{http://en.wikipedia.org/wiki/Automatic_number_plate_recognition}},
 commercial license plate recognition software score about $90\%$ to $94\%$,
-under optimal conditions and with modern equipment. Our program scores an
-average of ???.
+under optimal conditions and with modern equipment.\\
+\\
+Our program scores an average of $93\%$. However, this is the score for a
+single character. That means that a full license plate should theoretically
+get a score of $0.93^6 \approx 0.647$, or $64.7\%$. That is not particularly
+good compared to the commercial systems. Our focus, however, was on getting
+good scores per character, and $93\%$ seems to be a fairly good result.\\
+\\
+Possibilities for improving this score include more extensive grid-searches,
+finding more exact values for $c$ and $\gamma$, more tests for finding
+$\sigma$, and more experiments on the size and shape of the neighbourhoods.
 
 \section{Conclusion}
 
 In the end it turns out that using Local Binary Patterns is a promising
-technique for License Plate Recognition. It seems to be relatively unsensitive
+technique for License Plate Recognition. It seems to be relatively insensitive
 to the amount of dirt on license plates and different fonts on these plates.\\
 \\
-The performance speedwise is ???
+Speed-wise, the performance is fairly good when using a fast machine. However,
+the program is written in Python, which means it is not as efficient as it
+could be in a low-level language.\\
+\\
+We believe that with further experimentation and development, LBPs can
+certainly be used as a good license plate recognition method.
 
 \section{Reflection}
 

+ 0 - 1
src/LicensePlate.py

@@ -1,5 +1,4 @@
 class LicensePlate:
-
     def __init__(self, country=None, characters=None):
         self.country = country
         self.characters = characters

+ 10 - 0
src/LocalBinaryPatternizer.py

@@ -32,6 +32,16 @@ class LocalBinaryPatternizer:
              | (self.is_pixel_darker(y + 1, x - 1, value) << 1) \
              | (self.is_pixel_darker(y    , x - 1, value))
 
+    def pattern_5x5_hybrid(self, y, x, value):
+        return (self.is_pixel_darker(y - 2, x - 2, value) << 7) \
+             | (self.is_pixel_darker(y - 2, x    , value) << 6) \
+             | (self.is_pixel_darker(y - 2, x + 2, value) << 5) \
+             | (self.is_pixel_darker(y    , x + 2, value) << 4) \
+             | (self.is_pixel_darker(y + 2, x + 2, value) << 3) \
+             | (self.is_pixel_darker(y + 2, x    , value) << 2) \
+             | (self.is_pixel_darker(y + 2, x - 2, value) << 1) \
+             | (self.is_pixel_darker(y    , x - 2, value))
+
     def pattern_5x5(self, y, x, value):
         return (self.is_pixel_darker(y - 1, x - 2, value) << 11) \
              | (self.is_pixel_darker(y    , x - 2, value) << 10) \
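
For illustration, the same sampling scheme as pattern_5x5_hybrid written as a standalone function (a sketch over a plain 2D array; the real method relies on the class's is_pixel_darker helper and its bounds handling):

    def pattern_5x5_hybrid(image, y, x, value):
        # Sample the corners and edge midpoints of the 5x5 square around
        # (y, x), clockwise from the top-left, into an 8-bit LBP code.
        offsets = [(-2, -2), (-2, 0), (-2, 2), (0, 2),
                   (2, 2), (2, 0), (2, -2), (0, -2)]
        code = 0
        for dy, dx in offsets:
            code = (code << 1) | int(image[y + dy][x + dx] < value)
        return code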

+ 0 - 10
src/Point.py

@@ -1,10 +0,0 @@
-class Point:
-    def __init__(self, x, y):
-        self.x = x
-        self.y = y
-
-    def to_tuple(self):
-        return self.x, self.y
-        
-    def __str__(self):
-        return str(self.x) + " " + str(self.y)

+ 68 - 0
src/test_performance.py

@@ -0,0 +1,68 @@
+#!/usr/bin/python
+from os import listdir
+from cPickle import load
+from sys import argv, exit
+from time import time
+
+from GrayscaleImage import GrayscaleImage
+from NormalizedCharacterImage import NormalizedCharacterImage
+from Character import Character
+from Classifier import Classifier
+
+if len(argv) < 4:
+    print 'Usage: python %s NEIGHBOURS BLUR_SCALE COUNT' % argv[0]
+    exit(1)
+
+neighbours = int(argv[1])
+blur_scale = float(argv[2])
+count = int(argv[3])
+suffix = '_%s_%s' % (blur_scale, neighbours)
+
+#chars_file = 'characters%s.dat' % suffix
+classifier_file = 'classifier%s.dat' % suffix
+
+#print 'Loading characters...'
+#chars = load(open(chars_file, 'r'))[:count]
+#count = len(chars)
+#
+#for char in chars:
+#    del char.feature
+#
+#print 'Read %d characters' % count
+
+print 'Loading %d characters...' % count
+chars = []
+i = 0
+br = False
+
+for value in sorted(listdir('../images/LearningSet')):
+    for image in sorted(listdir('../images/LearningSet/' + value)):
+        f = '../images/LearningSet/' + value + '/' + image
+        image = GrayscaleImage(f)
+        char = Character(value, [], image)
+        chars.append(char)
+        i += 1
+
+        if i == count:
+            br = True
+            break
+
+    if br:
+        break
+
+print 'Loading classifier...'
+classifier = Classifier(filename=classifier_file)
+classifier.neighbours = neighbours
+
+start = time()
+
+for char in chars:
+    # Normalize each character's own image, not the last `image` loaded above
+    char.image = NormalizedCharacterImage(char.image, blur=blur_scale, height=42)
+    char.get_single_cell_feature_vector(neighbours)
+    classifier.classify(char)
+
+elapsed = time() - start
+individual = elapsed / count
+
+print 'Took %fs to classify %d characters (%fms per character)' \
+        % (elapsed, count, individual * 1000)
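
Assuming the classifier trained with the parameters found above has been saved as classifier_1.4_12.dat (the suffix is built from BLUR_SCALE and NEIGHBOURS), a hypothetical run over 100 characters would be:

    python test_performance.py 12 1.4 100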

+ 70 - 91
src/xml_helper_functions.py

@@ -1,22 +1,17 @@
 from os import mkdir
 from os.path import exists
-from math import acos
 from pylab import imsave, array, zeros, inv, dot, norm, svd, floor
 from xml.dom.minidom import parse
-from Point import Point
 from Character import Character
 from GrayscaleImage import GrayscaleImage
 from NormalizedCharacterImage import NormalizedCharacterImage
 from LicensePlate import LicensePlate
 
-# sets the entire license plate of an image
-def retrieve_data(image, corners):
-    x0, y0 = corners[0].to_tuple()
-    x1, y1 = corners[1].to_tuple()
-    x2, y2 = corners[2].to_tuple()
-    x3, y3 = corners[3].to_tuple()
+# Gets the character data from a picture with a license plate
+def retrieve_data(plate, corners):
+    x0,y0, x1,y1, x2,y2, x3,y3 = corners
 
-    M = int(1.2 * (max(x0, x1, x2, x3) - min(x0, x1, x2, x3)))
+    M = max(x0, x1, x2, x3) - min(x0, x1, x2, x3)
     N = max(y0, y1, y2, y3) - min(y0, y1, y2, y3)
 
     matrix = array([
@@ -30,7 +25,7 @@ def retrieve_data(image, corners):
       [ 0,  0, 0, x3, y3, 1, -N * x3, -N * y3, -N]
     ])
 
-    P = inv(get_transformation_matrix(matrix))
+    P = get_transformation_matrix(matrix)
     data = array([zeros(M, float)] * N)
 
     for i in range(M):
@@ -39,7 +34,7 @@ def retrieve_data(image, corners):
             or_coor_h = (or_coor[1][0] / or_coor[2][0],
                          or_coor[0][0] / or_coor[2][0])
 
-            data[j][i] = pV(image, or_coor_h[0], or_coor_h[1])
+            data[j][i] = pV(plate, or_coor_h[0], or_coor_h[1])
 
     return data
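
The loop above is a standard inverse perspective mapping: each pixel (i, j) of the rectified M x N target is projected back into the source plate through the homography P and sampled there with pV. The core of that mapping, as a standalone sketch (note the real code swaps the two components when indexing data):

    import numpy as np

    def map_back(P, i, j):
        # Map rectified pixel (i, j) back into plate coordinates with the
        # 3x3 homography P; the homogeneous component is divided out.
        v = np.dot(P, np.array([i, j, 1.0]))
        return v[0] / v[2], v[1] / v[2]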
 
@@ -51,108 +46,92 @@ def get_transformation_matrix(matrix):
     U, D, V = svd(matrix)
     p = V[8][:]
 
-    return array([
-        [ p[0], p[1], p[2] ],
-        [ p[3], p[4], p[5] ],
-        [ p[6], p[7], p[8] ]
-    ])
+    return inv(array([[p[0],p[1],p[2]], [p[3],p[4],p[5]], [p[6],p[7],p[8]]]))
 
 def pV(image, x, y):
     #Get the value of a point (interpolated x, y) in the given image
-    if image.in_bounds(x, y):
-        x_low  = floor(x)
-        x_high = floor(x + 1)
-        y_low  = floor(y)
-        y_high = floor(y + 1)
-        x_y    = (x_high - x_low) * (y_high - y_low)
+    if not image.in_bounds(x, y):
+        return 0
 
-        a = x_high - x
-        b = y_high - y
-        c = x - x_low
-        d = y - y_low
+    x_low, x_high = floor(x), floor(x+1)
+    y_low, y_high = floor(y), floor(y+1)
+    x_y    = (x_high - x_low) * (y_high - y_low)
 
-        return image[x_low,  y_low] / x_y * a * b \
-            + image[x_high,  y_low] / x_y * c * b \
-            + image[x_low , y_high] / x_y * a * d \
-            + image[x_high, y_high] / x_y * c * d
+    a = x_high - x
+    b = y_high - y
+    c = x - x_low
+    d = y - y_low
 
-    return 0
+    return image[x_low,  y_low] / x_y * a * b \
+        + image[x_high,  y_low] / x_y * c * b \
+        + image[x_low , y_high] / x_y * a * d \
+        + image[x_high, y_high] / x_y * c * d
 
 def xml_to_LicensePlate(filename, save_character=None):
-    image = GrayscaleImage('../images/Images/%s.jpg' % filename)
-    dom   = parse('../images/Infos/%s.info' % filename)
-    result_characters = []
-
-    version = dom.getElementsByTagName("current-version")[0].firstChild.data
-    info    = dom.getElementsByTagName("info")
-
-    for i in info:
-        if version == i.getElementsByTagName("version")[0].firstChild.data:
+    plate   = GrayscaleImage('../images/Images/%s.jpg' % filename)
+    dom     = parse('../images/Infos/%s.info' % filename)
+    country = ''
+    result  = []
+    version = get_node(dom, "current-version")
+    infos   = by_tag(dom, "info")
 
-            country = i.getElementsByTagName("identification-letters")[0].firstChild.data
-            temp = i.getElementsByTagName("characters")
+    for info in infos:
+        if not version == get_node(info, "version"):
+            continue
 
-            if len(temp):
-              characters = temp[0].childNodes
-            else:
-              characters = []
-              break
+        country = get_node(info, "identification-letters")
+        temp    = by_tag(info, "characters")
 
-            for i, character in enumerate(characters):
-                if character.nodeName == "character":
-                    value   = character.getElementsByTagName("char")[0].firstChild.data
-                    corners = get_corners(character)
+        if not temp: # no characters were found in the file
+            break
 
-                    if not len(corners) == 4:
-                      break
+        characters = temp[0].childNodes
 
-                    character_data  = retrieve_data(image, corners)
-                    character_image = NormalizedCharacterImage(data=character_data)
+        for i, char in enumerate(characters):
+            if not char.nodeName == "character":
+                continue
 
-                    result_characters.append(Character(value, corners, character_image, filename))
-                
-                    if save_character:
-                        single_character = GrayscaleImage(data=character_data)
+            value   = get_node(char, "char")
+            corners = get_corners(char)
 
-                        path = "../images/LearningSet/%s" % value
-                        image_path = "%s/%d_%s.jpg" % (path, i, filename.split('/')[-1])
+            if not len(corners) == 8:
+                break
 
-                        if not exists(path):
-                          mkdir(path)
+            data  = retrieve_data(plate, corners)
+            image = NormalizedCharacterImage(data=data)
+            result.append(Character(value, corners, image, filename))
+        
+            if save_character:
+                character_image = GrayscaleImage(data=data)
+                path       = "../images/LearningSet/%s" % value
+                image_path = "%s/%d_%s.jpg" % (path, i, filename.split('/')[-1])
 
-                        if not exists(image_path):
-                          single_character.save(image_path)
+                if not exists(path):
+                    mkdir(path)
 
-    return LicensePlate(country, result_characters)
+                if not exists(image_path):
+                    character_image.save(image_path)
 
-def get_corners(dom):
-  nodes = dom.getElementsByTagName("point")
-  corners = []
-
-  margin_y = 3
-  margin_x = 2
+    return LicensePlate(country, result)
 
-  corners.append(
-    Point(get_coord(nodes[0], "x") - margin_x, 
-          get_coord(nodes[0], "y") - margin_y)
-  )
+def get_node(node, tag):
+    return by_tag(node, tag)[0].firstChild.data
 
-  corners.append(
-    Point(get_coord(nodes[1], "x") + margin_x, 
-          get_coord(nodes[1], "y") - margin_y)
-  )
+def by_tag(node, tag):
+    return node.getElementsByTagName(tag)
 
-  corners.append(
-    Point(get_coord(nodes[2], "x") + margin_x, 
-          get_coord(nodes[2], "y") + margin_y)
-  )
+def get_attr(node, attr):
+    return int(node.getAttribute(attr))
 
-  corners.append(
-    Point(get_coord(nodes[3], "x") - margin_x, 
-          get_coord(nodes[3], "y") + margin_y)
-  )
+def get_corners(dom):
+    p = by_tag(dom, "point")
 
-  return corners
+    # Extra padding
+    y = 3
+    x = 2
 
-def get_coord(node, attribute):
-  return int(node.getAttribute(attribute))
+    # return 8 values (x0,y0, .., x3,y3)
+    return get_attr(p[0], "x") - x, get_attr(p[0], "y") - y,\
+           get_attr(p[1], "x") + x, get_attr(p[1], "y") - y,\
+           get_attr(p[2], "x") + x, get_attr(p[2], "y") + y,\
+           get_attr(p[3], "x") - x, get_attr(p[3], "y") + y
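
A hypothetical round trip through the refactored helpers (the filename 'example' stands in for a real image/info pair under ../images):

    from xml_helper_functions import xml_to_LicensePlate

    # Reads ../images/Images/example.jpg and ../images/Infos/example.info,
    # returning a LicensePlate holding the country code and a Character for
    # each fully-annotated character in the info file.
    plate = xml_to_LicensePlate('example', save_character=True)
    print(plate.country)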