Commit ebade5d3 authored by Richard Torenvliet's avatar Richard Torenvliet

Merge branch 'master' of github.com:taddeus/licenseplates

parents 3ae07a64 8473cbe5
Summary of project:
This code is an implementation of a classifier for License Plate
Recognition, using Local Binary Patterns as features for a Support Vector
Machine.
A number of scripts are provided to execute tests with this code, and to
see how well the code performs, both when considering accuracy and speed.
There is also a script that automates the search for proper parameters for
the SVM.
In the docs folder, a report can be found with a more extensive description
of the theory, the implementation and the results.
The images folder contains a sorted dataset of characters, cut out from
real life images of license plates, provided by Parkingware Schiphol.
Authors:
Taddeüs Kroes
Jayke Meijer
Fabiën Tesselaar
Richard Torenvliet
Gijs van der Voort.
Date:
December 2011
Dependencies:
matplotlib
numpy
scipy
python-libsvm
This diff is collapsed.
*.dat *.dat
data/*
results*.txt results*.txt
from Rectangle import Rectangle
class LetterCropper:
    """Crops a grayscale image down to the bounding box of its dark
    (letter) pixels."""

    def __init__(self, threshold = 0.9):
        # Pixels with a value below this threshold count as letter pixels.
        self.threshold = threshold

    def crop_to_letter(self, image):
        """Crop the given image in place to the bounds of the letter."""
        self.image = image
        self.determine_letter_bounds()
        self.image.crop(self.letter_bounds)

    def determine_letter_bounds(self):
        """Scan every pixel and store the bounding box of all pixels darker
        than the threshold in self.letter_bounds."""
        # Start from an inverted (empty) box and grow it pixel by pixel.
        left = self.image.width
        top = self.image.height
        right = 0
        bottom = 0

        for y, x, value in self.image:
            if value < self.threshold:
                left = min(left, x)
                top = min(top, y)
                right = max(right, x)
                bottom = max(bottom, y)

        self.letter_bounds = Rectangle(left, top, right - left, bottom - top)
from copy import deepcopy from copy import deepcopy
from GrayscaleImage import GrayscaleImage from GrayscaleImage import GrayscaleImage
from LetterCropper import LetterCropper
from GaussianFilter import GaussianFilter from GaussianFilter import GaussianFilter
class NormalizedCharacterImage(GrayscaleImage): class NormalizedCharacterImage(GrayscaleImage):
...@@ -16,9 +15,6 @@ class NormalizedCharacterImage(GrayscaleImage): ...@@ -16,9 +15,6 @@ class NormalizedCharacterImage(GrayscaleImage):
self.increase_contrast() self.increase_contrast()
#self.crop_threshold = crop_threshold
#self.crop_to_letter()
self.height = height self.height = height
self.resize() self.resize()
...@@ -29,10 +25,6 @@ class NormalizedCharacterImage(GrayscaleImage): ...@@ -29,10 +25,6 @@ class NormalizedCharacterImage(GrayscaleImage):
def gaussian_filter(self): def gaussian_filter(self):
GaussianFilter(self.blur).filter(self) GaussianFilter(self.blur).filter(self)
def crop_to_letter(self):
cropper = LetterCropper(0.9)
cropper.crop_to_letter(self)
def resize(self): def resize(self):
"""Resize the image to a fixed height.""" """Resize the image to a fixed height."""
if self.height == None: if self.height == None:
......
#!/usr/bin/python
from os import listdir
from GrayscaleImage import GrayscaleImage
from NormalizedCharacterImage import NormalizedCharacterImage
from Character import Character
from data import IMAGES_FOLDER, exists, fload, fdump
NORMALIZED_HEIGHT = 42
def load_characters(neighbours, blur_scale, verbose=0):
chars_file = 'characters_%s_%s.dat' % (blur_scale, neighbours)
if exists(chars_file):
print 'Loading characters...'
chars = fload(chars_file)
else:
print 'Going to generate character objects...'
chars = []
for char in sorted(listdir(IMAGES_FOLDER)):
count = 0
for image in sorted(listdir(IMAGES_FOLDER + char)):
image = GrayscaleImage(IMAGES_FOLDER + char + '/' + image)
norm = NormalizedCharacterImage(image, blur=blur_scale, \
height=NORMALIZED_HEIGHT)
character = Character(char, [], norm)
character.get_single_cell_feature_vector(neighbours)
chars.append(character)
count += 1
if verbose:
print 'Loaded character %s %d times' % (char, count)
if verbose:
print 'Saving characters...'
fdump(chars, chars_file)
return chars
def load_learning_set(neighbours, blur_scale, verbose=0):
learning_set_file = 'learning_set_%s_%s.dat' % (blur_scale, neighbours)
if exists(learning_set_file):
if verbose:
print 'Loading learning set...'
learning_set = fload(learning_set_file)
if verbose:
print 'Learning set:', [c.value for c in learning_set]
else:
learning_set = generate_sets(neighbours, blur_scale, \
verbose=verbose)[0]
return learning_set
def load_test_set(neighbours, blur_scale, verbose=0):
test_set_file = 'test_set_%s_%s.dat' % (blur_scale, neighbours)
if exists(test_set_file):
if verbose:
print 'Loading test set...'
test_set = fload(test_set_file)
if verbose:
print 'Test set:', [c.value for c in test_set]
else:
test_set = generate_sets(neighbours, blur_scale, verbose=verbose)[1]
return test_set
def generate_sets(neighbours, blur_scale, verbose=0):
suffix = '_%s_%s' % (blur_scale, neighbours)
learning_set_file = 'learning_set%s.dat' % suffix
test_set_file = 'test_set%s.dat' % suffix
chars = load_characters(neighbours, blur_scale, verbose=verbose)
if verbose:
print 'Going to generate learning set and test set...'
learning_set = []
test_set = []
learned = []
for char in chars:
if learned.count(char.value) == 70:
test_set.append(char)
else:
learning_set.append(char)
learned.append(char.value)
if verbose:
print 'Learning set:', [c.value for c in learning_set]
print '\nTest set:', [c.value for c in test_set]
print '\nSaving learning set...'
fdump(learning_set, learning_set_file)
if verbose:
print 'Saving test set...'
fdump(test_set, test_set_file)
return learning_set, test_set
if __name__ == '__main__':
from sys import argv, exit
if len(argv) < 3:
print 'Usage: python %s NEIGHBOURS BLUR_SCALE' % argv[0]
exit(1)
neighbours = int(argv[1])
blur_scale = float(argv[2])
# Generate the character file and the learning set/test set files
load_learning_set(neighbours, blur_scale, verbose=1)
load_test_set(neighbours, blur_scale, verbose=1)
#!/usr/bin/python
from Classifier import Classifier
from create_characters import load_learning_set
from data import exists, DATA_FOLDER
def load_classifier(neighbours, blur_scale, c=None, gamma=None, verbose=0):
classifier_file = DATA_FOLDER + 'classifier_%s_%s.dat' \
% (blur_scale, neighbours)
if exists(classifier_file):
if verbose:
print 'Loading classifier...'
classifier = Classifier(filename=classifier_file, verbose=verbose)
classifier.neighbours = neighbours
elif c != None and gamma != None:
if verbose:
print 'Training new classifier...'
classifier = Classifier(c=c, gamma=gamma, neighbours=neighbours, \
verbose=verbose)
learning_set = load_learning_set(neighbours, blur_scale, \
verbose=verbose)
classifier.train(learning_set)
else:
raise Exception('No soft margin and gamma specified.')
return classifier
if __name__ == '__main__':
from sys import argv, exit
if len(argv) < 3:
print 'Usage: python %s NEIGHBOURS BLUR_SCALE [ C GAMMA ]' % argv[0]
exit(1)
neighbours = int(argv[1])
blur_scale = float(argv[2])
# Generate the classifier file
if len(argv) > 4:
c = float(argv[3])
gamma = float(argv[4])
load_classifier(neighbours, blur_scale, c=c, gamma=gamma, verbose=1)
else:
load_classifier(neighbours, blur_scale, verbose=1)
import os
from cPickle import load, dump
# Folder holding the cached .dat files (relative to the working directory).
DATA_FOLDER = 'data/'
# Folder with the learning set images, one subfolder per character value.
IMAGES_FOLDER = '../images/LearningSet/'
# Folder in which the result text files are written.
RESULTS_FOLDER = 'results/'
def assert_data_folder_exists():
    """Create the data folder if it does not exist yet."""
    # Attempt the mkdir directly instead of checking first: the original
    # exists-then-mkdir pair raced when two processes started at once.
    try:
        os.mkdir(DATA_FOLDER)
    except OSError:
        # Only swallow the error when the folder already exists.
        if not os.path.isdir(DATA_FOLDER):
            raise
def exists(filename):
    """Return whether the given file exists inside the data folder."""
    path = DATA_FOLDER + filename

    return os.path.exists(path)
def fload(filename):
    """Unpickle and return the object stored in the given data file."""
    # 'rb': pickle data is binary, so text mode ('r') corrupts it on some
    # platforms; the with-statement also closes the file if load() raises.
    with open(DATA_FOLDER + filename, 'rb') as f:
        return load(f)
def fdump(obj, filename):
    """Pickle the given object into the given data file, creating the data
    folder first when necessary."""
    assert_data_folder_exists()

    # 'wb': pickle data is binary ('w+' was text mode and also opened the
    # file for reading, which dump() does not need); the with-statement
    # closes the file even if dump() raises.
    with open(DATA_FOLDER + filename, 'wb') as f:
        dump(obj, f)
#!/usr/bin/python #!/usr/bin/python
from os import listdir import os
from os.path import exists
from cPickle import load, dump
from sys import argv, exit from sys import argv, exit
from GrayscaleImage import GrayscaleImage
from NormalizedCharacterImage import NormalizedCharacterImage
from Character import Character
from Classifier import Classifier from Classifier import Classifier
from data import DATA_FOLDER, RESULTS_FOLDER
from create_characters import load_learning_set, load_test_set
if len(argv) < 3: if len(argv) < 3:
print 'Usage: python %s NEIGHBOURS BLUR_SCALE' % argv[0] print 'Usage: python %s NEIGHBOURS BLUR_SCALE' % argv[0]
...@@ -17,64 +14,15 @@ neighbours = int(argv[1]) ...@@ -17,64 +14,15 @@ neighbours = int(argv[1])
blur_scale = float(argv[2]) blur_scale = float(argv[2])
suffix = '_%s_%s' % (blur_scale, neighbours) suffix = '_%s_%s' % (blur_scale, neighbours)
chars_file = 'characters%s.dat' % suffix if not os.path.exists(RESULTS_FOLDER):
learning_set_file = 'learning_set%s.dat' % suffix os.mkdir(RESULTS_FOLDER)
test_set_file = 'test_set%s.dat' % suffix
classifier_file = 'classifier%s.dat' % suffix
results_file = 'results%s.txt' % suffix
# Load characters
if exists(chars_file):
print 'Loading characters...'
chars = load(open(chars_file, 'r'))
else:
print 'Going to generate character objects...'
chars = []
for char in sorted(listdir('../images/LearningSet')):
for image in sorted(listdir('../images/LearningSet/' + char)):
f = '../images/LearningSet/' + char + '/' + image
image = GrayscaleImage(f)
norm = NormalizedCharacterImage(image, blur=blur_scale, height=42)
#imshow(norm.data, cmap='gray'); show()
character = Character(char, [], norm)
character.get_single_cell_feature_vector(neighbours)
chars.append(character)
print char
print 'Saving characters...'
dump(chars, open(chars_file, 'w+'))
classifier_file = DATA_FOLDER + 'classifier%s.dat' % suffix
results_file = '%sresult%s.txt' % (RESULTS_FOLDER, suffix)
# Load learning set and test set # Load learning set and test set
if exists(learning_set_file): learning_set = load_learning_set(neighbours, blur_scale, verbose=1)
print 'Loading learning set...' test_set = load_test_set(neighbours, blur_scale, verbose=1)
learning_set = load(open(learning_set_file, 'r'))
print 'Learning set:', [c.value for c in learning_set]
print 'Loading test set...'
test_set = load(open(test_set_file, 'r'))
print 'Test set:', [c.value for c in test_set]
else:
print 'Going to generate learning set and test set...'
learning_set = []
test_set = []
learned = []
for char in chars:
if learned.count(char.value) == 70:
test_set.append(char)
else:
learning_set.append(char)
learned.append(char.value)
print 'Learning set:', [c.value for c in learning_set]
print '\nTest set:', [c.value for c in test_set]
print '\nSaving learning set...'
dump(learning_set, file(learning_set_file, 'w+'))
print 'Saving test set...'
dump(test_set, file(test_set_file, 'w+'))
# Perform a grid-search to find the optimal values for C and gamma # Perform a grid-search to find the optimal values for C and gamma
C = [float(2 ** p) for p in xrange(-5, 16, 2)] C = [float(2 ** p) for p in xrange(-5, 16, 2)]
......
#!/usr/bin/python
from os import listdir
from cPickle import dump
from sys import argv, exit
from GrayscaleImage import GrayscaleImage
from NormalizedCharacterImage import NormalizedCharacterImage
from Character import Character
if len(argv) < 4:
print 'Usage: python %s FILE_SUFFIX BLUR_SCALE NEIGHBOURS' % argv[0]
exit(1)
c = []
for char in sorted(listdir('../images/LearningSet')):
for image in sorted(listdir('../images/LearningSet/' + char)):
f = '../images/LearningSet/' + char + '/' + image
image = GrayscaleImage(f)
norm = NormalizedCharacterImage(image, blur=float(argv[2]), height=42)
#from pylab import imshow, show
#imshow(norm.data, cmap='gray'); show()
character = Character(char, [], norm)
character.get_single_cell_feature_vector(int(argv[3]))
c.append(character)
print char
print 'Saving characters...'
dump(c, open('characters%s.dat' % argv[1], 'w+'))
#!/usr/bin/python
from cPickle import dump, load
from sys import argv, exit
if len(argv) < 2:
print 'Usage: python %s FILE_SUFFIX' % argv[0]
exit(1)
print 'Loading characters...'
chars = load(file('characters%s.dat' % argv[1], 'r'))
learning_set = []
test_set = []
#s = {}
#
#for char in chars:
# if char.value not in s:
# s[char.value] = [char]
# else:
# s[char.value].append(char)
#
#for value, chars in s.iteritems():
# learning_set += chars[::2]
# test_set += chars[1::2]
learned = []
for char in chars:
if learned.count(char.value) == 70:
test_set.append(char)
else:
learning_set.append(char)
learned.append(char.value)
print 'Learning set:', [c.value for c in learning_set]
print '\nTest set:', [c.value for c in test_set]
print '\nSaving learning set...'
dump(learning_set, file('learning_set%s.dat' % argv[1], 'w+'))
print 'Saving test set...'
dump(test_set, file('test_set%s.dat' % argv[1], 'w+'))
#!/usr/bin/python #!/usr/bin/python
from cPickle import load
from sys import argv, exit from sys import argv, exit
from pylab import imsave, plot, subplot, imshow, show, axis, title from pylab import subplot, imshow, show, axis, title
from math import sqrt, ceil from math import sqrt, ceil
import os
from Classifier import Classifier from create_characters import load_test_set
from create_classifier import load_classifier
if len(argv) < 3: if len(argv) < 3:
print 'Usage: python %s NEIGHBOURS BLUR_SCALE' % argv[0] print 'Usage: python %s NEIGHBOURS BLUR_SCALE [ C GAMMA ]' % argv[0]
exit(1) exit(1)
neighbours = int(argv[1]) neighbours = int(argv[1])
blur_scale = float(argv[2]) blur_scale = float(argv[2])
suffix = '_%s_%s' % (blur_scale, neighbours)
test_set_file = 'test_set%s.dat' % suffix # Load classifier
classifier_file = 'classifier%s.dat' % suffix if len(argv) > 4:
c = float(argv[3])
gamma = float(argv[4])
classifier = load_classifier(neighbours, blur_scale, c=c, gamma=gamma, \
verbose=1)
else:
classifier = load_classifier(neighbours, blur_scale, verbose=1)
print 'Loading classifier...' # Load test set
classifier = Classifier(filename=classifier_file) test_set = load_test_set(neighbours, blur_scale, verbose=1)
classifier.neighbours = neighbours
print 'Loading test set...' # Classify each character in the test set, remembering all faulty
test_set = load(file(test_set_file, 'r')) # classified characters
l = len(test_set) l = len(test_set)
matches = 0 matches = 0
#classified = {}
classified = [] classified = []
for i, char in enumerate(test_set): for i, char in enumerate(test_set):
...@@ -35,13 +37,6 @@ for i, char in enumerate(test_set): ...@@ -35,13 +37,6 @@ for i, char in enumerate(test_set):
if char.value != prediction: if char.value != prediction:
classified.append((char, prediction)) classified.append((char, prediction))
#key = '%s_as_%s' % (char.value, prediction)
#if key not in classified:
# classified[key] = [char]
#else:
# classified[key].append(char)
print '"%s" was classified as "%s"' \ print '"%s" was classified as "%s"' \
% (char.value, prediction) % (char.value, prediction)
else: else:
...@@ -50,8 +45,7 @@ for i, char in enumerate(test_set): ...@@ -50,8 +45,7 @@ for i, char in enumerate(test_set):
print '%d of %d (%d%% done)' % (i + 1, l, round(100 * (i + 1) / l)) print '%d of %d (%d%% done)' % (i + 1, l, round(100 * (i + 1) / l))
print '\n%d matches (%d%%), %d fails' % (matches, \ print '\n%d matches (%d%%), %d fails' % (matches, \
round(100 * matches / l), \ round(100 * matches / l), len(test_set) - matches)
len(test_set) - matches)
# Show a grid plot of all faulty classified characters # Show a grid plot of all faulty classified characters
print 'Plotting faulty classified characters...' print 'Plotting faulty classified characters...'
...@@ -66,16 +60,3 @@ for i, pair in enumerate(classified): ...@@ -66,16 +60,3 @@ for i, pair in enumerate(classified):
axis('off') axis('off')
show() show()
#print 'Saving faulty classified characters...'
#folder = '../images/faulty/'
#
#if not os.path.exists(folder):
# os.mkdir(folder)
#
#for filename, chars in classified.iteritems():
# if len(chars) == 1:
# imsave('%s%s' % (folder, filename), char.image.data, cmap='gray')
# else:
# for i, char in enumerate(chars):
# imsave('%s%s_%d' % (folder, filename, i), char.image.data, cmap='gray')
#!/usr/bin/python
from pylab import subplot, show, imshow, axis
from cPickle import load
# Show the first x * y characters from the characters file in a grid plot.
# NOTE(review): the index arithmetic below fills the grid column by column
# and is only a valid subplot index because x == y.
x, y = 25, 25

# 'rb': pickle data is binary; open() replaces the deprecated file().
f = open('characters.dat', 'rb')
chars = load(f)[:(x * y)]
f.close()

for i in range(x):
    for j in range(y):
        index = j * x + i
        subplot(x, y, index + 1)
        axis('off')
        imshow(chars[index].image.data, cmap='gray')

show()
#!/usr/bin/python
from matplotlib.pyplot import imshow, subplot, show
from LocalBinaryPatternizer import LocalBinaryPatternizer
from cPickle import load
from numpy import zeros
# Compare two character images using Local Binary Pattern histograms: one
# whole-image histogram intersection and a per-cell intersection map that
# is rendered as a grayscale "match" image.
chars = load(file('characters.dat', 'r'))[::2]

left = None
right = None

# Group the characters by value so specific examples can be picked below.
s = {}

for char in chars:
    if char.value not in s:
        s[char.value] = [char]
    else:
        s[char.value].append(char)

# Compare the third 'F' against the first 'A'.
left = s['F'][2].image
right = s['A'][0].image

# LBP cell size in pixels (presumably; passed to LocalBinaryPatternizer
# and used as the per-cell pixel extent below — TODO confirm).
size = 12
d = (left.size[0] * 4, left.size[1] * 4)
#GrayscaleImage.resize(left, d)
#GrayscaleImage.resize(right, d)

p1 = LocalBinaryPatternizer(left, size)
h1 = p1.get_single_histogram()
p1.create_features_vector()
# p1 is rebound here: patternizer -> its features, indexed as p1[y][x].
p1 = p1.features

p2 = LocalBinaryPatternizer(right, size)
h2 = p2.get_single_histogram()
p2.create_features_vector()
p2 = p2.features

# Intersection of the two whole-image histograms.
total_intersect = h1.intersect(h2)

# s is rebound here: grouping dict -> (rows, cols) of the cell grid.
s = (len(p1), len(p1[0]))
match = zeros(left.shape)
m = 0

for y in range(s[0]):
    for x in range(s[1]):
        # h1/h2 are rebound to the per-cell histograms of this grid cell.
        h1 = p1[y][x]
        h2 = p2[y][x]
        intersect = h1.intersect(h2)
        print intersect
        # Paint the cell's pixels with its dissimilarity (1 - intersection)
        # so that poorly matching cells show up bright in the match image.
        for i in xrange(size):
            for j in xrange(size):
                try:
                    match[y*size + i, x*size + j] = 1 - intersect
                except IndexError:
                    # Border cells may extend past the image; skip those
                    # pixels.
                    pass
        # Accumulate the total per-cell intersection for the match score.
        m += intersect

print 'Match: %d%%' % int(m / (s[0] * s[1]) * 100)
print 'Single histogram instersection: %d%%' % int(total_intersect * 100)

# Show the left image, the match image and the right image stacked.
subplot(311)
imshow(left.data, cmap='gray')
subplot(312)
imshow(match, cmap='gray')
subplot(313)
imshow(right.data, cmap='gray')
show()
#!/usr/bin/python
from GaussianFilter import GaussianFilter
from GrayscaleImage import GrayscaleImage
# Demonstrate the Gaussian filter on a sample license plate image.
image = GrayscaleImage('../images/plate.png')

# Named gaussian_filter instead of "filter" to avoid shadowing the builtin.
gaussian_filter = GaussianFilter(1.4)
output_image = gaussian_filter.get_filtered_copy(image)
output_image.show()
#!/usr/bin/python
from LetterCropper import LetterCropper
from GrayscaleImage import GrayscaleImage
# Demonstrate cropping an image down to the letter it contains.
image = GrayscaleImage("../images/test.png")

# Match the current LetterCropper API: the constructor takes a darkness
# threshold (not an image), and crop_to_letter() crops the given image in
# place. The original called LetterCropper(image) and a non-existent
# get_cropped_letter(), which no longer matches the class.
cropper = LetterCropper(0.9)
cropper.crop_to_letter(image)
image.show()
...@@ -6,7 +6,8 @@ from time import time ...@@ -6,7 +6,8 @@ from time import time
from GrayscaleImage import GrayscaleImage from GrayscaleImage import GrayscaleImage
from NormalizedCharacterImage import NormalizedCharacterImage from NormalizedCharacterImage import NormalizedCharacterImage
from Character import Character from Character import Character
from Classifier import Classifier from data import IMAGES_FOLDER
from create_classifier import load_classifier
if len(argv) < 4: if len(argv) < 4:
print 'Usage: python %s NEIGHBOURS BLUR_SCALE COUNT' % argv[0] print 'Usage: python %s NEIGHBOURS BLUR_SCALE COUNT' % argv[0]
...@@ -15,28 +16,15 @@ if len(argv) < 4: ...@@ -15,28 +16,15 @@ if len(argv) < 4:
neighbours = int(argv[1]) neighbours = int(argv[1])
blur_scale = float(argv[2]) blur_scale = float(argv[2])
count = int(argv[3]) count = int(argv[3])
suffix = '_%s_%s' % (blur_scale, neighbours)
#chars_file = 'characters%s.dat' % suffix
classifier_file = 'classifier%s.dat' % suffix
#print 'Loading characters...'
#chars = load(open(chars_file, 'r'))[:count]
#count = len(chars)
#
#for char in chars:
# del char.feature
#
#print 'Read %d characters' % count
print 'Loading %d characters...' % count print 'Loading %d characters...' % count
chars = [] chars = []
i = 0 i = 0
br = False br = False
for value in sorted(listdir('../images/LearningSet')): for value in sorted(listdir()):
for image in sorted(listdir('../images/LearningSet/' + value)): for image in sorted(listdir(IMAGES_FOLDER + value)):
f = '../images/LearningSet/' + value + '/' + image f = IMAGES_FOLDER + value + '/' + image
image = GrayscaleImage(f) image = GrayscaleImage(f)
char = Character(value, [], image) char = Character(value, [], image)
chars.append(char) chars.append(char)
...@@ -49,10 +37,10 @@ for value in sorted(listdir('../images/LearningSet')): ...@@ -49,10 +37,10 @@ for value in sorted(listdir('../images/LearningSet')):
if br: if br:
break break
print 'Loading classifier...' # Load classifier
classifier = Classifier(filename=classifier_file) classifier = load_classifier(neighbours, blur_scale, verbose=1)
classifier.neighbours = neighbours
# Measure the time it takes to recognize <count> characters
start = time() start = time()
for char in chars: for char in chars:
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment