Browse source code

Merge branch 'master' of github.com:taddeus/licenseplates

Richard Torenvliet 14 years ago
parent
commit
292aefd4ef
4 changed files with 39 additions and 23 deletions
  1. + 12 - 0    docs/Makefile
  2. + 22 - 22   docs/report.tex
  3. + 5 - 1     src/LocalBinaryPatternizer.py
  4. + 0 - 0     src/generate_learning_set.py

+ 12 - 0
docs/Makefile

@@ -0,0 +1,12 @@
+RM=rm -rf
+
+all: plan.pdf report.pdf
+
+plan.pdf: plan.tex
+	pdflatex $^
+
+report.pdf: report.tex
+	pdflatex $^
+
+clean:
+	$(RM) *.pdf *.aux *.log *.out *.toc

+ 22 - 22
docs/report.tex

@@ -16,7 +16,7 @@
 
 \section*{Project members}
 Gijs van der Voort\\
-Richard Torenvliet\\
+Raichard Torenvliet\\
 Jayke Meijer\\
 Tadde\"us Kroes\\
 Fabi\"en Tesselaar
@@ -36,7 +36,7 @@ Reading license plates with a computer is much more difficult. Our dataset
 contains photographs of license plates from various angles and distances. This
 means that not only do we have to implement a method to read the actual
 characters, but given the location of the license plate and each individual
-character, we must make sure we transform each character to a standard form. 
+character, we must make sure we transform each character to a standard form.
 
 Determining what character we are looking at will be done by using Local Binary
 Patterns. The main goal of our research is finding out how effective LBP's are
@@ -57,9 +57,9 @@ In short our program must be able to do the following:
 
 The actual purpose of this project is to check if LBP is capable of recognizing
 license plate characters. We knew the LBP implementation would be pretty
-simple. Thus an advantage had to be its speed compared with other license plate 
+simple. Thus an advantage had to be its speed compared with other license plate
 recognition implementations, but the uncertainty of whether we could get some
-results made us pick Python. We felt Python would not restrict us as much in 
+results made us pick Python. We felt Python would not restrict us as much in
 assigning tasks to each member of the group. In addition, when using the
 correct modules to handle images, Python can be decent in speed.
 
@@ -140,7 +140,7 @@ The outcome of this operations will be a binary pattern.
 
 \item Given this pattern, the next step is to divide the pattern in cells. The
 amount of cells depends on the quality of the result, so trial and error is in
-order. Starting with dividing the pattern into cells of size 16. 
+order. Starting with dividing the pattern into cells of size 16.
 
 \item Compute a histogram for each cell.
 
@@ -154,7 +154,7 @@ order. Starting with dividing the pattern in to cells of size 16.
 result is a feature vector of the image.
 
 \item Feed these vectors to a support vector machine. This will ''learn'' which
-vector indicates what vector is which character. 
+vector indicates what vector is which character.
 
 \end{itemize}
 
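To make the cell and histogram steps above concrete, the sketch below splits an LBP pattern image into cells, histograms each cell and concatenates the histograms into one feature vector. It is only an illustration: it works on a plain numpy array and uses numpy.histogram instead of the project's Histogram class, and the name build_feature_vector is invented here.

import numpy as np

def build_feature_vector(pattern, cell_size=16, bins=256):
    """Split an LBP pattern image into cells, histogram each cell and
    concatenate the per-cell histograms into a single feature vector."""
    height, width = pattern.shape
    histograms = []
    for y in range(0, height, cell_size):
        for x in range(0, width, cell_size):
            cell = pattern[y:y + cell_size, x:x + cell_size]
            hist, _ = np.histogram(cell, bins=bins, range=(0, bins))
            histograms.append(hist)
    return np.concatenate(histograms)

Such vectors would then be handed to an SVM for training and classification; which SVM implementation the project uses is not visible in this diff.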
@@ -186,7 +186,7 @@ choices we made.
 \subsection{Character retrieval}
 
 In order to retrieve the characters from the entire image, we need to
-perform a perspective transformation. However, to do this, we need to know the 
+perform a perspective transformation. However, to do this, we need to know the
 coordinates of the four corners of each character. For our dataset, this is
 stored in XML files. So, the first step is to read these XML files.
 
@@ -196,7 +196,7 @@ The XML reader will return a 'license plate' object when given an XML file. The
 licence plate holds a list of, up to six, NormalizedImage characters and from
 which country the plate is from. The reader is currently assuming the XML file
 and image name are corresponding, since this was the case for the given
-dataset. This can easily be adjusted if required. 
+dataset. This can easily be adjusted if required.
 
 To parse the XML file, the minidom module is used. So the XML file can be
 treated as a tree, where one can search for certain nodes. In each XML
@@ -205,7 +205,7 @@ will do is retrieve the current and most up-to-date version of the plate. The
 reader will only get results from this version.
 
 Now we are only interested in the individual characters so we can skip the
-location of the entire license plate. Each character has 
+location of the entire license plate. Each character has
 a single character value, indicating what someone thought the letter or
 digit was, and four coordinates to create a bounding box. If fewer than four
 points have been set the character will not be saved. Else, to make things not
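The XML handling described above could look roughly like the sketch below. The element and attribute names ('character', 'point', 'char', 'x', 'y') are invented for illustration; the dataset's real schema, and the handling of the plate's version history, are not visible in this diff.

from xml.dom import minidom

def read_character_corners(xml_path):
    """Return (character value, corner points) pairs from a plate XML file.
    Characters with fewer than four corner points are skipped, as described
    above."""
    doc = minidom.parse(xml_path)
    characters = []
    for node in doc.getElementsByTagName('character'):
        value = node.getAttribute('char')
        corners = [(int(p.getAttribute('x')), int(p.getAttribute('y')))
                   for p in node.getElementsByTagName('point')]
        if len(corners) < 4:
            continue
        characters.append((value, corners))
    return characters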
@@ -215,7 +215,7 @@ it gives some extra freedom when using the data.
 When four points have been gathered the data from the actual image is being
 requested. For each corner a small margin is added (around 3 pixels) so that no
 features will be lost and minimum amounts of new features will be introduced by
-noise in the margin. 
+noise in the margin.
 
 In the next section you can read more about the perspective transformation that
 is being done. After the transformation the character can be saved: Converted
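The perspective transformation mentioned here can be sketched as follows. Using OpenCV is an assumption (the project's own transform code is not part of this diff), and the output size, corner ordering and 3-pixel margin are illustrative.

import numpy as np
import cv2

def straighten_character(image, corners, out_size=(50, 100), margin=3):
    """Warp the quadrilateral spanned by a character's four corners onto an
    upright rectangle. Corners are assumed to be ordered top-left, top-right,
    bottom-right, bottom-left."""
    (tl, tr, br, bl) = corners
    # Grow the quadrilateral slightly so no character features are cut off.
    src = np.float32([(tl[0] - margin, tl[1] - margin),
                      (tr[0] + margin, tr[1] - margin),
                      (br[0] + margin, br[1] + margin),
                      (bl[0] - margin, bl[1] + margin)])
    width, height = out_size
    dst = np.float32([(0, 0), (width, 0), (width, height), (0, height)])
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, matrix, (width, height))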
@@ -232,12 +232,12 @@ rectangle.
 
 \subsection{Noise reduction}
 
-The image contains a lot of noise, both from camera errors due to dark noise 
-etc., as from dirt on the license plate. In this case, noise therefore means 
+The image contains a lot of noise, both from camera errors due to dark noise
+etc., as from dirt on the license plate. In this case, noise therefore means
 any unwanted difference in color from the surrounding pixels.
 
 \paragraph*{Camera noise and small amounts of dirt}
-The dirt on the license plate can be of different sizes. We can reduce the 
+The dirt on the license plate can be of different sizes. We can reduce the
 smaller amounts of dirt in the same way as we reduce normal noise, by applying
 a Gaussian blur to the image. This is the next step in our program.\\
 \\
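A minimal sketch of this blurring step, assuming scipy is available (the project's actual implementation is not part of this diff):

from scipy import ndimage

def reduce_noise(grayscale, sigma=1.4):
    """Suppress camera noise and small specks of dirt with a Gaussian blur.
    sigma = 1.4 is the value the report settles on later; treat it as a
    tunable parameter."""
    return ndimage.gaussian_filter(grayscale, sigma=sigma)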
@@ -256,7 +256,7 @@ characters will still be conserved in the LBP, even if there is dirt
 surrounding the character.
 
 \subsection{Creating Local Binary Patterns and feature vector}
-Every pixel is a center pixel and it is also a value to evaluate but not at the 
+Every pixel is a center pixel and it is also a value to evaluate but not at the
 same time. Every pixel is evaluated as shown in the explanation
 of the LBP algorithm. There are several neighbourhoods we can evaluate. We have
 tried the following neighbourhoods:
@@ -276,12 +276,12 @@ would add a computational step, thus making the code execute slower. In the
 next section we will describe what the best neighbourhood was.
 
 Take an example where the full square can be evaluated, so none of the
-neighbours are out of bounds. The first to be checked is the pixel in the left 
-bottom corner in the square 3 x 3, with coordinate $(x - 1, y - 1)$ with $g_c$ 
+neighbours are out of bounds. The first to be checked is the pixel in the left
+bottom corner in the square 3 x 3, with coordinate $(x - 1, y - 1)$ with $g_c$
 as center pixel that has coordinates $(x, y)$. If the grayscale value of the
 neighbour in the left corner is greater than the grayscale
 value of the center pixel, then return true. Bit-shift the first bit by 7. The
-outcome is now 10000000. The second neighbour will be bit-shifted by 6, and so 
+outcome is now 10000000. The second neighbour will be bit-shifted by 6, and so
 on. Until we are at 0. The result is a binary pattern of the local point just
 evaluated.
 Now only the edge pixels are a problem, but a simple check if the location of
@@ -348,7 +348,7 @@ scripts is named here and a description is given on what the script does.
 \section{Finding parameters}
 
 Now that we have a functioning system, we need to tune it to work properly for
-license plates. This means we need to find the parameters. Throughout the 
+license plates. This means we need to find the parameters. Throughout the
 program we have a number of parameters for which no standard choice is
 available. These parameters are:\\
 \\
@@ -373,7 +373,7 @@ The first parameter to decide on, is the $\sigma$ used in the Gaussian blur. To
 find this parameter, we tested a few values, by trying them and checking the
 results. It turned out that the best value was $\sigma = 1.4$.\\
 \\
-Theoretically, this can be explained as follows. The filter has width of 
+Theoretically, this can be explained as follows. The filter has width of
 $6 * \sigma = 6 * 1.4 = 8.4$ pixels. The width of a `stroke' in a character is,
 after our resize operations, around 8 pixels. This means, our filter `matches'
 the smallest detail size we want to be able to see, so everything that is
@@ -456,7 +456,7 @@ $2^{-1}$ &       61 &       61 &       61 &       61 &       62 &
         92 &       93 &       93 &       86 &       45\\
  $2^{7}$ &       61 &       70 &       84 &       90 &       92 &
         93 &       93 &       93 &       86 &       45\\
- $2^{9}$ &       70 &       84 &       90 &       92 &       92 & 
+ $2^{9}$ &       70 &       84 &       90 &       92 &       92 &
        93 &       93 &       93 &       86 &       45\\
 $2^{11}$ &       84 &       90 &       92 &       92 &       92 &
        92 &       93 &       93 &       86 &       45\\
@@ -494,7 +494,7 @@ good scores per character, and $93\%$ seems to be a fairly good result.\\
 \\
 Possibilities for improvement of this score would be more extensive
 grid-searches, finding more exact values for $c$ and $\gamma$, more tests
-for finding $\sigma$ and more experiments on the size and shape of the 
+for finding $\sigma$ and more experiments on the size and shape of the
 neighbourhoods.
 
 \subsection{Speed}
@@ -582,7 +582,7 @@ implementation was most suited to be done by one individually or in a pair.
 
 \subsubsection*{Who did what}
 Gijs created the basic classes we could use and helped everyone by keeping
-track of what was required to be finished and who was working on what. 
+track of what was required to be finished and who was working on what.
 Tadde\"us and Jayke were mostly working on the SVM and all kinds of tests
 whether the histograms were matching, and what parameters had to be used.
 Fabi\"en created the functions to read and parse the given xml files with

+ 5 - 1
src/LocalBinaryPatternizer.py

@@ -2,7 +2,8 @@ from Histogram import Histogram
 from math import ceil
 
 class LocalBinaryPatternizer:
-
+    """This class generates a Local Binary Pattern of a given image."""
+    
     def __init__(self, image, cell_size=16, neighbours=3):
         self.cell_size = cell_size
         self.image = image
@@ -23,6 +24,7 @@ class LocalBinaryPatternizer:
                 self.histograms[i].append(Histogram(self.bins, 0, self.bins))
 
     def pattern_3x3(self, y, x, value):
+        """Create the Local Binary Pattern in the (8,3)-neighbourhood."""
         return (self.is_pixel_darker(y - 1, x - 1, value) << 7) \
              | (self.is_pixel_darker(y - 1, x    , value) << 6) \
              | (self.is_pixel_darker(y - 1, x + 1, value) << 5) \
@@ -33,6 +35,7 @@ class LocalBinaryPatternizer:
              | (self.is_pixel_darker(y    , x - 1, value))
 
     def pattern_5x5_hybrid(self, y, x, value):
+        """Create the Local Binary Pattern in the (8,5)-neighbourhood."""
         return (self.is_pixel_darker(y - 2, x - 2, value) << 7) \
              | (self.is_pixel_darker(y - 2, x    , value) << 6) \
              | (self.is_pixel_darker(y - 2, x + 2, value) << 5) \
@@ -43,6 +46,7 @@ class LocalBinaryPatternizer:
              | (self.is_pixel_darker(y    , x - 2, value))
 
     def pattern_5x5(self, y, x, value):
+        """Create the Local Binary Pattern in the (12,5)-neighbourhood."""
         return (self.is_pixel_darker(y - 1, x - 2, value) << 11) \
              | (self.is_pixel_darker(y    , x - 2, value) << 10) \
              | (self.is_pixel_darker(y + 1, x - 2, value) << 9) \

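As a worked example of the bit layout used by pattern_3x3 above (and of the bit-shifting walk-through in the report), the standalone snippet below computes the (8,3) code for the center pixel of a toy 3x3 patch. It is independent of the class and, following the report's description, sets a bit when a neighbour's grayscale value is greater than the center's.

import numpy as np

patch = np.array([[9, 1, 7],
                  [2, 5, 8],
                  [6, 3, 4]])
center = patch[1, 1]

# Same order as pattern_3x3: start at the top-left neighbour (bit 7) and walk
# clockwise around the center, ending at the left neighbour (bit 0).
neighbours = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

code = 0
for bit, (y, x) in zip(range(7, -1, -1), neighbours):
    code |= int(patch[y, x] > center) << bit

print(format(code, '08b'))  # prints 10110010 (decimal 178) for this patch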
+ 0 - 0
src/LearningSetGenerator.py → src/generate_learning_set.py