Browse source code

Merge branch 'master' of github.com:taddeus/licenseplates

Richard Torenvliet 14 years ago
parent
commit
292aefd4ef
4 changed files with 39 additions and 23 deletions
  1. + 12 - 0    docs/Makefile
  2. + 22 - 22   docs/report.tex
  3. + 5 - 1     src/LocalBinaryPatternizer.py
  4. + 0 - 0     src/generate_learning_set.py

+ 12 - 0
docs/Makefile

@@ -0,0 +1,12 @@
+RM=rm -rf
+
+all: plan.pdf report.pdf
+
+plan.pdf: plan.tex
+	pdflatex $^
+
+report.pdf: report.tex
+	pdflatex $^
+
+clean:
+	$(RM) *.pdf *.aux *.log *.out *.toc

+ 22 - 22
docs/report.tex

@@ -16,7 +16,7 @@
 
 \section*{Project members}
 Gijs van der Voort\\
-Richard Torenvliet\\
+Raichard Torenvliet\\
 Jayke Meijer\\
 Tadde\"us Kroes\\
 Fabi\"en Tesselaar
@@ -36,7 +36,7 @@ Reading license plates with a computer is much more difficult. Our dataset
 contains photographs of license plates from various angles and distances. This
 means that not only do we have to implement a method to read the actual
 characters, but given the location of the license plate and each individual
-character, we must make sure we transform each character to a standard form. 
+character, we must make sure we transform each character to a standard form.
 
 Determining what character we are looking at will be done by using Local Binary
 Patterns. The main goal of our research is finding out how effective LBP's are
@@ -57,9 +57,9 @@ In short our program must be able to do the following:
 
 The actual purpose of this project is to check if LBP is capable of recognizing
 license plate characters. We knew the LBP implementation would be pretty
-simple. Thus an advantage had to be its speed compared with other license plate 
+simple. Thus an advantage had to be its speed compared with other license plate
 recognition implementations, but the uncertainty of whether we could get some
-results made us pick Python. We felt Python would not restrict us as much in 
+results made us pick Python. We felt Python would not restrict us as much in
 assigning tasks to each member of the group. In addition, when using the
 correct modules to handle images, Python can be decent in speed.
 
@@ -140,7 +140,7 @@ The outcome of this operations will be a binary pattern.
 
 \item Given this pattern, the next step is to divide the pattern in cells. The
 amount of cells depends on the quality of the result, so trial and error is in
-order. Starting with dividing the pattern into cells of size 16. 
+order. Starting with dividing the pattern into cells of size 16.
 
 \item Compute a histogram for each cell.
 
@@ -154,7 +154,7 @@ order. Starting with dividing the pattern in to cells of size 16.
 result is a feature vector of the image.
 
 \item Feed these vectors to a support vector machine. This will ''learn'' which
-vector indicates what vector is which character. 
+vector indicates what vector is which character.
 
 \end{itemize}
 
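To make the cell and histogram steps above concrete, the sketch below splits an LBP pattern image into cells, histograms each cell and concatenates the histograms into one feature vector. It is only an illustration: it works on a plain numpy array and uses numpy.histogram instead of the project's Histogram class, and the name build_feature_vector is invented here.

import numpy as np

def build_feature_vector(pattern, cell_size=16, bins=256):
    """Split an LBP pattern image into cells, histogram each cell and
    concatenate the per-cell histograms into a single feature vector."""
    height, width = pattern.shape
    histograms = []
    for y in range(0, height, cell_size):
        for x in range(0, width, cell_size):
            cell = pattern[y:y + cell_size, x:x + cell_size]
            hist, _ = np.histogram(cell, bins=bins, range=(0, bins))
            histograms.append(hist)
    return np.concatenate(histograms)

Such vectors would then be handed to an SVM for training and classification; which SVM implementation the project uses is not visible in this diff.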
@@ -186,7 +186,7 @@ choices we made.
 \subsection{Character retrieval}
 
 In order to retrieve the characters from the entire image, we need to
-perform a perspective transformation. However, to do this, we need to know the 
+perform a perspective transformation. However, to do this, we need to know the
 coordinates of the four corners of each character. For our dataset, this is
 stored in XML files. So, the first step is to read these XML files.
 
@@ -196,7 +196,7 @@ The XML reader will return a 'license plate' object when given an XML file. The
 licence plate holds a list of, up to six, NormalizedImage characters and from
 which country the plate is from. The reader is currently assuming the XML file
 and image name are corresponding, since this was the case for the given
-dataset. This can easily be adjusted if required. 
+dataset. This can easily be adjusted if required.
 
 To parse the XML file, the minidom module is used. So the XML file can be
 treated as a tree, where one can search for certain nodes. In each XML
@@ -205,7 +205,7 @@ will do is retrieve the current and most up-to-date version of the plate. The
 reader will only get results from this version.
 
 Now we are only interested in the individual characters so we can skip the
-location of the entire license plate. Each character has 
+location of the entire license plate. Each character has
 a single character value, indicating what someone thought the letter or
 digit was, and four coordinates to create a bounding box. If fewer than four
 points have been set the character will not be saved. Else, to make things not
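The XML handling described above could look roughly like the sketch below. The element and attribute names ('character', 'point', 'char', 'x', 'y') are invented for illustration; the dataset's real schema, and the handling of the plate's version history, are not visible in this diff.

from xml.dom import minidom

def read_character_corners(xml_path):
    """Return (character value, corner points) pairs from a plate XML file.
    Characters with fewer than four corner points are skipped, as described
    above."""
    doc = minidom.parse(xml_path)
    characters = []
    for node in doc.getElementsByTagName('character'):
        value = node.getAttribute('char')
        corners = [(int(p.getAttribute('x')), int(p.getAttribute('y')))
                   for p in node.getElementsByTagName('point')]
        if len(corners) < 4:
            continue
        characters.append((value, corners))
    return characters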
@@ -215,7 +215,7 @@ it gives some extra freedom when using the data.
 When four points have been gathered the data from the actual image is being
 requested. For each corner a small margin is added (around 3 pixels) so that no
 features will be lost and minimum amounts of new features will be introduced by
-noise in the margin. 
+noise in the margin.
 
 In the next section you can read more about the perspective transformation that
 is being done. After the transformation the character can be saved: Converted
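The perspective transformation mentioned here can be sketched as follows. Using OpenCV is an assumption (the project's own transform code is not part of this diff), and the output size, corner ordering and 3-pixel margin are illustrative.

import numpy as np
import cv2

def straighten_character(image, corners, out_size=(50, 100), margin=3):
    """Warp the quadrilateral spanned by a character's four corners onto an
    upright rectangle. Corners are assumed to be ordered top-left, top-right,
    bottom-right, bottom-left."""
    (tl, tr, br, bl) = corners
    # Grow the quadrilateral slightly so no character features are cut off.
    src = np.float32([(tl[0] - margin, tl[1] - margin),
                      (tr[0] + margin, tr[1] - margin),
                      (br[0] + margin, br[1] + margin),
                      (bl[0] - margin, bl[1] + margin)])
    width, height = out_size
    dst = np.float32([(0, 0), (width, 0), (width, height), (0, height)])
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, matrix, (width, height))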
@@ -232,12 +232,12 @@ rectangle.
 
 \subsection{Noise reduction}
 
-The image contains a lot of noise, both from camera errors due to dark noise 
-etc., as from dirt on the license plate. In this case, noise therefore means 
+The image contains a lot of noise, both from camera errors due to dark noise
+etc., as from dirt on the license plate. In this case, noise therefore means
 any unwanted difference in color from the surrounding pixels.
 
 \paragraph*{Camera noise and small amounts of dirt}
-The dirt on the license plate can be of different sizes. We can reduce the 
+The dirt on the license plate can be of different sizes. We can reduce the
 smaller amounts of dirt in the same way as we reduce normal noise, by applying
 a Gaussian blur to the image. This is the next step in our program.\\
 \\
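A minimal sketch of this blurring step, assuming scipy is available (the project's actual implementation is not part of this diff):

from scipy import ndimage

def reduce_noise(grayscale, sigma=1.4):
    """Suppress camera noise and small specks of dirt with a Gaussian blur.
    sigma = 1.4 is the value the report settles on later; treat it as a
    tunable parameter."""
    return ndimage.gaussian_filter(grayscale, sigma=sigma)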
@@ -256,7 +256,7 @@ characters will still be conserved in the LBP, even if there is dirt
 surrounding the character.
 
 \subsection{Creating Local Binary Patterns and feature vector}
-Every pixel is a center pixel and it is also a value to evaluate but not at the 
+Every pixel is a center pixel and it is also a value to evaluate but not at the
 same time. Every pixel is evaluated as shown in the explanation
 of the LBP algorithm. There are several neighbourhoods we can evaluate. We have
 tried the following neighbourhoods:
@@ -276,12 +276,12 @@ would add a computational step, thus making the code execute slower. In the
 next section we will describe what the best neighbourhood was.
 
 Take an example where the full square can be evaluated, so none of the
-neighbours are out of bounds. The first to be checked is the pixel in the left 
-bottom corner in the square 3 x 3, with coordinate $(x - 1, y - 1)$ with $g_c$ 
+neighbours are out of bounds. The first to be checked is the pixel in the left
+bottom corner in the square 3 x 3, with coordinate $(x - 1, y - 1)$ with $g_c$
 as center pixel that has coordinates $(x, y)$. If the grayscale value of the
 neighbour in the left corner is greater than the grayscale
 value of the center pixel, then return true. Bit-shift the first bit by 7. The
-outcome is now 10000000. The second neighbour will be bit-shifted by 6, and so 
+outcome is now 10000000. The second neighbour will be bit-shifted by 6, and so
 on. Until we are at 0. The result is a binary pattern of the local point just
 evaluated.
 Now only the edge pixels are a problem, but a simple check if the location of
@@ -348,7 +348,7 @@ scripts is named here and a description is given on what the script does.
 \section{Finding parameters}
 
 Now that we have a functioning system, we need to tune it to work properly for
-license plates. This means we need to find the parameters. Throughout the 
+license plates. This means we need to find the parameters. Throughout the
 program we have a number of parameters for which no standard choice is
 available. These parameters are:\\
 \\
@@ -373,7 +373,7 @@ The first parameter to decide on, is the $\sigma$ used in the Gaussian blur. To
 find this parameter, we tested a few values, by trying them and checking the
 results. It turned out that the best value was $\sigma = 1.4$.\\
 \\
-Theoretically, this can be explained as follows. The filter has width of 
+Theoretically, this can be explained as follows. The filter has width of
 $6 * \sigma = 6 * 1.4 = 8.4$ pixels. The width of a `stroke' in a character is,
 after our resize operations, around 8 pixels. This means, our filter `matches'
 the smallest detail size we want to be able to see, so everything that is
@@ -456,7 +456,7 @@ $2^{-1}$ &       61 &       61 &       61 &       61 &       62 &
         92 &       93 &       93 &       86 &       45\\
  $2^{7}$ &       61 &       70 &       84 &       90 &       92 &
         93 &       93 &       93 &       86 &       45\\
- $2^{9}$ &       70 &       84 &       90 &       92 &       92 & 
+ $2^{9}$ &       70 &       84 &       90 &       92 &       92 &
        93 &       93 &       93 &       86 &       45\\
 $2^{11}$ &       84 &       90 &       92 &       92 &       92 &
        92 &       93 &       93 &       86 &       45\\
@@ -494,7 +494,7 @@ good scores per character, and $93\%$ seems to be a fairly good result.\\
 \\
 Possibilities for improvement of this score would be more extensive
 grid-searches, finding more exact values for $c$ and $\gamma$, more tests
-for finding $\sigma$ and more experiments on the size and shape of the 
+for finding $\sigma$ and more experiments on the size and shape of the
 neighbourhoods.
 
 \subsection{Speed}
@@ -582,7 +582,7 @@ implementation was most suited to be done by one individually or in a pair.
 
 \subsubsection*{Who did what}
 Gijs created the basic classes we could use and helped everyone by keeping
-track of what was required to be finished and who was working on what. 
+track of what was required to be finished and who was working on what.
 Tadde\"us and Jayke were mostly working on the SVM and all kinds of tests
 whether the histograms were matching, and what parameters had to be used.
 Fabi\"en created the functions to read and parse the given xml files with

+ 5 - 1
src/LocalBinaryPatternizer.py

@@ -2,7 +2,8 @@ from Histogram import Histogram
 from math import ceil
 
 class LocalBinaryPatternizer:
-
+    """This class generates a Local Binary Pattern of a given image."""
+    
     def __init__(self, image, cell_size=16, neighbours=3):
         self.cell_size = cell_size
         self.image = image
@@ -23,6 +24,7 @@ class LocalBinaryPatternizer:
                 self.histograms[i].append(Histogram(self.bins, 0, self.bins))
 
     def pattern_3x3(self, y, x, value):
+        """Create the Local Binary Pattern in the (8,3)-neighbourhood."""
         return (self.is_pixel_darker(y - 1, x - 1, value) << 7) \
              | (self.is_pixel_darker(y - 1, x    , value) << 6) \
              | (self.is_pixel_darker(y - 1, x + 1, value) << 5) \
@@ -33,6 +35,7 @@ class LocalBinaryPatternizer:
              | (self.is_pixel_darker(y    , x - 1, value))
 
     def pattern_5x5_hybrid(self, y, x, value):
+        """Create the Local Binary Pattern in the (8,5)-neighbourhood."""
         return (self.is_pixel_darker(y - 2, x - 2, value) << 7) \
              | (self.is_pixel_darker(y - 2, x    , value) << 6) \
              | (self.is_pixel_darker(y - 2, x + 2, value) << 5) \
@@ -43,6 +46,7 @@ class LocalBinaryPatternizer:
              | (self.is_pixel_darker(y    , x - 2, value))
 
     def pattern_5x5(self, y, x, value):
+        """Create the Local Binary Pattern in the (12,5)-neighbourhood."""
         return (self.is_pixel_darker(y - 1, x - 2, value) << 11) \
              | (self.is_pixel_darker(y    , x - 2, value) << 10) \
              | (self.is_pixel_darker(y + 1, x - 2, value) << 9) \

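As a worked example of the bit layout used by pattern_3x3 above (and of the bit-shifting walk-through in the report), the standalone snippet below computes the (8,3) code for the center pixel of a toy 3x3 patch. It is independent of the class and, following the report's description, sets a bit when a neighbour's grayscale value is greater than the center's.

import numpy as np

patch = np.array([[9, 1, 7],
                  [2, 5, 8],
                  [6, 3, 4]])
center = patch[1, 1]

# Same order as pattern_3x3: start at the top-left neighbour (bit 7) and walk
# clockwise around the center, ending at the left neighbour (bit 0).
neighbours = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

code = 0
for bit, (y, x) in zip(range(7, -1, -1), neighbours):
    code |= int(patch[y, x] > center) << bit

print(format(code, '08b'))  # prints 10110010 (decimal 178) for this patch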
+ 0 - 0
src/LearningSetGenerator.py → src/generate_learning_set.py