
Some final tweaks to 'Design' chapter.

Taddeus Kroes 13 years ago
parent
commit
294c077383
2 files changed with 78 additions and 69 deletions
  1. docs/data/diagrams.tex: 1 addition, 1 deletion
  2. docs/report.tex: 77 additions, 68 deletions

+ 1 - 1
docs/data/diagrams.tex

@@ -140,7 +140,7 @@
 }
 
 \def\trackerdiagram{
-    \begin{figure}[h]
+    \begin{figure}[h!]
         \center
         \architecture{
             \node[block, below of=driver] (eventdriver) {Event driver}

+ 77 - 68
docs/report.tex

@@ -18,20 +18,19 @@
 % Title page
 \maketitle
 \begin{abstract}
-    Applications that use
-    complex gesture-based interaction need to translate primitive messages from
-    low-level device drivers to complex, high-level gestures, and map these
-    gestures to elements in an application. This report presents a generic
-    architecture for the detection of complex gestures in an application. The
-    architecture translates device driver messages to a common set of
-    ``events''. The events are then delegated to a tree of ``areas'', which are
-    used to separate groups of events and assign these groups to an element in
-    the application. Gesture detection is performed on a group of events
-    assigned to an area, using detection units called ``gesture tackers''. An
-    implementation of the architecture as a daemon process would be capable of
-    serving gestures to multiple applications at the same time. A reference
-    implementation and two test case applications have been created to test the
-    effectiveness of the architecture design.
+    Applications that use complex gesture-based interaction need to translate
+    primitive messages from low-level device drivers to complex, high-level
+    gestures, and map these gestures to elements in an application. This report
+    presents a generic architecture for the detection of complex gestures in an
+    application. The architecture translates device driver messages to a common
+    set of ``events''. The events are then delegated to a tree of ``areas'',
+    which are used to separate groups of events and assign these groups to an
+    element in the application. Gesture detection is performed on a group of
+    events assigned to an area, using detection units called ``gesture
+    trackers''. An implementation of the architecture as a daemon process would
+    be capable of serving gestures to multiple applications at the same time. A
+    reference implementation and two test case applications have been created
+    to test the effectiveness of the architecture design.
 \end{abstract}
 
 % Set paragraph indentation
@@ -243,26 +242,25 @@ goal is to test the effectiveness of the design and detect its shortcomings.
     % TODO: in introduction: gestures zijn opgebouwd uit meerdere primitieven
     Touch input devices are unaware of the graphical input
     widgets\footnote{``Widget'' is a name commonly used to identify an element
-    of a graphical user interface (GUI).} of an application, and therefore
-    generate events that simply identify the screen location at which an event
-    takes place. User interfaces of applications that do not run in full screen
-    modus are contained in a window. Events which occur outside the application
-    window should not be handled by the program in most cases. What's more,
-    widget within the application window itself should be able to respond to
-    different gestures. E.g. a button widget may respond to a ``tap'' gesture
-    to be activated, whereas the application window responds to a ``pinch''
-    gesture to be resized. In order to be able to direct a gesture to a
-    particular widget in an application, a gesture must be restricted to the
-    area of the screen covered by that widget. An important question is if the
-    architecture should offer a solution to this problem, or leave the task of
-    assigning gestures to application widgets to the application developer.
+    of a graphical user interface (GUI).} rendered by an application, and
+    therefore generate events that simply identify the screen location at which
+    an event takes place. User interfaces of applications that do not run in
+    full screen mode are contained in a window. Events that occur outside the
+    application window should not be handled by the program in most cases.
+    Moreover, widgets within the application window itself should be able to
+    respond to different gestures. For example, a button widget may respond to a
+    ``tap'' gesture to be activated, whereas the application window responds to
+    a ``pinch'' gesture to be resized. In order to be able to direct a gesture
+    to a particular widget in an application, a gesture must be restricted to
+    the area of the screen covered by that widget. An important question is
+    whether the architecture should offer a solution to this problem, or leave
+    the task of assigning gestures to application widgets to the application
+    developer.
 
     If the architecture does not provide a solution, the ``Event analysis''
     component in figure \ref{fig:multipledrivers} receives all events that
     occur on the screen surface. The gesture detection logic thus uses all
     events as input to detect a gesture. This leaves no possibility for a
-    gesture to occur at multiple screen positions at the same time, unless the
-    gesture detection logic incorporates event cluster detection. The problem
+    gesture to occur at multiple screen positions at the same time. The problem
     is illustrated in figure \ref{fig:ex1}, where two widgets on the screen can
     be rotated independently. The rotation detection component that detects
     rotation gestures receives all four fingers as input. If the two groups of
@@ -274,11 +272,12 @@ goal is to test the effectiveness of the design and detect its shortcomings.
     A gesture detection component could perform a heuristic way of cluster
     detection based on the distance between events. However, this method cannot
     guarantee that a cluster of events corresponds with a particular
-    application widget. In short, gesture detection is difficult to implement
-    without awareness of the location of application widgets. Moreover, the
-    application developer still needs to direct gestures to a particular widget
-    manually.  This requires geometric calculations in the application logic,
-    which is a tedious and error-prone task for the developer.
+    application widget. In short, a gesture detection component is difficult to
+    implement without awareness of the location of application widgets.
+    Secondly, the application developer still needs to direct gestures to a
+    particular widget manually. This requires geometric calculations in the
+    application logic, which is a tedious and error-prone task for the
+    developer.
 
     A better solution is to group events that occur inside the area covered by
     a widget, before passing them on to a gesture detection component.
@@ -301,9 +300,9 @@ goal is to test the effectiveness of the design and detect its shortcomings.
     application is a ``callback'' mechanism: the application developer binds a
     function to an event, that is called when the event occurs. Because of the
     familiarity of this concept with developers, the architecture uses a
-    callback mechanism to handle gestures in an application. Since an area
-    controls the grouping of events and thus the occurrence of gestures in an
-    area, gesture handlers for a specific gesture type are bound to an area.
+    callback mechanism to handle gestures in an application. Callback handlers
+    are bound to event areas, since event areas control the grouping of
+    events and thus the occurrence of gestures in an area of the screen.
     Figure \ref{fig:areadiagram} shows the position of areas in the
     architecture.
 
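As a sketch of this callback mechanism, the following Python fragment shows how an application developer could bind a handler to a gesture type on an event area. The names (EventArea, bind, trigger) are hypothetical and only serve to illustrate the concept.

    class EventArea(object):
        """Hypothetical event area with a callback registry per gesture type."""

        def __init__(self):
            self.handlers = {}

        def bind(self, gesture_type, handler):
            # The application developer binds a callback to a gesture type.
            self.handlers.setdefault(gesture_type, []).append(handler)

        def trigger(self, gesture_type, gesture):
            # Called when a gesture of the given type is detected in this
            # area; all callbacks bound to that type are invoked.
            for handler in self.handlers.get(gesture_type, []):
                handler(gesture)

    # Example usage: rotate a widget when a 'rotate' gesture occurs in its area.
    def on_rotate(gesture):
        print('rotate widget by %.1f degrees' % gesture['angle'])

    widget_area = EventArea()
    widget_area.bind('rotate', on_rotate)
    widget_area.trigger('rotate', {'angle': 45.0})  # prints the message
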
@@ -324,13 +323,16 @@ goal is to test the effectiveness of the design and detect its shortcomings.
     complex touch objects can have additional parameters, such as rotational
     orientation or color. An even more generic concept is the \emph{event
     filter}, which detects whether an event should be assigned to a particular
-    gesture detection component based on all available parameters. This level of
-    abstraction allows for constraints like ``Use all blue objects within a
-    widget for rotation, and green objects for dragging.''. As mentioned in the
-    introduction chapter [\ref{chapter:introduction}], the scope of this thesis
-    is limited to multi-touch surface based devices, for which the \emph{event
-    area} concept suffices. Section \ref{sec:eventfilter} explores the
-    possibility of event areas to be replaced with event filters.
+    gesture detection component based on all available parameters. This level
+    of abstraction provides additional methods of interaction. For example, a
+    camera-based multi-touch surface could make a distinction between gestures
+    performed with a blue gloved hand, and gestures performed with a green
+    gloved hand.
+
+    As mentioned in the introduction chapter [\ref{chapter:introduction}], the
+    scope of this thesis is limited to multi-touch surface based devices, for
+    which the \emph{event area} concept suffices. Section \ref{sec:eventfilter}
+    explores the possibility of event areas to be replaced with event filters.
 
     \subsection{Area tree}
     \label{sec:tree}
@@ -362,7 +364,7 @@ goal is to test the effectiveness of the design and detect its shortcomings.
     \subsection{Event propagation}
     \label{sec:eventpropagation}
 
-    A problem occurs when event areas overlap, as shown by figure
+    Another problem occurs when event areas overlap, as shown by figure
     \ref{fig:eventpropagation}. When the white square is rotated, the gray
     square should keep its current orientation. This means that events that are
     used for rotation of the white square, should not be used for rotation of
@@ -416,14 +418,15 @@ goal is to test the effectiveness of the design and detect its shortcomings.
     complex gestures, like the writing of a character from the alphabet,
     require more advanced detection algorithms.
 
-    A way to detect complex gestures based on a sequence of input features
-    is with the use of machine learning methods, such as Hidden Markov Models
-    \footnote{A Hidden Markov Model (HMM) is a statistical model without a
-    memory, it can be used to detect gestures based on the current input state
-    alone.} \cite{conf/gw/RigollKE97}. A sequence of input states can be mapped
-    to a feature vector that is recognized as a particular gesture with a
-    certain probability. An advantage of using machine learning with respect to
-    an imperative programming style is that complex gestures can be described
+    A way to detect these complex gestures based on a sequence of input events
+    is with the use of machine learning methods, such as the Hidden Markov
+    Models\footnote{A Hidden Markov Model (HMM) is a statistical model without
+    a memory; it can be used to detect gestures based on the current input
+    state alone.} used for sign language detection by
+    \cite{conf/gw/RigollKE97}. A sequence of input states can be mapped to a
+    feature vector that is recognized as a particular gesture with a certain
+    probability. An advantage of using machine learning with respect to an
+    imperative programming style is that complex gestures can be described
     without the use of explicit detection logic. For example, the detection of
     the character `A' being written on the screen is difficult to implement
     using an imperative programming style, while a trained machine learning
@@ -434,25 +437,31 @@ goal is to test the effectiveness of the design and detect its shortcomings.
     sufficient to detect many common gestures, like rotation and dragging. The
     imperative programming style is also familiar and understandable for a wide
     range of application developers. Therefore, the architecture should support
-    an imperative style of gesture detection.
-
-    A problem with the imperative programming style is that the explicit
-    detection of different gestures requires different gesture detection
-    components. If these components is not managed well, the detection logic is
-    prone to become chaotic and over-complex.
-
-    To manage complexity and support multiple methods of gesture detection, the
-    architecture has adopted the tracker-based design as described by
-    \cite{win7touch}. Different detection components are wrapped in separate
+    an imperative style of gesture detection. A problem with an imperative
+    programming style is that the explicit detection of different gestures
+    requires different gesture detection components. If these components are
+    not managed well, the detection logic is prone to become chaotic and
+    over-complex.
+
+    To manage complexity and support multiple styles of gesture detection
+    logic, the architecture has adopted the tracker-based design as described
+    by \cite{win7touch}. Different detection components are wrapped in separate
     gesture tracking units, or \emph{gesture trackers}. The input of a gesture
-    tracker is provided by an event area in the form of events. When a gesture
-    tracker detects a gesture, this gesture is triggered in the corresponding
-    event area. The event area then calls the callbacks which are bound to the
-    gesture type by the application. Figure \ref{fig:trackerdiagram} shows the
-    position of gesture trackers in the architecture.
+    tracker is provided by an event area in the form of events. While each
+    gesture tracker has a fixed type of input and output, its detection
+    component can internally adopt any programming style. A character
+    recognition component can use an HMM, whereas a tap
+    detection component defines a simple function that compares event
+    coordinates.
 
     \trackerdiagram
 
+    When a gesture tracker detects a gesture, this gesture is triggered in the
+    corresponding event area. The event area then calls the callbacks which are
+    bound to the gesture type by the application. Figure
+    \ref{fig:trackerdiagram} shows the position of gesture trackers in the
+    architecture.
+
     The use of gesture trackers as small detection units provides extendability
     of the architecture. A developer can write a custom gesture tracker and
     register it in the architecture. The tracker can use any type of detection
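
As a closing illustration of the tracker concept, here is a minimal Python sketch of a detection component wrapped as a gesture tracker. It receives events from an event area and triggers a ``tap'' gesture on that area, using the hypothetical trigger method from the earlier sketch; the names TapTracker and on_event are invented for this example and do not reflect the reference implementation.

    from collections import namedtuple
    import math

    # Hypothetical event format delivered by an event area.
    Event = namedtuple('Event', ['type', 'x', 'y'])

    class TapTracker(object):
        """Hypothetical gesture tracker: detects a 'tap' from down/up events."""

        def __init__(self, area, max_distance=10.0):
            self.area = area            # event area that feeds this tracker
            self.max_distance = max_distance
            self.down_position = None   # position of the last 'down' event

        def on_event(self, event):
            # Events are delivered by the event area the tracker is attached to.
            if event.type == 'down':
                self.down_position = (event.x, event.y)
            elif event.type == 'up' and self.down_position is not None:
                dx = event.x - self.down_position[0]
                dy = event.y - self.down_position[1]
                if math.hypot(dx, dy) <= self.max_distance:
                    # Trigger the gesture on the area, which then calls the
                    # callbacks bound to the 'tap' gesture type.
                    self.area.trigger('tap', {'x': event.x, 'y': event.y})
                self.down_position = None

A more complex tracker, such as a character recognizer built on an HMM, would expose the same small interface (receive events, trigger a gesture on its area), which is what keeps the detection logic manageable.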