Commit a6abbafa authored by Taddeüs Kroes

Performed final spell- and consistency-check on report.

parent cd578f73
@@ -310,7 +310,9 @@
\center
\architecture{
\tikzstyle{area} = [block, fill=gray!15];
\tikzstyle{tracker} = [block, draw=gray!50];
\tikzstyle{tracker} = [block, draw=gray!50, text width=7em];
\tikzstyle{left} = [xshift=-5em];
\tikzstyle{right} = [xshift=5em];
\node[block, below of=driver] (eventdriver) {Event driver}
edge[linefrom] node[right, near end] {device-specific messages} (driver);
@@ -318,33 +320,36 @@
\node[area, below of=eventdriver] (rootarea) {Screen area}
edge[linefrom] (eventdriver);
\node[area, below of=rootarea, xshift=-5em] (appwindow) {Application window area}
\node[area, below of=rootarea, left] (appwindow) {Application window area}
edge[lineto, <->] (rootarea);
\node[tracker, left of=appwindow, xshift=-4em, text width=7em] {Transformation tracker}
\node[tracker, left of=appwindow, xshift=-4em, yshift=3em] {Transformation tracker}
edge[linefrom, bend right=10] (appwindow)
edge[lineto, dotted, bend left=10] (appwindow);
\node[tracker, left of=appwindow, xshift=-4em, yshift=-1em] {Tap tracker}
edge[lineto, dotted, bend right=10] (appwindow)
edge[linefrom, bend left=10] (appwindow);
\node[area, below of=rootarea, xshift=5em] (overlay) {Overlay area}
\node[area, below of=rootarea, right] (overlay) {Overlay area}
edge[lineto, <->] (rootarea);
\node[tracker, right of=overlay, xshift=4em] (tracker) {Hand tracker}
edge[lineto, dotted, bend left=10] (overlay)
edge[linefrom, bend right=10] (overlay);
\node[area, below of=appwindow, xshift=-5em] (rectangle) {Rectangle area}
\node[area, below of=appwindow, left] (rectangle) {Rectangle area}
edge[lineto, <->] (appwindow);
\node[tracker, left of=rectangle, xshift=-4em, yshift=2em, text width=7em] (recttracker) {Transformation tracker}
\node[tracker, left of=rectangle, xshift=-4em, yshift=2em] (recttracker) {Transformation tracker}
edge[lineto, dotted, bend left=10] (rectangle)
edge[linefrom, bend right=10] (rectangle);
\node[tracker, left of=rectangle, xshift=-4em, yshift=-2em, text width=7em] {Tap tracker}
\node[tracker, left of=rectangle, xshift=-4em, yshift=-2em] {Tap tracker}
edge[lineto, dotted, bend right=10] (rectangle)
edge[linefrom, bend left=10] (rectangle);
\node[area, below of=appwindow, xshift=5em] (triangle) {Triangle area}
\node[area, below of=appwindow, right] (triangle) {Triangle area}
edge[lineto, <->] (appwindow);
\node[tracker, right of=triangle, xshift=4em, yshift=2em, text width=7em] {Transformation tracker}
\node[tracker, right of=triangle, xshift=4em, yshift=2em] {Transformation tracker}
edge[lineto, dotted, bend right=10] (triangle)
edge[linefrom, bend left=10] (triangle);
\node[tracker, right of=triangle, xshift=4em, yshift=-2em, text width=7em] (taptracker) {Tap tracker}
\node[tracker, right of=triangle, xshift=4em, yshift=-2em] (taptracker) {Tap tracker}
edge[lineto, dotted, bend left=10] (triangle)
edge[linefrom, bend right=10] (triangle);
@@ -360,11 +365,14 @@
Diagram representation of the second test application. A full
screen event area contains an application window and a full screen
overlay. The application window contains a rectangle and a
triangle. the application window and its children can be
triangle. The application window and its children can be
transformed, and thus each have a ``transformation tracker''. The
rectangle and triangle also have a ``tap tracker'' that detects
double tap gestures. Dotted arrows represent a flow of gestures,
regular arrows represent events (unless labeled otherwise).
application window area has a ``tap tracker'' to detect double tap
events. The rectangle and triangle also have a tap tracker that
detects regular tap events. These stop event propagation to the
application window area. Dotted arrows represent a flow of
gestures, regular arrows represent events (unless labeled
otherwise).
}
\label{fig:testappdiagram}
\end{figure}
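The event area tree in this figure maps naturally onto the architecture's
Python style. The following sketch is purely illustrative: class names such as
\texttt{ScreenArea} and methods such as \texttt{add\_area} and \texttt{bind}
are assumptions, not the implementation's actual identifiers.

\begin{verbatim}
# Hypothetical construction of the event area tree from the figure.
screen = ScreenArea()                     # full screen root area
window = RectangleArea(50, 50, 400, 300)  # application window area
overlay = ScreenArea()                    # full screen overlay area
screen.add_area(window)
screen.add_area(overlay)

rect = RectangleArea(20, 20, 100, 80)     # rectangle, relative to window
tri = PolygonArea([(200, 40), (260, 40), (230, 100)])
window.add_area(rect)
window.add_area(tri)

# Transformation trackers on every transformable area, tap trackers on
# the window (double tap) and on both shapes (regular tap).
for area in (window, rect, tri):
    area.bind("transform", lambda gesture: print("transform", gesture))
window.bind("double_tap", lambda gesture: print("toggle full screen"))
rect.bind("tap", lambda gesture: print("recolor rectangle"))
tri.bind("tap", lambda gesture: print("recolor triangle"))
overlay.bind("hand_down", lambda gesture: print("new hand"))
\end{verbatim}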
@@ -169,9 +169,8 @@ detection for every new gesture-based application.
but only in the form of machine learning algorithms. Many applications for
mobile phones and tablets only use simple gestures such as taps. For this
category of applications, machine learning is an excessively complex method
of gesture detection. Manoj Kumar shows that when managed well, a
predefined set of gesture detection rules is sufficient to detect simple
gestures.
of gesture detection. Manoj Kumar shows that if managed well, a predefined
set of gesture detection rules is sufficient to detect simple gestures.
This thesis explores the possibility of creating an architecture that
combines support for multiple input devices with different methods of
@@ -208,7 +207,7 @@ detection for every new gesture-based application.
The TUIO protocol \cite{TUIO} is an example of a driver that can be used by
multi-touch devices. TUIO uses ALIVE- and SET-messages to communicate
low-level touch events (section\ref{sec:tuio} describes these in more
low-level touch events (section \ref{sec:tuio} describes these in more
detail). These messages are specific to the API of the TUIO protocol.
Other drivers may use different message types. To support more than one
driver in the architecture, there must be some translation from
@@ -263,7 +262,7 @@ detection for every new gesture-based application.
Moreover, a widget within the application window itself should be able
to respond to different gestures. For example, a button widget may respond to a
``tap'' gesture to be activated, whereas the application window responds to
a ``pinch'' gesture to be resized. In order to restrict the occurence of a
a ``pinch'' gesture to be resized. In order to restrict the occurrence of a
gesture to a particular widget in an application, the events used for the
gesture must be restricted to the area of the screen covered by that
widget. An important question is whether the architecture should offer a
@@ -275,17 +274,17 @@ detection for every new gesture-based application.
the screen surface. The gesture detection logic thus uses all events as
input to detect a gesture. This leaves no possibility for a gesture to
occur at multiple screen positions at the same time. The problem is
illustrated in figure \ref{fig:ex1}, where two widgets on the screen can be
illustrated by figure \ref{fig:ex1}, where two widgets on the screen can be
rotated independently. The rotation detection component that detects
rotation gestures receives all four fingers as input. If the two groups of
finger events are not separated by clustering them based on the area in
which they are placed, only one rotation event will occur.
rotation gestures receives events from all four fingers as input. If the
two groups of events are not separated by clustering them based on the area
in which they are placed, only one rotation event will occur.
\examplefigureone
A gesture detection component could perform heuristic clustering
based on the distance between events. However, this method cannot guarantee
that a cluster of events corresponds with a particular application widget.
that a cluster of events corresponds to a particular application widget.
In short, a gesture detection component is difficult to implement without
awareness of the location of application widgets. Moreover, the
application developer still needs to direct gestures to a particular widget
@@ -306,9 +305,9 @@ detection for every new gesture-based application.
represented by two event areas, each having a different rotation detection
component. Each event area can consist of four corner locations of the
square it represents. To detect whether an event is located inside a
square, the event areas use a point-in-polygon (PIP) test \cite{PIP}. It is
the task of the client application to update the corner locations of the
event area with those of the widget.
square, the event areas can use a point-in-polygon (PIP) test \cite{PIP}.
It is the task of the client application to synchronize the corner
locations of the event area with those of the widget.
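The cited PIP technique is a standard algorithm; a minimal ray-casting version
in Python looks as follows. This is a generic illustration, not the
implementation's own helper function.

\begin{verbatim}
def point_in_polygon(x, y, corners):
    # Ray casting: count crossings of a ray from (x, y) to the right.
    inside = False
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        # Does this edge cross the horizontal line at height y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# The four corner locations of a widget's event area:
square = [(10, 10), (110, 10), (110, 110), (10, 110)]
assert point_in_polygon(60, 60, square)
assert not point_in_polygon(200, 60, square)
\end{verbatim}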
\subsection{Callback mechanism}
@@ -339,14 +338,14 @@ detection for every new gesture-based application.
structure. A root event area represents the panel, containing five other
event areas which are positioned relative to the root area. The relative
positions do not need to be updated when the panel area changes its
position. GUI frameworks use this kind of tree structure to manage
graphical widgets.
position. GUI toolkits use this kind of tree structure to manage graphical
widgets.
If the GUI toolkit provides an API for requesting the position and size of
a widget, a recommended first step when developing an application is to
create a subclass of the event area that automatically synchronizes with the
position of a widget from the GUI toolkit. For example, the test
application described in section \ref{sec:testapp} extends the GTK
application described in section \ref{sec:testapp} extends the GTK+
\cite{GTK} application window widget with the functionality of a
rectangular event area, to direct touch events to an application window.
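The benefit of relative positions can be made concrete with a small sketch;
all names here are assumptions for illustration:

\begin{verbatim}
class EventArea:
    """An event area positioned relative to its parent (sketch)."""
    def __init__(self, x, y, w, h):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.parent = None
        self.children = []

    def add_area(self, child):
        child.parent = self
        self.children.append(child)

    def screen_pos(self):
        # Accumulate parent offsets: children move along with the panel.
        px, py = self.parent.screen_pos() if self.parent else (0, 0)
        return px + self.x, py + self.y

panel = EventArea(100, 100, 300, 200)
button = EventArea(10, 10, 80, 30)    # position relative to the panel
panel.add_area(button)

panel.x, panel.y = 250, 150           # moving the panel...
print(button.screen_pos())            # ...moves the button: (260, 160)
\end{verbatim}

Moving the panel changes the absolute position of the button without touching
the button's own coordinates, which is exactly the property described above.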
@@ -362,7 +361,7 @@ detection for every new gesture-based application.
occurs within the white square.
The problem described above is a common problem in GUI applications, and
there is a common solution (used by GTK \cite{gtkeventpropagation}, among
there is a common solution (used by GTK+ \cite{gtkeventpropagation}, among
others). An event is passed to an ``event handler''. If the handler returns
\texttt{true}, the event is considered ``handled'' and is not
``propagated'' to other widgets. Applied to the example of the draggable
@@ -378,7 +377,7 @@ detection for every new gesture-based application.
object touching the screen is essentially touching the deepest event area
in the tree that contains the triggered event, which must be the first to
receive the event. When the gesture trackers of the event area are
finished with the event, it is propagated to the siblings and parent in the
finished with the event, it is propagated to the parent and siblings in the
event area tree. Optionally, a gesture tracker can stop further propagation of
the event from its corresponding event area. Figure
\ref{fig:eventpropagation} demonstrates event propagation in the example of
@@ -387,13 +386,13 @@ detection for every new gesture-based application.
\eventpropagationfigure
An additional type of event propagation is ``immediate propagation'', which
indicates propagation of an event from one gesture detection component to
another. This is applicable when an event area uses more than one gesture
detection component. When regular propagation is stopped, the event is
propagated to other gesture detection components first, before actually
being stopped. One of the components can also stop the immediate
propagation of an event, so that the event is not passed to the next
gesture detection component, nor to the ancestors of the event area.
indicates propagation of an event from one gesture tracker to another. This
is applicable when an event area uses more than one gesture tracker. When
regular propagation is stopped, the event is propagated to other gesture
trackers first, before actually being stopped. One of the gesture trackers
can also stop the immediate propagation of an event, so that the event is
not passed to the next gesture tracker, nor to the ancestors of the event
area.
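Summarized in Python-flavored pseudocode, the propagation rules read as
follows. This sketch only walks the ancestor chain (siblings are omitted for
brevity), and the return values \texttt{stop} and \texttt{stop\_immediate} are
invented names:

\begin{verbatim}
def deliver(event, root):
    """Sketch of event delivery with propagation (names are assumed)."""
    # Descend to the deepest event area that contains the event.
    node = root
    while True:
        child = next((c for c in node.children if c.contains(event)), None)
        if child is None:
            break
        node = child
    # Propagate the event upwards through the ancestors.
    while node is not None:
        stop = False
        for tracker in node.trackers:
            verdict = tracker.handle(event)
            if verdict == "stop_immediate":
                return            # skip remaining trackers and ancestors
            if verdict == "stop":
                stop = True       # remaining trackers still see the event
        if stop:
            return
        node = node.parent
\end{verbatim}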
The concept of an event area is based on the assumption that the set of
originating events that form a particular gesture can be determined
@@ -408,7 +407,7 @@ detection for every new gesture-based application.
surface could make a distinction between gestures performed with a blue
gloved hand, and gestures performed with a green gloved hand.
As mentioned in the introduction chapter [\ref{chapter:introduction}], the
As mentioned in the introduction (chapter \ref{chapter:introduction}), the
scope of this thesis is limited to multi-touch surface-based devices, for
which the \emph{event area} concept suffices. Section \ref{sec:eventfilter}
explores the possibility of replacing event areas with event filters.
@@ -459,11 +458,11 @@ detection for every new gesture-based application.
compares event coordinates.
When a gesture tracker detects a gesture, this gesture is triggered in the
corresponding event area. The event area then calls the callbacks which are
bound to the gesture type by the application.
corresponding event area. The event area then calls the callback functions
that are bound to the gesture type by the application.
The use of gesture trackers as small detection units allows extendability
of the architecture. A developer can write a custom gesture tracker and
The use of gesture trackers as small detection units allows extension of
the architecture. A developer can write a custom gesture tracker and
register it in the architecture. The tracker can use any type of detection
logic internally, as long as it translates low-level events to high-level
gestures.
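A self-contained sketch of this callback mechanism, with assumed method names
\texttt{bind} and \texttt{trigger}:

\begin{verbatim}
from collections import namedtuple

Gesture = namedtuple("Gesture", "type x y")

class EventArea:
    def __init__(self):
        self.handlers = {}             # gesture type -> bound callbacks

    def bind(self, gesture_type, callback):
        self.handlers.setdefault(gesture_type, []).append(callback)

    def trigger(self, gesture):
        # Called by a gesture tracker once it detects a gesture.
        for callback in self.handlers.get(gesture.type, []):
            callback(gesture)

area = EventArea()
area.bind("tap", lambda g: print("tap at", g.x, g.y))
area.trigger(Gesture("tap", 12, 34))   # normally done by a tracker
\end{verbatim}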
@@ -488,12 +487,11 @@ detection for every new gesture-based application.
be a communication layer between the separate processes.
A common and efficient way of communication between two separate processes
is through the use of a network protocol. In this particular case, the
architecture can run as a daemon\footnote{``daemon'' is a name Unix uses to
indicate that a process runs as a background process.} process, listening
to driver messages and triggering gestures in registered applications.
is through the use of a network protocol. The architecture could run as a
daemon\footnote{``Daemon'' is the Unix term for a process that runs in the
background.} process, listening to driver messages and triggering gestures in
registered applications.
\vspace{-0.3em}
\daemondiagram
An advantage of a daemon setup is that it can serve multiple applications
@@ -504,36 +502,6 @@ detection for every new gesture-based application.
machines, thus distributing computational load. The other machine may even
use a different operating system.
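As an illustration only, a toy daemon along these lines fits in a few lines of
Python's standard library; the JSON line protocol and the port number are
invented for this example:

\begin{verbatim}
import json
import socketserver

class GestureDaemonHandler(socketserver.StreamRequestHandler):
    """Toy daemon: read driver events as JSON lines, reply with gestures."""
    def handle(self):
        for line in self.rfile:
            event = json.loads(line)
            # Event area lookup and gesture detection would happen here;
            # this toy version blindly answers every event with a "tap".
            reply = {"gesture": "tap", "x": event["x"], "y": event["y"]}
            self.wfile.write((json.dumps(reply) + "\n").encode())

if __name__ == "__main__":
    # A threading server can serve multiple applications at once.
    server = socketserver.ThreadingTCPServer(("", 7777), GestureDaemonHandler)
    server.serve_forever()
\end{verbatim}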
%\section{Example usage}
%\label{sec:example}
%
%This section describes an extended example to illustrate the data flow of
%the architecture. The example application listens to tap events on a button
%within an application window. The window also contains a draggable circle.
%The application window can be resized using \emph{pinch} gestures. Figure
%\ref{fig:examplediagram} shows the architecture created by the pseudo code
%below.
%
%\begin{verbatim}
%initialize GUI framework, creating a window and necessary GUI widgets
%
%create a root event area that synchronizes position and size with the application window
%define 'rotation' gesture handler and bind it to the root event area
%
%create an event area with the position and radius of the circle
%define 'drag' gesture handler and bind it to the circle event area
%
%create an event area with the position and size of the button
%define 'tap' gesture handler and bind it to the button event area
%
%create a new event server and assign the created root event area to it
%
%start the event server in a new thread
%start the GUI main loop in the current thread
%\end{verbatim}
%
%\examplediagram
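Although this example section was commented out, its pseudocode still
documents the intended data flow. A hypothetical Python rendering, in which
every class and function name is an assumption:

\begin{verbatim}
import threading

window = create_gui_window()        # GUI framework setup (hypothetical)

root = WindowArea(window)           # syncs position/size with the window
root.bind("rotate", lambda g: window.rotate(g.angle))

circle = CircleArea(x=150, y=100, radius=40)
circle.bind("drag", lambda g: circle.move(g.dx, g.dy))
root.add_area(circle)

button = RectangleArea(x=20, y=20, w=90, h=30)
button.bind("tap", lambda g: print("button tapped"))
root.add_area(button)

server = EventServer(root)          # receives and delegates driver events
threading.Thread(target=server.start, daemon=True).start()
window.main_loop()                  # GUI main loop in the current thread
\end{verbatim}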
\chapter{Reference implementation}
\label{chapter:implementation}
@@ -590,13 +558,16 @@ When a gesture handler is added to an event area by an application, the event
area must create a gesture tracker that detects the corresponding gesture. To
do this, the architecture must be aware of the existing gesture trackers and
the gestures they support. The architecture provides a registration system for
gesture trackers. A gesture tracker implementation contains a list of supported
gesture types. These gesture types are mapped to the gesture tracker class by
the registration system. When an event area needs to create a gesture tracker
for a gesture type that is not yet being detected, the class of the new created
gesture tracker is loaded from this map. Registration of a gesture tracker is
very straight-forward, as shown by the following Python code:
gesture trackers. Each gesture tracker implementation contains a list of
supported gesture types. These gesture types are mapped to the gesture tracker
class by the registration system. When an event area needs to create a gesture
tracker for a gesture type that is not yet being detected, the class name of
the newly created gesture tracker is loaded from this map. Registration of a
gesture tracker is straightforward, as shown by the following Python code:
\begin{verbatim}
from trackers import register_tracker

# Create a gesture tracker implementation
class TapTracker(GestureTracker):
    supported_gestures = ["tap", "single_tap", "double_tap"]

register_tracker(TapTracker)
\end{verbatim}
@@ -627,17 +598,17 @@ The ``tap tracker'' detects three types of tap gestures:
position. When a \emph{point\_down} event is received, its location is
saved along with the current timestamp. On the next \emph{point\_up}
event of the touch point, the difference in time and position with its
saved values are compared with predefined thresholds to determine
whether a \emph{tap} gesture should be triggered.
saved values are compared to predefined thresholds to determine whether
a \emph{tap} gesture should be triggered.
\item A \emph{double tap} gesture consists of two sequential \emph{tap}
gestures that are located within a certain distance of each other, and
occur within a certain time window. When a \emph{tap} gesture is
triggered, the tracker saves it as the ``last tap'' along with the
current timestamp. When another \emph{tap} gesture is triggered, its
location and the current timestamp are compared with those of the
``last tap'' gesture to determine whether a \emph{double tap} gesture
should be triggered. If so, the gesture is triggered at the location of
the ``last tap'', because the second tap may be less accurate.
location and the current timestamp are compared to those of the ``last
tap'' gesture to determine whether a \emph{double tap} gesture should
be triggered. If so, the gesture is triggered at the location of the
``last tap'', because the second tap may be less accurate.
\item A separate thread handles detection of \emph{single tap} gestures at
a rate of thirty times per second. When the time since the ``last tap''
exceeds the maximum time between two taps of a \emph{double tap}
@@ -764,10 +735,10 @@ file. As predicted by the GART article \cite{GART}, this leads to over-complex
code that is difficult to read and debug.
The original application code consists of two main classes. The ``multi-touch
server'' starts a ``TUIO server'' that translates TUIO events to ``point
\{down,move,up\}'' events. Detection of ``tap'' and ``double tap'' gestures is
performed immediately after an event is received. Other gesture detection runs
in a separate thread, using the following loop:
server'' starts a ``TUIO server'' that translates TUIO events to
``point\_\{down,move,up\}'' events. Detection of ``tap'' and ``double tap''
gestures is performed immediately after an event is received. Other gesture
detection runs in a separate thread, using the following loop:
\begin{verbatim}
60 times per second do:
    detect `single tap' based on the time since the latest `tap' gesture
@@ -856,8 +827,9 @@ class GtkEventWindow(Window):
The application window contains a number of polygons which can be dragged,
resized and rotated. Each polygon is represented by a separate event area to
allow simultaneous interaction with different polygons. The main window also
responds to transformation, by transforming all polygons. Additionally, double
tapping on a polygon changes its color.
responds to transformations by transforming all polygons. Additionally, tapping
on a polygon changes its color. Double tapping on the application window
toggles it between full screen and windowed mode.
An ``overlay'' event area is used to detect all fingers currently touching the
screen. The application defines a custom gesture tracker, called the ``hand
@@ -877,7 +849,7 @@ each finger to the hand it belongs to, as visible in figure \ref{fig:testapp}.
\end{figure}
To manage the propagation of events used for transformations and tapping, the
applications arranges its event areas in a tree structure as described in
application arranges its event areas in a tree structure as described in
section \ref{sec:tree}. Each transformable event area has its own
``transformation tracker'', which stops the propagation of events used for
transformation gestures. Because the propagation of these events is stopped,
@@ -909,7 +881,7 @@ list of finger locations, and the centroid of those locations.
When a new finger is detected on the touch surface (a \emph{point\_down} event),
the distance from that finger to all hand centroids is calculated. The hand to
which the distance is the shortest can be the hand that the finger belongs to.
which the distance is the shortest may be the hand that the finger belongs to.
If the distance is larger than the predefined distance threshold, the finger is
assumed to belong to a new hand and a \emph{hand\_down} gesture is triggered.
Otherwise,
the finger is assigned to the closest hand. In both cases, a
@@ -954,12 +926,12 @@ layer between the application window and its event area in the architecture.
This synchronization layer could be used in other applications that use GTK+.
The ``hand tracker'' used by the GTK+ application is not incorporated within
the architecture. The use of gesture trackers by the architecture allows he
application to add new gestures in a single line of code (see section
the architecture. The use of gesture trackers by the architecture allows the
application to add new gestures using a single line of code (see section
\ref{sec:tracker-registration}).
Apart from the synchronization of event areas with application widgets, both
applications have not trouble using the architecture implementation in
applications have no trouble using the architecture implementation in
combination with their application framework. Thus, the architecture can be
used alongside existing application frameworks.
@@ -970,9 +942,6 @@ To support different devices, there must be an abstraction of device drivers so
that gesture detection can be performed on a common set of low-level events.
This abstraction is provided by the event driver.
% By passing input from multiple drivers through the same event driver,
% multiple devices can be supported simultaneously.
Gestures must be able to occur within a certain area of a touch surface that is
covered by an application widget. Therefore, low-level events must be divided
into separate groups before any gesture detection is performed. Event areas
@@ -981,10 +950,11 @@ structure that can be synchronized with the widget tree of the application.
Some applications require the ability to handle an event exclusively for an
event area. An event propagation mechanism provides a solution for this: the
propagation of an event in the tree structure can be stopped after gesture
detection in an event area. Section \ref{sec:testapp} shows that the structure
of the event area tree is not necessarily equal to that of the application
widget tree. The design of the event area tree structure in complex situations
requires an understanding of event propagation by the application programmer.
detection in an event area. \\
Section \ref{sec:testapp} shows that the structure of the event area tree is
not necessarily equal to that of the application widget tree. The design of the
event area tree structure in complex situations requires an understanding of
event propagation by the application programmer.
The detection of complex gestures can be approached in several ways. If
explicit detection code for different gestures is not managed well, program code
@@ -993,9 +963,9 @@ of different types of gesture is separated into different gesture trackers,
reduces complexity and provides a way to extend a set of detection algorithms.
The use of gesture trackers is flexible, e.g. complex detection algorithms such
as machine learning can be used simultaneously with other gesture trackers that
use explicit detection. Also, the modularity of this design allows extension of
the set of supported gestures. Section \ref{sec:testapp} demonstrates this
extendability.
use explicit detection code. Also, the modularity of this design allows
extension of the set of supported gestures. Section \ref{sec:testapp}
demonstrates this extensibility.
A truly generic architecture should provide a communication interface that
supports multiple programming languages. A daemon implementation as
@@ -1011,7 +981,7 @@ application frameworks.
As mentioned in section \ref{sec:areas}, the concept of an event area is based
on the assumption that the set of originating events that form a particular
gesture, can be determined based exclusively on the location of the events.
gesture can be determined exclusively based on the location of the events.
Since this thesis focuses on multi-touch surface-based devices, and every
object on a multi-touch surface has a position, this assumption is valid.
However, the design of the architecture is meant to be more generic; to provide
@@ -1021,15 +991,15 @@ An in-air gesture detection device, such as the Microsoft Kinect \cite{kinect},
provides 3D positions. Some multi-touch tables work with a camera that can also
determine the shape and rotational orientation of objects touching the surface.
For these devices, events delegated by the event driver have more parameters
than a 2D position alone. The term ``area'' is not suitable to describe a group
of events that consist of these parameters.
than a 2D position alone. The term ``event area'' is not suitable to describe a
group of events that carry these parameters.
A more generic term for a component that groups similar events is the
A more generic term for a component that groups similar events is an
\emph{event filter}. The concept of an event filter is based on the same
principle as event areas, which is the assumption that gestures are formed from
a subset of all events. However, an event filter takes all parameters of an
event into account. An application on the camera-based multi-touch table could
be to group all objects that are triangular into one filter, and all
a subset of all low-level events. However, an event filter takes all parameters
of an event into account. One application on the camera-based multi-touch table
could be to group all triangular objects into one filter, and all
rectangular objects into another; or to separate small fingertips from large
ones, to recognize whether a child or an adult touches the table.
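A sketch of this idea; the event attributes \texttt{shape} and
\texttt{contact\_size}, and the size threshold, are assumptions about what
such a device's event driver could provide:

\begin{verbatim}
class EventFilter:
    """Groups events by an arbitrary predicate instead of a screen area."""
    def __init__(self, predicate):
        self.predicate = predicate
        self.trackers = []

    def matches(self, event):
        return self.predicate(event)

# Group objects by detected shape on a camera-based table:
triangles = EventFilter(lambda e: e.shape == "triangle")
rectangles = EventFilter(lambda e: e.shape == "rectangle")

# Or separate small fingertips from large ones (sizes are invented):
children = EventFilter(lambda e: e.contact_size < 1.0)
adults = EventFilter(lambda e: e.contact_size >= 1.0)
\end{verbatim}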
@@ -1039,8 +1009,8 @@ All gesture trackers in the reference implementation are based on the explicit
analysis of events. Gesture detection is a widely researched subject, and the
separation of detection logic into different trackers allows for multiple types
of gesture detection in the same architecture. An interesting question is
whether multi-touch gestures can be described in a formal way so that explicit
detection code can be avoided.
whether multi-touch gestures can be described in a formal, generic way so that
explicit detection code can be avoided.
\cite{GART} and \cite{conf/gw/RigollKE97} propose the use of machine learning
to recognize gestures. To use machine learning, a set of input events forming a
@@ -1054,9 +1024,9 @@ the performance of feature vector-based machine learning is dependent on the
quality of the learning set.
A better method to describe a gesture might be to specify its features as a
``signature''. The parameters of such a signature must be be based on input
``signature''. The parameters of such a signature must be based on low-level
events. When a set of input events matches the signature of some gesture, the
gesture is be triggered. A gesture signature should be a complete description
gesture can be triggered. A gesture signature should be a complete description
of all requirements the set of events must meet to form the gesture.
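What such a signature could look like is an open question; one conceivable
encoding for a \emph{tap}, with invented thresholds:

\begin{verbatim}
from collections import namedtuple

Event = namedtuple("Event", "x y time")

# A hypothetical declarative signature for a "tap" gesture: one touch
# point that goes down and up within 0.3 s, moving less than 10 pixels.
TAP_SIGNATURE = {
    "sequence": ["point_down", "point_up"],
    "max_duration": 0.3,   # seconds between down and up
    "max_movement": 10,    # pixels between down and up positions
}

def matches_tap(down, up, sig=TAP_SIGNATURE):
    """Check a (point_down, point_up) event pair against the signature."""
    moved = ((up.x - down.x) ** 2 + (up.y - down.y) ** 2) ** 0.5
    return (up.time - down.time <= sig["max_duration"]
            and moved <= sig["max_movement"])

assert matches_tap(Event(100, 100, 0.00), Event(103, 101, 0.15))
assert not matches_tap(Event(100, 100, 0.00), Event(180, 90, 0.15))
\end{verbatim}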
A way to describe signatures on a multi-touch surface could be through the use of a
@@ -1091,19 +1061,19 @@ implementation does not support network communication. If the architecture
design is to become successful in the future, the implementation of network
communication is essential. ZeroMQ (or $\emptyset$MQ) \cite{ZeroMQ} is a
high-performance software library with support for a wide range of programming
languages. A good basis for a future implementation could use this library as
the basis for its communication layer.
If an implementation of the architecture will be released, a good idea would be
to do so within a community of application developers. A community can
contribute to a central database of gesture trackers, making the interaction
from their applications available for use in other applications.
languages. A future implementation can use this library as the basis for its
communication layer.
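With pyzmq, ZeroMQ's Python binding, the two halves of such a communication
layer might look as follows; the port number and message format are invented,
and both halves would run in separate processes:

\begin{verbatim}
import zmq

context = zmq.Context()

# Daemon side: publish detected gestures to subscribed applications.
publisher = context.socket(zmq.PUB)
publisher.bind("tcp://*:5556")
publisher.send_json({"gesture": "tap", "x": 120, "y": 80})

# Application side: subscribe to the daemon's gesture stream.
subscriber = context.socket(zmq.SUB)
subscriber.connect("tcp://localhost:5556")
subscriber.setsockopt_string(zmq.SUBSCRIBE, "")
gesture = subscriber.recv_json()
\end{verbatim}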
Ideally, a user can install a daemon process containing the architecture so
that it is usable for any gesture-based application on the device. Applications
that use the architecture can specify it as a software dependency, or
include it in a software distribution.
If a final implementation of the architecture is ever released, a good idea
would be to do so within a community of application developers. A community can
contribute to a central database of gesture trackers, making the gesture
interactions from their applications available for reuse in other applications.
\bibliographystyle{plain}
\bibliography{report}{}