Commit 6c3205c8 authored by Taddeüs Kroes's avatar Taddeüs Kroes

Finished version 2 of report.

parent 6c12d81a
......@@ -88,13 +88,15 @@ detection for every new gesture-based application.
applications, whose purpose is to test the effectiveness of the design and
detect its shortcomings.
Chapter \ref{chapter:related} describes related work that inspired a design
for the architecture. The design is presented in chapter
\ref{chapter:design}. Chapter \ref{chapter:testapps} presents a reference
implementation of the architecture, and two test case applications that
show the practical use of its components as presented in chapter
\ref{chapter:design}. Finally, some suggestions for future research on the
subject are given in chapter \ref{chapter:futurework}.
Chapter \ref{chapter:related} describes related work that inspired the
design of the architecture. The design is described in chapter
\ref{chapter:design}. Chapter \ref{chapter:implementation} presents a
reference implementation of the architecture. Two test case applications
show the practical use of the architecture components in chapter
\ref{chapter:test-applications}. Chapter \ref{chapter:conclusions}
formulates some conclusions about the architecture design and its
practicality. Finally, some suggestions for future research on the subject
are given in chapter \ref{chapter:futurework}.
\chapter{Related work}
\label{chapter:related}
......@@ -207,16 +209,16 @@ detection for every new gesture-based application.
The TUIO protocol \cite{TUIO} is an example of a driver that can be used by
multi-touch devices. TUIO uses ALIVE and SET messages to communicate
low-level touch events (see appendix \ref{app:tuio} for more details).
These messages are specific to the API of the TUIO protocol. Other drivers
may use different messages types. To support more than one driver in the
architecture, there must be some translation from device-specific messages
to a common format for primitive touch events. After all, the gesture
detection logic in a ``generic'' architecture should not be implemented
based on device-specific messages. The event types in this format should be
chosen so that multiple drivers can trigger the same events. If each
supported driver would add its own set of event types to the common format,
the purpose of it being ``common'' would be defeated.
low-level touch events (section~\ref{sec:tuio} describes these in more
detail). These messages are specific to the API of the TUIO protocol.
Other drivers may use different message types. To support more than one
driver in the architecture, there must be some translation from
device-specific messages to a common format for primitive touch events.
After all, the gesture detection logic in a ``generic'' architecture should
not be implemented based on device-specific messages. The event types in
this format should be chosen so that multiple drivers can trigger the same
events. If each supported driver added its own set of event types to the
common format, the purpose of it being ``common'' would be defeated.
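For illustration, such a common format could be as small as a single event
object with a type and a position. The following sketch uses assumed names
(\texttt{TouchEvent} and the \emph{point\_down}, \emph{point\_move},
\emph{point\_up} event types used elsewhere in this report); it is not the API
of any particular driver:
\begin{verbatim}
# Sketch of a device-independent primitive touch event; the names are
# illustrative and only follow the event names used in this report.
class TouchEvent(object):
    # Primitive event types that any supported driver can trigger.
    TYPES = ("point_down", "point_move", "point_up")

    def __init__(self, event_type, x, y):
        assert event_type in TouchEvent.TYPES
        self.type = event_type
        self.x = x
        self.y = y
\end{verbatim}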
A minimal expectation for a touch device driver is that it detects simple
touch points, with a ``point'' being an object at an $(x, y)$ position on
......@@ -533,30 +535,22 @@ detection for every new gesture-based application.
%
%\examplediagram
\chapter{Implementation and test applications}
\label{chapter:testapps}
\chapter{Reference implementation}
\label{chapter:implementation}
A reference implementation of the design has been written in Python. Two test
applications have been created to test if the design ``works'' in a practical
application, and to detect its flaws. One application is mainly used to test
the gesture tracker implementations. The other application uses multiple event
areas in a tree structure, demonstrating event delegation and propagation. The
second application also defines a custom gesture tracker.
A reference implementation of the design has been written in Python and is
available at \cite{gitrepos}. The implementation does not include a network
protocol to support the daemon setup as described in section \ref{sec:daemon}.
Therefore, it is only usable in Python programs. The two test applications
described in chapter \ref{chapter:test-applications} are also written in
Python.
To test multi-touch interaction properly, a multi-touch device is required. The
University of Amsterdam (UvA) has provided access to a multi-touch table from
PQlabs. The table uses the TUIO protocol \cite{TUIO} to communicate touch
events. See appendix \ref{app:tuio} for details regarding the TUIO protocol.
events.
%The reference implementation and its test applications are a Proof of Concept,
%meant to show that the architecture design is effective.
%that translates TUIO messages to some common multi-touch gestures.
\section{Reference implementation}
\label{sec:implementation}
The reference implementation is written in Python and available at
\cite{gitrepos}. The following component implementations are included:
The implementation includes the following components:
\textbf{Event drivers}
\begin{itemize}
......@@ -579,17 +573,42 @@ The reference implementation is written in Python and available at
\item Transformation tracker, supports $rotate,~pinch,~drag,~flick$ gestures.
\end{itemize}
The implementation does not include a network protocol to support the daemon
setup as described in section \ref{sec:daemon}. Therefore, it is only usable in
Python programs. The two test programs are also written in Python.
The reference implementation also contains some geometric functions that are
used by several event area implementations. The event area implementations are
trivial by name and are therefore not discussed in this report.
All gesture trackers have been implemented using an imperative programming
style. Section \ref{sec:tracker-registration} shows how gesture trackers can be
added to the architecture. Sections \ref{sec:basictracker} to
\ref{sec:transformationtracker} describe the gesture tracker implementations in
detail. The implementation of the TUIO event driver is described in section
\ref{sec:tuio}.
\section{Gesture tracker registration}
\label{sec:tracker-registration}
When a gesture handler is added to an event area by an application, the event
area must create a gesture tracker that detects the corresponding gesture. To
do this, the architecture must be aware of the existing gesture trackers and
the gestures they support. The architecture provides a registration system for
gesture trackers. A gesture tracker implementation contains a list of supported
gesture types. These gesture types are mapped to the gesture tracker class by
the registration system. When an event area needs to create a gesture tracker
for a gesture type that is not yet being detected, the class of the newly
created gesture tracker is loaded from this map. Registration of a gesture
tracker is straightforward, as shown by the following Python code:
\begin{verbatim}
# Create a gesture tracker implementation
class TapTracker(GestureTracker):
    supported_gestures = ["tap", "single_tap", "double_tap"]
    # Methods for gesture detection go here
The event area implementations contain some geometric functions to determine
whether an event should be delegated to an event area. All gesture trackers
have been implemented using an imperative programming style. Sections
\ref{sec:basictracker} to \ref{sec:transformationtracker} describe the gesture
tracker implementations in detail.
# Register the gesture tracker with the architecture
register_tracker(TapTracker)
\end{verbatim}
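Internally, the registration system can be thought of as a map from gesture
type to gesture tracker class. The following sketch illustrates that idea;
apart from \texttt{register\_tracker} and \texttt{supported\_gestures}, the
names are assumptions rather than the actual implementation:
\begin{verbatim}
# Sketch: a map from gesture type to the tracker class that detects it.
_tracker_classes = {}

def register_tracker(tracker_class):
    for gesture_type in tracker_class.supported_gestures:
        _tracker_classes[gesture_type] = tracker_class

def create_tracker(gesture_type):
    # Used by an event area when a handler is bound to a gesture type
    # that is not yet being detected in that area.
    return _tracker_classes[gesture_type]()
\end{verbatim}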
\subsection{Basic tracker}
\section{Basic tracker}
\label{sec:basictracker}
The ``basic tracker'' implementation exists only to provide access to low-level
......@@ -598,7 +617,7 @@ trackers, not by the application itself. Therefore, the basic tracker maps
\emph{point\_\{down,move,up\}} events to equally named gestures that can be
handled by the application.
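Conceptually, the basic tracker is little more than a pass-through. The sketch
below illustrates this mapping; the method names (\texttt{on\_event},
\texttt{trigger\_gesture}) are assumptions, not the exact API of the reference
implementation:
\begin{verbatim}
# Sketch of the pass-through behaviour of the basic tracker.
class BasicTracker(GestureTracker):
    supported_gestures = ["point_down", "point_move", "point_up"]

    def on_event(self, event):
        # Forward each low-level event as an equally named gesture.
        self.trigger_gesture(event.type, event)
\end{verbatim}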
\subsection{Tap tracker}
\section{Tap tracker}
\label{sec:taptracker}
The ``tap tracker'' detects three types of tap gestures:
......@@ -632,7 +651,7 @@ regular \emph{tap} gesture, since the first \emph{tap} gesture has already been
handled by the application when the second \emph{tap} of a \emph{double tap}
gesture is triggered.
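The timing logic for \emph{tap} and \emph{double tap} can be summarised by the
following sketch; the threshold value and the names are assumptions that only
illustrate the idea, and the \emph{single tap} timeout handled elsewhere is
omitted:
\begin{verbatim}
import time

DOUBLE_TAP_TIME = 0.3  # assumed threshold in seconds

class TapTimer(object):
    def __init__(self):
        self.last_tap = None

    def on_tap(self, trigger):
        # Trigger a regular tap immediately; if it follows a previous
        # tap quickly enough, also trigger a double tap.
        now = time.time()
        trigger("tap")
        if self.last_tap is not None and now - self.last_tap < DOUBLE_TAP_TIME:
            trigger("double_tap")
            self.last_tap = None
        else:
            self.last_tap = now
\end{verbatim}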
\subsection{Transformation tracker}
\section{Transformation tracker}
\label{sec:transformationtracker}
The transformation tracker triggers \emph{rotate}, \emph{pinch}, \emph{drag}
......@@ -668,6 +687,64 @@ The angle used for the \emph{rotate} gesture is only divided by the number of
touch points to obtain an average rotation of all touch points:
$$rotate.angle = \frac{\alpha}{N}$$
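As an illustration of this formula, the computation could be sketched as
follows (the function and variable names are illustrative, not those of the
actual tracker):
\begin{verbatim}
from math import atan2

def average_rotation(old_positions, new_positions, centroid):
    # Sum the angular displacement of each touch point around the
    # centroid, then divide by the number of touch points (alpha / N).
    cx, cy = centroid
    alpha = 0.0
    for (ox, oy), (nx, ny) in zip(old_positions, new_positions):
        alpha += atan2(ny - cy, nx - cx) - atan2(oy - cy, ox - cx)
    return alpha / len(new_positions)
\end{verbatim}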
\section{The TUIO event driver}
\label{sec:tuio}
The TUIO protocol \cite{TUIO} defines a way to geometrically describe tangible
objects, such as fingers or objects on a multi-touch table. Object information
is sent to the TUIO UDP port (3333 by default). For efficiency reasons, the
TUIO protocol is encoded using the Open Sound Control \cite[OSC]{OSC} format.
An OSC server/client implementation is available for Python: pyOSC
\cite{pyOSC}.
A Python implementation of the TUIO protocol also exists: pyTUIO \cite{pyTUIO}.
However, a bug causes the execution of an example script to yield an error in
Python's built-in \texttt{socket} library. Therefore, the TUIO event driver
receives TUIO messages at a lower level, using the pyOSC package to receive
TUIO messages.
The two most important message types of the protocol are ALIVE and SET
messages. An ALIVE message contains the list of ``session'' ids that are
currently ``active'', which in the case of a multi-touch table means that they
are touching the touch surface. A SET message provides geometric information
about a session, such as position, velocity and acceleration. Each session
represents an object touching the touch surface. The only type of object on the
multi-touch table is what the TUIO protocol calls ``2DCur'': an $(x, y)$
position on the touch surface.
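For the ``2DCur'' profile, the TUIO specification defines roughly the following
message layout (shown here only for illustration; $s$ is a session id, $x$ and
$y$ the position, $X$ and $Y$ the velocity vector and $m$ the motion
acceleration):
\begin{verbatim}
/tuio/2Dcur alive s_id0 ... s_idN
/tuio/2Dcur set s x y X Y m
\end{verbatim}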
ALIVE messages can be used to determine when an object touches and releases the
screen. For example, if a session id was present in the previous message but
not in the current one, the object it represents has been lifted from the
screen. SET messages
provide information about movement. In the case of simple (x, y) positions,
only the movement vector of the position itself can be calculated. For more
complex objects such as fiducials, arguments like rotational position and
acceleration are also included. ALIVE and SET messages are combined to create
\emph{point\_down}, \emph{point\_move} and \emph{point\_up} events by the TUIO
event driver.
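The essence of this combination is a comparison of consecutive ALIVE messages.
The sketch below illustrates the idea with assumed names
(\texttt{handle\_alive}, \texttt{trigger}); it is not the literal driver code:
\begin{verbatim}
def handle_alive(previous_ids, alive_ids, positions, trigger):
    # Sketch: derive point_down and point_up events from the difference
    # between the previous and the current set of session ids. The
    # positions originate from earlier SET messages.
    for sid in alive_ids - previous_ids:
        trigger("point_down", positions.get(sid))
    for sid in previous_ids - alive_ids:
        trigger("point_up", positions.pop(sid, None))
    return alive_ids
\end{verbatim}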
TUIO coordinates range from $0.0$ to $1.0$, with $(0.0, 0.0)$ being the left
top corner of the touch surface and $(1.0, 1.0)$ the right bottom corner. The
TUIO event driver scales these to pixel coordinates so that event area
implementations can use pixel coordinates to determine whether an event is
located within them. This transformation is also mentioned by the online
TUIO specification \cite{TUIO_specification}:
\begin{quote}
In order to compute the X and Y coordinates for the 2D profiles a TUIO
tracker implementation needs to divide these values by the actual sensor
dimension, while a TUIO client implementation consequently can scale these
values back to the actual screen dimension.
\end{quote}
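In the event driver this scaling amounts to a simple multiplication, as in the
following sketch (the function name is illustrative):
\begin{verbatim}
def to_pixel_coordinates(x, y, surface_width, surface_height):
    # TUIO coordinates are normalised to the range [0.0, 1.0];
    # scale them to the pixel dimensions of the touch surface.
    return int(x * surface_width), int(y * surface_height)
\end{verbatim}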
\chapter{Test applications}
\label{chapter:test-applications}
Two test case applications have been created to test whether the design
``works'' in a practical application, and to detect its flaws. The first
application is mainly used to test the gesture tracker implementations. The
second application uses multiple event areas in a tree structure, demonstrating
event delegation and propagation. It also defines a custom gesture tracker.
\section{Full screen Pygame application}
%The goal of this application was to experiment with the TUIO
......@@ -692,7 +769,6 @@ server'' starts a ``TUIO server'' that translates TUIO events to ``point
\{down,move,up\}'' events. Detection of ``tap'' and ``double tap'' gestures is
performed immediately after an event is received. Other gesture detection runs
in a separate thread, using the following loop:
\begin{verbatim}
60 times per second do:
    detect `single tap' based on the time since the latest `tap' gesture
......@@ -718,7 +794,7 @@ would become more and more complex when extended with new gestures. The two
problems have been solved using event areas and gesture trackers from the
reference implementation. The gesture detection code has been separated into
two different gesture trackers, which are the ``tap'' and ``transformation''
trackers mentioned in section \ref{sec:implementation}.
trackers mentioned in chapter \ref{chapter:implementation}.
The positions of all touch objects and their centroid are drawn using the
Pygame library. Since the Pygame library does not provide support to find the
......@@ -732,10 +808,12 @@ the entire touch surface. The output of the application can be seen in figure
\begin{figure}[h!]
\center
\includegraphics[scale=0.4]{data/pygame_draw.png}
\caption{Output of the experimental drawing program. It draws all touch
points and their centroid on the screen (the centroid is used for rotation
and pinch detection). It also draws a green rectangle which responds to
rotation and pinch events.}
\caption{
Output of the experimental drawing program. It draws all touch points
and their centroid on the screen (the centroid is used for rotation and
pinch detection). It also draws a green rectangle which responds to
rotation and pinch events.
}
\label{fig:draw}
\end{figure}
......@@ -789,7 +867,7 @@ section \ref{sec:handtracker} for details). The application draws a line from
each finger to the hand it belongs to, as visible in figure \ref{fig:testapp}.
\begin{figure}[h!]
\center
\includegraphics[scale=0.32]{data/testapp.png}
\includegraphics[scale=0.35]{data/testapp.png}
\caption{
Screenshot of the second test application. Two polygons can be dragged,
rotated and scaled. Separate groups of fingers are recognized as hands,
......@@ -806,8 +884,6 @@ transformation gestures. Because the propagation of these events is stopped,
overlapping polygons do not cause a problem. Figure \ref{fig:testappdiagram}
shows the tree structure used by the application.
\testappdiagram
Note that the overlay event area, though covering the entire screen surface, is
not used as the root of the event area tree. Instead, the overlay is placed on
top of the application window (being a rightmost sibling of the application
......@@ -822,6 +898,8 @@ event area implementation delegates events to its children in right-to left
order, because areas that are added to the tree later are assumed to be
positioned over their previously added siblings.
\testappdiagram
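The following sketch illustrates this right-to-left traversal; the method names
(\texttt{delegate\_event}, \texttt{contains}, \texttt{handle\_event}) are
assumptions, not the exact API of the event area implementation:
\begin{verbatim}
def delegate_event(self, event):
    # Sketch: children added later are assumed to be positioned on top
    # of earlier siblings, so they are visited first (right-to-left).
    for child in reversed(self.children):
        if child.contains(event) and child.delegate_event(event):
            return True  # a child handled the event and stopped propagation
    return self.handle_event(event)
\end{verbatim}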
\subsection{Hand tracker}
\label{sec:handtracker}
......@@ -852,20 +930,38 @@ hand is removed with a \emph{hand\_up} gesture.
\section{Results}
\label{sec:results}
% TODO: Evaluate whether the implementation and test applications meet the
% expectations/requirements stated in the design.
% - multiple frameworks
% - the event area serves its purpose
% - The tree structure works well, you just have to be creative in how you
%   construct it
% - Some work goes into writing the synchronization layer between the
%   architecture and the application framework, but it is reusable in the
%   form of a fixed set of classes
% - gesture trackers keep the code cleanly separated, and a test application
%   can easily add its own tracker
The test applications show that the architecture implementation can be used
alongside existing application frameworks.
The Pygame application is based on existing program code, which has been
broken up into the components of the architecture. The application incorporates
the most common multi-touch gestures, such as tapping and transformation
gestures. All features from the original application are still supported in the
revised application, so the component-based architecture design is not a
limiting factor. On the contrary, the program code has become more maintainable
and extensible due to the modular setup. The gesture tracker-based
design has even allowed the detection of tap and transformation gestures to be
moved to the reference implementation of the architecture, whereas it was
originally part of the test application.
The GTK+ application uses a more extended tree structure to arrange its event
areas, so that it can use the powerful concept of event propagation. The
application does show that the construction of such a tree is not always
straightforward: the ``overlay'' event area covers the entire touch surface,
but is not the root of the tree. Designing the tree structure requires an
understanding of event propagation by the application developer.
Some work goes into the synchronization of application widgets with their event
areas. The GTK+ application defines a class that acts as a synchronization
layer between the application window and its event area in the architecture.
This synchronization layer could be used in other applications that use GTK+.
The ``hand tracker'' used by the GTK+ application is not incorporated within
the architecture. The use of gesture trackers by the architecture allows the
application to add new gestures in a single line of code (see section
\ref{sec:tracker-registration}).
Apart from the synchronization of event areas with application widgets, both
applications have no trouble using the architecture implementation in
combination with their application framework. Thus, the architecture can be
used alongside existing application frameworks.
\chapter{Conclusions}
\label{chapter:conclusions}
......@@ -885,16 +981,21 @@ structure that can be synchronized with the widget tree of the application.
Some applications require the ability to handle an event exclusively for an
event area. An event propagation mechanism provides a solution for this: the
propagation of an event in the tree structure can be stopped after gesture
detection in an event area.
detection in an event area. Section \ref{sec:testapp} shows that the structure
of the event area tree is not necessarily equal to that of the application
widget tree. The design of the event area tree structure in complex situations
requires an understanding of event propagation by the application programmer.
The detection of complex gestures can be approached in several ways. If
explicit detection code for different gestures is not managed well, program code
can become needlessly complex. A tracker-based design, in which the detection
of different types of gesture is separated into different gesture trackers,
reduces complexity and provides a way to extend a set of detection algorithms.
A gesture trackers implementation is flexible, e.g. complex detection
algorithms such as machine learning can be used simultaneously with other
gesture trackers that use explicit detection.
The use of gesture trackers is flexible, e.g. complex detection algorithms such
as machine learning can be used simultaneously with other gesture trackers that
use explicit detection. Also, the modularity of this design allows extension of
the set of supported gestures. Section \ref{sec:testapp} demonstrates this
extensibility.
% TODO: Daemon implementation
......@@ -1004,56 +1105,4 @@ include it in a software distribution.
\bibliographystyle{plain}
\bibliography{report}{}
\appendix
\chapter{The TUIO protocol}
\label{app:tuio}
The TUIO protocol \cite{TUIO} defines a way to geometrically describe tangible
objects, such as fingers or objects on a multi-touch table. Object information
is sent to the TUIO UDP port (3333 by default).
For efficiency reasons, the TUIO protocol is encoded using the Open Sound
Control \cite[OSC]{OSC} format. An OSC server/client implementation is
available for Python: pyOSC \cite{pyOSC}.
A Python implementation of the TUIO protocol also exists: pyTUIO \cite{pyTUIO}.
However, the execution of an example script yields an error regarding Python's
built-in \texttt{socket} library. Therefore, the reference implementation uses
the pyOSC package to receive TUIO messages.
The two most important message types of the protocol are ALIVE and SET
messages. An ALIVE message contains the list of session id's that are currently
``active'', which in the case of a multi-touch table means that they are
touching the screen. A SET message provides geometric information of a session
id, such as position, velocity and acceleration.
Each session id represents an object. The only type of objects on the
multi-touch table are what the TUIO protocol calls ``2DCur'', which is a (x, y)
position on the screen.
ALIVE messages can be used to determine when an object touches and releases the
screen. For example, if a session id was in the previous message but not in the
current, the object it represents has been lifted from the screen.
SET messages provide information about movement. In the case of simple (x, y) positions,
only the movement vector of the position itself can be calculated. For more
complex objects such as fiducials, arguments like rotational position and
acceleration are also included.
ALIVE and SET messages can be combined to create ``point down'', ``point move''
and ``point up'' events.
TUIO coordinates range from $0.0$ to $1.0$, with $(0.0, 0.0)$ being the left
top corner of the screen and $(1.0, 1.0)$ the right bottom corner. To focus
events within a window, a translation to window coordinates is required in the
client application, as stated by the online specification
\cite{TUIO_specification}:
\begin{quote}
In order to compute the X and Y coordinates for the 2D profiles a TUIO
tracker implementation needs to divide these values by the actual sensor
dimension, while a TUIO client implementation consequently can scale these
values back to the actual screen dimension.
\end{quote}
\end{document}