@@ -9,7 +9,7 @@
\hypersetup{colorlinks=true,linkcolor=black,urlcolor=blue,citecolor=DarkGreen}

% Title Page
-\title{A universal detection mechanism for multi-touch gestures}
+\title{A generic architecture for the detection of multi-touch gestures}
\author{Taddeüs Kroes}
\supervisors{Dr. Robert G. Belleman (UvA)}
\signedby{Dr. Robert G. Belleman (UvA)}
@@ -47,50 +47,44 @@ provides the application framework here, it is undesirable to use an entire
framework like Qt simultaneously only for its multi-touch support.

% Rough goal
-The goal of this project is to define a universal multi-touch event triggering
-mechanism. To test the definition, a reference implementation is written in
+The goal of this project is to define a generic multi-touch event triggering
+architecture. To test the definition, a reference implementation is written in
Python.

-% Setting
-To test multi-touch interaction properly, a multi-touch device is required.
-The University of Amsterdam (UvA) has provided access to a multi-touch table
-from PQlabs. The table uses the TUIO protocol \cite{TUIO} to communicate touch
-events.
-
\section{Definition of the problem}

    % Main question
-    The goal of this thesis is to create a multi-touch event triggering mechanism
-    for use in a VTK interactor. The design of the mechanism must be universal.
+    The goal of this thesis is to create a generic architecture for a
+    multi-touch event triggering mechanism for use in multi-touch applications.

    % Subquestions
-    To design such a mechanism properly, the following questions are relevant:
+    To design such an architecture properly, the following questions are relevant:
    \begin{itemize}
-        \item What is the input of the mechanism? Different touch drivers have
-            different API's. To be able to support different drivers (which is
-            highly desirable), there should probably be a translation from the
+        \item What is the input of the architecture? Different touch drivers
+            have different APIs. To be able to support different drivers
+            (which is highly desirable), there should be a translation from the
            driver API to a fixed input format.
-        \item How can extendability be accomplished? The set of supported events
-            should not be limited to a single implementation, but an application
-            should be able to define its own custom events.
-        \item How can the mechanism be used by different programming languages?
-            A universal mechanism should not be limited to be used in only one
-            language.
-        \item Can events be shared with multiple processes at the same time? For
-            example, a network implementation could run as a service instead of
-            within a single application, triggering events in any application that
-            needs them.
+        \item How can extendability be accomplished? The set of supported
+            events should not be limited to a single implementation, but an
+            application should be able to define its own custom events.
+        \item How can the architecture be used by different programming
+            languages? A generic architecture should not be limited to use
+            in only one language.
+        \item Can events be shared with multiple processes at the same time?
+            For example, a network implementation could run as a service
+            instead of within a single application, triggering events in any
+            application that needs them.
        % FIXME: are we still going to do anything with the item below?
        %\item Is performance an issue? For example, an event loop with rotation
        %    detection could swallow up more processing resources than desired.
-        \item How can the mechanism be integrated in a VTK interactor?
+        %\item How can the architecture be integrated in a VTK interactor?
    \end{itemize}

    % Scope
-    The scope of this thesis includes the design of a universal multi-touch
-    triggering mechanism, a reference implementation of this design, and its
-    integration into a VTK interactor. To be successful, the design should
-    allow for extensions to be added to any implementation.
+    The scope of this thesis includes the design of a generic multi-touch
+    triggering architecture, a reference implementation of this design, and its
+    integration into a test case application. To be successful, the design
+    should allow for extensions to be added to any implementation.

    The reference implementation is a Proof of Concept that translates TUIO
    events to some simple touch gestures that are used by a VTK interactor.
@@ -99,7 +93,7 @@ events.

\section{Structure of this document}

-    % TODO: only once it is finished
+    % TODO: only once the thesis is finished

\chapter{Related work}

@@ -144,7 +138,7 @@ events.

\section{Processing implementation of simple gestures in Android}

-    An implementation of a detection mechanism for some simple multi-touch
+    An implementation of a detection architecture for some simple multi-touch
    gestures (tap, double tap, rotation, pinch and drag) using
    Processing\footnote{Processing is a Java-based development environment with
    an export possibility for Android. See also \url{http://processing.org/}.}
@@ -155,69 +149,13 @@ events.
    the complexity of this class would increase to an undesirable level (as
    predicted by the GART article \cite{GART}). However, the detection logic
    itself is partially re-used in the reference implementation of the
-    universal gesture detection mechanism.
-
-\chapter{Preliminary}
-
-    \section{The TUIO protocol}
-    \label{sec:tuio}
-
-    The TUIO protocol \cite{TUIO} defines a way to geometrically describe
-    tangible objects, such as fingers or fiducials on a multi-touch table. The
-    table used for this thesis uses the protocol in its driver. Object
-    information is sent to the TUIO UDP port (3333 by default).
-
-    For efficiency reasons, the TUIO protocol is encoded using the Open Sound
-    Control \cite[OSC]{OSC} format. An OSC server/client implementation is
-    available for Python: pyOSC \cite{pyOSC}.
-
-    A Python implementation of the TUIO protocol also exists: pyTUIO
-    \cite{pyTUIO}. However, the execution of an example script yields an error
-    regarding Python's built-in \texttt{socket} library. Therefore, the
-    reference implementation uses the pyOSC package to receive TUIO messages.
-
-    The two most important message types of the protocol are ALIVE and SET
-    messages. An ALIVE message contains the list of session id's that are
-    currently ``active'', which in the case of multi-touch a table means that
-    they are touching the screen. A SET message provides geometric information
-    of a session id, such as position, velocity and acceleration.
-
-    Each session id represents an object. The only type of objects on the
-    multi-touch table are what the TUIO protocol calls ``2DCur'', which is a
-    (x, y) position on the screen.
-
-    ALIVE messages can be used to determine when an object touches and releases
-    the screen. For example, if a session id was in the previous message but
-    not in the current, The object it represents has been lifted from the
-    screen.
-
-    SET provide information about movement. In the case of simple (x, y)
-    positions, only the movement vector of the position itself can be
-    calculated. For more complex objects such as fiducials, arguments like
-    rotational position is also included.
-
-    ALIVE and SET messages can be combined to create ``point down'', ``point
-    move'' and ``point up'' events (as used by the \cite[.NET
-    application]{win7touch}).
-
-    TUIO coordinates range from $0.0$ to $1.0$, with $(0.0, 0.0)$ being the
-    left top corner of the screen and $(1.0, 1.0)$ the right bottom corner. To
-    focus events within a window, a translation to window coordinates is
-    required in the client application, as stated by the online specification
-    \cite{TUIO_specification}:
-    \begin{quote}
-        In order to compute the X and Y coordinates for the 2D profiles a TUIO
-        tracker implementation needs to divide these values by the actual
-        sensor dimension, while a TUIO client implementation consequently can
-        scale these values back to the actual screen dimension.
-    \end{quote}
+    generic gesture detection architecture.
+
+    \section{Analysis}

-    \section{The Visualization Toolkit}
-    \label{sec:vtk}

-    % TODO

-\chapter{Experiments}
+\chapter{Problem analysis}

    % test implementation with taps, rotation and pinch. This showed:
    % - that there are different ways to detect e.g. "rotation"
@@ -235,6 +173,17 @@ events.

    % Proof of Concept: VTK interactor

+    \section{Introduction}
+
+    % TODO
+    TODO: goal of the experiment
+
+    To test multi-touch interaction properly, a multi-touch device is required.
+    The University of Amsterdam (UvA) has provided access to a multi-touch
+    table from PQlabs. The table uses the TUIO protocol \cite{TUIO} to
+    communicate touch events. See appendix \ref{app:tuio} for details regarding
+    the TUIO protocol.
+
    \section{Experimenting with TUIO and event bindings}
    \label{sec:experimental-draw}

@@ -260,7 +209,7 @@ events.
    \end{figure}

    One of the first observations is the fact that TUIO's \texttt{SET} messages
-    use the TUIO coordinate system, as described in section \ref{sec:tuio}.
+    use the TUIO coordinate system, as described in appendix \ref{app:tuio}.
    The test program multiplies these with its own dimensions, thus showing the
    entire screen in its window. Also, the implementation only works using the
    TUIO protocol. Other drivers are not supported.
@@ -277,16 +226,13 @@ events.
    using all current touch points, there cannot be two or more rotation or
    pinch gestures simultaneously. On a large multi-touch table, it is
    desirable to support interaction with multiple hands, or multiple persons,
-    at the same time.
+    at the same time. Application-specific requirements like these should be
+    defined in the application itself, whereas the experimental implementation
+    defines detection algorithms based on its test program.

    Also, the different detection algorithms are all implemented in the same
    file, making it complex to read or debug, and difficult to extend.

-    \section{VTK interactor}
-
-    % TODO
-    % VTK has its own pipeline, the mechanism must run alongside it
-
    \section{Summary of observations}
    \label{sec:observations}

@@ -297,7 +243,8 @@ events.
        \item Gestures that use multiple touch points are using all touch
            points (not a subset of them).
        \item Code complexity increases when detection algorithms are added.
-        \item % TODO: VTK interactor observations
+        \item A multi-touch application can have very specific requirements for
+            gestures.
    \end{itemize}

    % -------
@@ -319,22 +266,26 @@ events.
            that can be used in gesture detection algorithms.
        % assigning events to a GUI window (windows)
        \item An application GUI window should be able to receive only events
-            occuring within that window, and not outside of it.
+            occurring within that window, and not outside of it.
        % separating groups of touch points for different gestures (windows)
        \item To support multiple objects that are performing different
-            gestures at the same time, the mechanism must be able to perform
+            gestures at the same time, the architecture must be able to perform
            gesture detection on a subset of the active touch points.
        % separating detection code for different gesture types
        \item To avoid an increase in code complexity when adding new detection
            algorithms, detection code of different gesture types must be
            separated.
+        % extendability
+        \item The architecture should allow for extension with new detection
+            algorithms to be added to an implementation. This enables a
+            programmer to define custom gestures for an application.
    \end{itemize}

    \section{Components}

    Based on the requirements from section \ref{sec:requirements}, a design
-    for the mechanism has been created. The design consists of a number of
-    components, each having a specific set of tasks.
+    for the architecture has been created. The design consists of a number
+    of components, each having a specific set of tasks.

    \subsection{Event server}

@@ -353,11 +304,12 @@ events.
        placed on the screen, moving along the surface of the screen, and being
        released from the screen.

-        A more extended set could also contain the same three events for a
-        surface touching the screen. However, a surface can have a rotational
-        property, like the ``fiducials'' type in the TUIO protocol. This
-        results in as $\{point\_down, point\_move, point\_up, surface\_down,
-        surface\_move, surface\_up,\\surface\_rotate\}$.
+        A more extended set could also contain the same three events for an
+        object touching the screen. However, an object can also have a
+        rotational property, like the ``fiducials'' type in the TUIO protocol.
+        This results in $\{point\_down, point\_move, point\_up, object\_down,
+        object\_move, object\_up,\\object\_rotate\}$.
+        % TODO: is this useful? merge point_down/object_down in some way?
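The extended event set above can be sketched as a small Python module. This is a hypothetical illustration; the type and field names are assumptions for this sketch, not the reference implementation's API:

```python
from dataclasses import dataclass
from enum import Enum

class EventType(Enum):
    """The extended event set described above."""
    POINT_DOWN = "point_down"
    POINT_MOVE = "point_move"
    POINT_UP = "point_up"
    OBJECT_DOWN = "object_down"
    OBJECT_MOVE = "object_move"
    OBJECT_UP = "object_up"
    OBJECT_ROTATE = "object_rotate"

@dataclass
class Event:
    """A single low-level event, in gesture server (pixel) coordinates."""
    type: EventType
    session_id: int
    x: float
    y: float
    angle: float = 0.0  # only meaningful for object rotation events
```

A fixed set of typed events like this is what lets different driver-specific event servers produce interchangeable output.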

        An important note here is that similar events triggered by different
        event servers must have the same event type and parameters. In other
@@ -367,8 +319,9 @@ events.
        The output of an event server implementation should also use a common
        coordinate system, that is the coordinate system used by the gesture
        server. For example, the reference implementation uses screen
-        coordinates in pixels, where (0, 0) is the upper left corner of the
-        screen.
+        coordinates in pixels, where (0, 0) is the upper left corner and
+        (\emph{screen width}, \emph{screen height}) the lower right corner of
+        the screen.

        The abstract class definition of the event server should provide some
        functionality to detect which driver-specific event server
@@ -376,11 +329,15 @@ events.

    \subsection{Gesture trackers}

-        A \emph{gesture tracker} detects a single gesture type, given a set of
-        touch points. If one group of points on the screen is assigned to one
-        tracker and another group to another tracker, multiple gestures, an be
-        detected at the same time. For this assignment, the mechanism uses
-        windows. These will be described in the next section.
+        Like \cite[the .NET implementation]{win7touch}, the architecture uses a
+        \emph{gesture tracker} to detect if a sequence of events forms a
+        particular gesture. A gesture tracker detects and triggers events for a
+        limited set of gesture types, given a set of touch points. If one group
+        of touch points is assigned to one tracker and another group to another
+        tracker, multiple gestures can be detected at the same time. For the
+        assignment of different groups of touch points to different gesture
+        trackers, the architecture uses so-called \emph{windows}. These are
+        described in the next section.
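As a rough illustration of this component, a tracker base class plus one concrete tracker might look as follows. This is a sketch under assumed names; the text prescribes only the concept (callback-based triggering, one tracker per gesture family), not this exact API:

```python
import math

class GestureTracker:
    """Base class: detects a limited set of gesture types on the touch
    points assigned to it, and triggers bound callbacks."""

    def __init__(self):
        self.handlers = {}

    def bind(self, gesture_type, callback):
        """Register a callback for a gesture type."""
        self.handlers.setdefault(gesture_type, []).append(callback)

    def trigger(self, gesture_type, **params):
        """Execute all callbacks bound to the given gesture type."""
        for callback in self.handlers.get(gesture_type, []):
            callback(**params)

class TapTracker(GestureTracker):
    """Example tracker: triggers a 'tap' when a point is released close
    to where it went down."""

    MAX_DISTANCE = 10.0  # pixels; illustrative threshold

    def __init__(self):
        super().__init__()
        self.down_positions = {}

    def on_point_down(self, session_id, x, y):
        self.down_positions[session_id] = (x, y)

    def on_point_up(self, session_id, x, y):
        x0, y0 = self.down_positions.pop(session_id, (x, y))
        if math.hypot(x - x0, y - y0) <= self.MAX_DISTANCE:
            self.trigger("tap", x=x, y=y)
```

Because each tracker lives in its own class (and file), adding a new gesture type does not touch existing detection code.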

        % event binding/triggering
        A gesture tracker triggers a gesture event by executing a callback.
@@ -402,7 +359,7 @@ events.
        trackers can be saved in different files, reducing the complexity of
        the code in a single file. \\
        % extendability
-        Because tacker defines its own set of gesture types, the application
+        Because a tracker defines its own set of gesture types, the application
        developer can define application-specific trackers (by extending a base
        \texttt{GestureTracker} class, for example). In fact, any built-in
        gesture trackers of an implementation are also created this way. This
@@ -415,7 +372,7 @@ events.
        A \emph{window} represents a subset of the entire screen surface. The
        goal of a window is to restrict the detection of certain gestures to
        certain areas. A window contains a list of touch points, and a list of
-        trackers. A window server (defined in the next section) assigns touch
+        trackers. A gesture server (defined in the next section) assigns touch
        points to a window, but the window itself defines functionality to
        check whether a touch point is inside the window. This way, new windows
        can be defined to fit over any 2D object used by the application.
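A minimal sketch of such a window, with a rectangular default and the containment check as the single overridable hook (class and attribute names are assumptions for illustration):

```python
class Window:
    """A rectangular screen area; detection by the attached trackers is
    restricted to the touch points inside it. Subclasses can override
    `contains` to fit any 2D shape used by the application."""

    def __init__(self, x, y, width, height):
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.points = []    # touch points assigned by the gesture server
        self.trackers = []  # gesture trackers operating on those points

    def contains(self, px, py):
        """Hit test in screen (pixel) coordinates."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)
```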
@@ -490,7 +447,12 @@ events.
start server
\end{verbatim}

-    \section{Network protocol}
+    \section{\emph{a conclusion is needed here that connects the components to the requirements(?)}}
+
+    % TODO
+    %
+
+    %\section{Network protocol}

    % TODO
    % use ZeroMQ for communication between multiple processes (in
@@ -503,6 +465,12 @@ events.

\chapter{Integration in VTK}

+    \section{The Visualization Toolkit}
+    \label{sec:vtk}
+
+    % TODO
+    % VTK has its own pipeline, the architecture must run alongside it
+
    % VTK interactor

%\chapter{Conclusions}
@@ -526,6 +494,56 @@ events.
\bibliographystyle{plain}
\bibliography{report}{}

-%\appendix
+\appendix
+
+\chapter{The TUIO protocol}
+\label{app:tuio}
+
+The TUIO protocol \cite{TUIO} defines a way to geometrically describe tangible
+objects, such as fingers or objects on a multi-touch table. Object information
+is sent to the TUIO UDP port (3333 by default).
+
+For efficiency reasons, the TUIO protocol is encoded using the Open Sound
+Control \cite[OSC]{OSC} format. An OSC server/client implementation is
+available for Python: pyOSC \cite{pyOSC}.
+
+A Python implementation of the TUIO protocol also exists: pyTUIO \cite{pyTUIO}.
+However, the execution of an example script yields an error regarding Python's
+built-in \texttt{socket} library. Therefore, the reference implementation uses
+the pyOSC package to receive TUIO messages.
+
+The two most important message types of the protocol are ALIVE and SET
+messages. An ALIVE message contains the list of session ids that are currently
+``active'', which in the case of a multi-touch table means that they are
+touching the screen. A SET message provides geometric information of a session
+id, such as position, velocity and acceleration.
+
+Each session id represents an object. The only type of objects on the
+multi-touch table are what the TUIO protocol calls ``2DCur'', which is an
+(x, y) position on the screen.
+
+ALIVE messages can be used to determine when an object touches and releases the
+screen. For example, if a session id was in the previous message but not in the
+current, the object it represents has been lifted from the screen.
+
+SET messages provide information about movement. In the case of simple (x, y)
+positions, only the movement vector of the position itself can be calculated.
+For more complex objects such as fiducials, arguments like rotational position
+are also included.
+
+ALIVE and SET messages can be combined to create ``point down'', ``point move''
+and ``point up'' events (as used by the \cite[.NET application]{win7touch}).
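Deriving down/up events from two consecutive ALIVE messages amounts to a set difference on the session id lists, e.g. (illustrative sketch, not the reference implementation's code):

```python
def diff_alive(previous, current):
    """Derive 'point down'/'point up' events from two consecutive ALIVE
    session id lists: new ids have touched the screen, missing ids have
    been lifted from it."""
    previous, current = set(previous), set(current)
    downs = sorted(current - previous)  # ids to emit 'point down' for
    ups = sorted(previous - current)    # ids to emit 'point up' for
    return downs, ups
```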
+
+TUIO coordinates range from $0.0$ to $1.0$, with $(0.0, 0.0)$ being the top
+left corner of the screen and $(1.0, 1.0)$ the bottom right corner. To focus
+events within a window, a translation to window coordinates is required in the
+client application, as stated by the online specification
+\cite{TUIO_specification}:
+\begin{quote}
+    In order to compute the X and Y coordinates for the 2D profiles a TUIO
+    tracker implementation needs to divide these values by the actual sensor
+    dimension, while a TUIO client implementation consequently can scale these
+    values back to the actual screen dimension.
+\end{quote}
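On the client side, the scaling described by the specification is a plain multiplication by the target dimensions, e.g. (sketch):

```python
def tuio_to_screen(x, y, screen_width, screen_height):
    """Scale normalized TUIO coordinates (0.0-1.0, origin at the top left)
    to pixel coordinates for a given screen size."""
    return x * screen_width, y * screen_height
```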

\end{document}