- \documentclass[twoside,openright]{uva-bachelor-thesis}
- \usepackage[english]{babel}
- \usepackage[utf8]{inputenc}
- \usepackage{hyperref,graphicx,float}
- % Link colors
- \hypersetup{colorlinks=true,linkcolor=black,urlcolor=blue,citecolor=DarkGreen}
- % Title Page
- \title{A universal detection mechanism for multi-touch gestures}
- \author{Taddeüs Kroes}
- \supervisors{Dr. Robert G. Belleman (UvA)}
- \signedby{Dr. Robert G. Belleman (UvA)}
- \begin{document}
- % Title page
- \maketitle
- \begin{abstract}
- % TODO
- \end{abstract}
- % Set paragraph indentation
- \parindent 0pt
- \parskip 1.5ex plus 0.5ex minus 0.2ex
- % Table of contents on separate page
- \tableofcontents
- \chapter{Introduction}
- % Rough problem statement
- Multi-touch interaction is becoming increasingly common, mostly due to the wide
- use of touch screens in phones and tablets. When programming applications using
- this method of interaction, the programmer needs an abstraction of the raw data
- provided by the touch driver of the device. This abstraction exists in several
- multi-touch application frameworks like Nokia's
- Qt\footnote{\url{http://qt.nokia.com/}}. However, applications that do not use
- these frameworks have no access to their multi-touch events.
- % Motivation
- This problem was observed during an attempt to create a multi-touch
- ``interactor'' class for the Visualization Toolkit (VTK \cite{VTK}). Because
- VTK provides the application framework here, it is undesirable to use an entire
- framework like Qt simultaneously only for its multi-touch support.
- % Rough goal
- The goal of this project is to define a universal multi-touch event triggering
- mechanism. To test the definition, a reference implementation is written in
- Python.
- % Setting
- To test multi-touch interaction properly, a multi-touch device is required.
- The University of Amsterdam (UvA) has provided access to a multi-touch table
- from PQlabs. The table uses the TUIO protocol \cite{TUIO} to communicate touch
- events.
- \section{Definition of the problem}
- % Main question
- The goal of this thesis is to create a multi-touch event triggering mechanism
- for use in a VTK interactor. The design of the mechanism must be universal.
- % Sub-questions
- To design such a mechanism properly, the following questions are relevant:
- \begin{itemize}
- \item What requirements must the mechanism meet to be universal?
- \item What is the input of the mechanism? Different touch drivers have
- different APIs. To be able to support different drivers (which is
- highly desirable), there should probably be a translation from the
- driver API to a fixed input format.
- \item How can extensibility be accomplished? The set of supported events
- should not be limited to a single implementation, but an application
- should be able to define its own custom events.
- \item Can events be shared with multiple processes at the same time? For
- example, a network implementation could run as a service instead of
- within a single application, triggering events in any application that
- needs them.
- \item Is performance an issue? For example, an event loop with rotation
- detection could swallow up more processing resources than desired.
- \end{itemize}
- % Scope
- The scope of this thesis includes the design of a multi-touch triggering
- mechanism, a reference implementation of this design, and its integration
- into a VTK interactor. To be successful, the design should allow for
- extensions to be added to any implementation. The reference implementation
- is a Proof of Concept that translates TUIO events to some simple touch
- gestures that are used by a VTK interactor.
- \section{Structure of this document}
- % TODO
- \chapter{Related work}
- \section{Gesture and Activity Recognition Toolkit}
- The Gesture and Activity Recognition Toolkit (GART) \cite{GART} is a
- toolkit for the development of gesture-based applications. The article
- states that the best way to classify gestures is to use machine learning.
- The programmer trains a program to recognize gestures using the machine
- learning library from the toolkit. The toolkit contains a callback mechanism that
- the programmer uses to execute custom code when a gesture is recognized.
- Though multi-touch input is not directly supported by the toolkit, the
- level of abstraction does allow for it to be implemented in the form of a
- ``touch'' sensor.
- The article motivates the use of machine learning by stating that gesture
- detection ``is likely to become increasingly complex and unmanageable'' when
- a set of predefined rules is used to decide whether some sensor input
- represents a specific gesture. This statement is not necessarily true. If
- the programmer is given a way to separate the detection of different types
- of gestures, and flexibility in rule definitions, over-complexity can be
- avoided.
- % solution: trackers, e.g. separate TapTracker and TransformationTracker
- \section{Gesture recognition software for Windows 7}
- % TODO
- The online article \cite{win7touch} presents a Windows 7 application,
- written in Microsoft's .NET. The application shows detected gestures in a
- canvas. Gesture trackers keep track of stylus locations to detect specific
- gestures. The event types required to track a touch stylus are ``stylus
- down'', ``stylus move'' and ``stylus up'' events. A
- \texttt{GestureTrackerManager} object dispatches these events to gesture
- trackers. The application supports a limited number of pre-defined
- gestures.
- An important observation in this application is that different gestures are
- detected by different gesture trackers, thus separating gesture detection
- code into maintainable parts.
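- As an illustration of this separation, the following minimal Python sketch
- dispatches low-level events to independent tracker objects. The names
- \texttt{Tracker}, \texttt{TapTracker}, \texttt{TrackerManager} and all method
- names are hypothetical; they are not taken from the article
- \cite{win7touch}, and the tap rule is only an assumed example.

```python
class Tracker(object):
    """Base class: a tracker overrides only the handlers it needs."""

    def on_point_down(self, point_id, x, y):
        pass

    def on_point_move(self, point_id, x, y):
        pass

    def on_point_up(self, point_id, x, y):
        pass


class TapTracker(Tracker):
    """Hypothetical tracker: records a tap when a point is released no
    further than max_distance (per axis) from where it went down."""

    def __init__(self, max_distance=0.01):
        self.max_distance = max_distance
        self.start = {}  # point id -> (x, y) at "point down"
        self.taps = []   # positions of detected taps

    def on_point_down(self, point_id, x, y):
        self.start[point_id] = (x, y)

    def on_point_up(self, point_id, x, y):
        x0, y0 = self.start.pop(point_id, (x, y))
        if (abs(x - x0) <= self.max_distance
                and abs(y - y0) <= self.max_distance):
            self.taps.append((x, y))


class TrackerManager(object):
    """Forwards every low-level event to all registered trackers."""

    def __init__(self):
        self.trackers = []

    def add_tracker(self, tracker):
        self.trackers.append(tracker)

    def dispatch(self, event_type, point_id, x, y):
        # event_type is 'point_down', 'point_move' or 'point_up'
        for tracker in self.trackers:
            getattr(tracker, 'on_' + event_type)(point_id, x, y)
```

- Adding support for another gesture then only requires a new
- \texttt{Tracker} subclass; the existing trackers remain untouched.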
- \section{The TUIO protocol}
- The TUIO protocol \cite{TUIO} defines a way to geometrically describe
- tangible objects, such as fingers or fiducials on a multi-touch table. The
- table used for this thesis uses the protocol in its driver. Object
- information is sent to the TUIO UDP port (3333 by default).
- For efficiency reasons, the TUIO protocol is encoded using the Open Sound
- Control (OSC)\footnote{\url{http://opensoundcontrol.org/specification}}
- format. An OSC server/client implementation is available for Python:
- pyOSC\footnote{\url{https://trac.v2.nl/wiki/pyOSC}}.
- A Python implementation of the TUIO protocol also exists:
- pyTUIO\footnote{\url{http://code.google.com/p/pytuio/}}. However, the
- execution of an example script yields an error regarding Python's built-in
- \texttt{socket} library. Therefore, the reference implementation uses the
- pyOSC package to receive TUIO messages.
- The two most important message types of the protocol are ALIVE and SET
- messages. An ALIVE message contains the list of session ids that are
- currently ``active'', which in the case of a multi-touch table means that
- the corresponding objects are touching the screen. A SET message provides
- geometric information about a session id, such as position, velocity and
- acceleration.
- Each session id represents an object. The only type of object on the
- multi-touch table is what the TUIO protocol calls ``2Dcur'', which is an
- (x, y) position on the screen.
- ALIVE messages can be used to determine when an object touches and releases
- the screen. For example, if a session id was in the previous message but
- not in the current one, the object it represents has been lifted from the
- screen.
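- The comparison of two consecutive ALIVE messages can be sketched in Python
- as a set difference. The function name \texttt{diff\_alive} is hypothetical,
- and the session ids are assumed to have already been extracted from the
- messages.

```python
def diff_alive(previous_ids, current_ids):
    """Compare the session ids of two consecutive ALIVE messages.

    Returns a tuple (added, removed): the ids that started touching the
    screen and the ids that were lifted from it.
    """
    previous, current = set(previous_ids), set(current_ids)
    return current - previous, previous - current
```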
- SET messages provide information about movement. In the case of simple
- (x, y) positions, only the movement vector of the position itself can be
- calculated. For more complex objects such as fiducials, arguments such as
- rotational position are also included.
- ALIVE and SET messages can be combined to create ``point down'', ``point
- move'' and ``point up'' events (as used by the .NET application
- \cite{win7touch}).
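- One possible way to combine the two message types is sketched below in
- Python. The class \texttt{PointEvents} and its method names are
- hypothetical. The sketch assumes that the position of a new session id
- becomes known with its first SET message, so ``point down'' is emitted
- there rather than on the ALIVE message.

```python
class PointEvents(object):
    """Turns ALIVE and SET messages into point down/move/up events."""

    def __init__(self):
        self.positions = {}  # session id -> last known (x, y)
        self.events = []     # (event type, session id, x, y) tuples

    def handle_alive(self, session_ids):
        # Every known id that is missing from the ALIVE list was lifted.
        alive = set(session_ids)
        for sid in sorted(set(self.positions) - alive):
            x, y = self.positions.pop(sid)
            self.events.append(('point_up', sid, x, y))

    def handle_set(self, sid, x, y):
        if sid not in self.positions:
            # First position for this id: the point has just gone down.
            self.events.append(('point_down', sid, x, y))
        elif self.positions[sid] != (x, y):
            self.events.append(('point_move', sid, x, y))
        self.positions[sid] = (x, y)
```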
- TUIO coordinates range from $0.0$ to $1.0$, with $(0.0, 0.0)$ being the
- top left corner of the screen and $(1.0, 1.0)$ the bottom right corner. To
- focus events within a window, a translation to window coordinates is
- required in the client application, as stated by the online specification
- \cite{TUIO_specification}:
- \begin{quote}
- In order to compute the X and Y coordinates for the 2D profiles a TUIO
- tracker implementation needs to divide these values by the actual
- sensor dimension, while a TUIO client implementation consequently can
- scale these values back to the actual screen dimension.
- \end{quote}
- In other words, the design of the gesture detection mechanism should
- incorporate a translation from driver-specific coordinates to pixel
- coordinates.
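- A minimal Python sketch of such a translation is given below. The function
- names, the rounding to whole pixels and the representation of a window by
- the pixel position of its top left corner are assumptions made for
- illustration only.

```python
def to_pixels(x, y, screen_width, screen_height):
    """Scale normalized TUIO coordinates (0.0 to 1.0) to screen pixels."""
    return int(round(x * screen_width)), int(round(y * screen_height))


def to_window(px, py, window_x, window_y):
    """Translate screen pixels to coordinates relative to a window whose
    top left corner is at (window_x, window_y) on the screen."""
    return px - window_x, py - window_y
```

- For example, on a screen of $1920 \times 1080$ pixels, the TUIO position
- $(0.5, 0.25)$ corresponds to pixel $(960, 270)$, which is $(860, 220)$
- relative to a window whose top left corner is at $(100, 50)$.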
- \section{Processing implementation of simple gestures in Android}
- An implementation of a detection mechanism for some simple multi-touch
- gestures (tap, double tap, rotation, pinch and drag) using
- Processing\footnote{Processing is a Java-based development environment with
- an export possibility for Android. See also \url{http://processing.org/}.}
- can be found in a forum on the Processing website
- \cite{processingMT}. The implementation is fairly simple, but it yields
- some very appealing results. The detection logic of all gestures is
- combined in a single class. This does not allow for extensibility, because
- the complexity of this class would increase to an undesirable level (as
- predicted by the GART article \cite{GART}). However, the detection logic
- itself is partially re-used in the reference implementation of the
- universal gesture detection mechanism.
- % TODO
- \chapter{Experiments}
- % test implementation with taps, rotation and pinch. This showed:
- % - that there are different ways to detect e.g. "rotation" (and that
- % it must be possible to distinguish between them)
- % - that detection of different types of gestures must be separable,
- % otherwise it becomes chaos.
- % - A number of choices were made when designing the gestures, e.g. that
- % rotation uses ALL fingers for the centroid. Another program may need
- % to use only 1 hand, and thus to select points that are close together
- % (solution: windows).
- % Drawing program that draws the current points + centroid and with which
- % transformation can be tested. Link to appendix "supported events"
- % Proof of Concept: VTK interactor
- % -------
- % Results
- % -------
- \chapter{Design}
- \section{Requirements}
- % TODO
- % support multiple drivers
- % bind gesture detection to a specific part of the screen
- % separate detection code for different gesture types
- % possibly usable in multiple languages
- \section{Input server}
- % TODO
- % translation from driver to point down, move, up
- % TUIO in reference implementation
- \section{Gesture server}
- % TODO
- % translation to pixel coordinates
- % assignment to windows
- \section{Windows}
- % TODO
- % assigning events to a part of the screen:
- % TUIO coordinates span the entire screen and range from 0.0 to 1.0, so they
- % must be translated to pixel coordinates within a ``window''
- \section{Trackers}
- % TODO
- % event binding/triggering
- % extensibility
- % TODO: link to appendix with diagram
- \chapter{Reference implementation}
- % TODO
- \chapter{Integration in VTK}
- % VTK interactor
- \chapter{Conclusions}
- % TODO
- % Windows are a way to assign global events to application windows
- % Trackers are an effective way to detect gestures
- % Trackers are extensible through object orientation
- \chapter{Suggestions for future work}
- % TODO
- % Network protocol (ZeroMQ)
- % State machine
- \bibliographystyle{plain}
- \bibliography{report}{}
- \appendix
- \chapter{Diagram of mechanism structure}
- \label{app:schema}
- \begin{figure}[H]
- \hspace{-14em}
- \includegraphics{data/server_scheme.pdf}
- \caption{}
- %TODO: caption
- \end{figure}
- \chapter{Supported events in reference implementation}
- \label{app:supported-events}
- % TODO
- \end{document}