@@ -2,7 +2,7 @@
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
-\usepackage{hyperref,graphicx,tikz,subfigure}
+\usepackage{hyperref,graphicx,tikz,subfigure,float}

% Link colors
\hypersetup{colorlinks=true,linkcolor=black,urlcolor=blue,citecolor=DarkGreen}
@@ -137,21 +137,6 @@ goal is to test the effectiveness of the design and detect its shortcomings.
detected by different gesture trackers, thus separating gesture detection
code into maintainable parts.

- % TODO: This is not really 'related', move it to somewhere else
- \section{Processing implementation of simple gestures in Android}
-
- An implementation of a detection architecture for some simple multi-touch
- gestures (tap, double tap, rotation, pinch and drag) using
- Processing\footnote{Processing is a Java-based development environment with
- an export possibility for Android. See also \url{http://processing.org/.}}
- can be found in a forum on the Processing website \cite{processingMT}. The
- implementation is fairly simple, but it yields some very appealing results.
- The detection logic of all gestures is combined in a single class. This
- does not allow for extendability, because the complexity of this class
- would increase to an undesirable level (as predicted by the GART article
- \cite{GART}). However, the detection logic itself is partially re-used in
- the reference implementation of the generic gesture detection architecture.
-
\section{Analysis of related work}

The simple Processing implementation of multi-touch events provides most of
@@ -165,7 +150,6 @@ goal is to test the effectiveness of the design and detect its shortcomings.
of gesture detection code, thus keeping a code library manageable and
extendable, is to use different gesture trackers.

-% FIXME: change title below
\chapter{Design}
\label{chapter:design}

@@ -174,41 +158,40 @@ goal is to test the effectiveness of the design and detect its shortcomings.

\section{Introduction}

+ % TODO: rewrite intro?
This chapter describes the realization of a design for the generic
multi-touch gesture detection architecture. The chapter represents the
architecture as a diagram of relations between different components.
- Sections \ref{sec:driver-support} to \ref{sec:event-analysis} define
+ Sections \ref{sec:driver-support} to \ref{sec:multiple-applications} define
requirements for the architecture, and extend the diagram with components
that meet these requirements. Section \ref{sec:example} describes an
example usage of the architecture in an application.

- \subsection*{Position of architecture in software}
+ The input of the architecture comes from a multi-touch device driver.
+ The task of the architecture is to translate this input to multi-touch
+ gestures that are used by an application, as illustrated in figure
+ \ref{fig:basicdiagram}. In the course of this chapter, the diagram is
+ extended with the different components of the architecture.

- The input of the architecture comes from a multi-touch device driver.
- The task of the architecture is to translate this input to multi-touch
- gestures that are used by an application, as illustrated in figure
- \ref{fig:basicdiagram}. In the course of this chapter, the diagram is
- extended with the different components of the architecture.
-
- \basicdiagram{A diagram showing the position of the architecture
- relative to the device driver and a multi-touch application. The input
- of the architecture is given by a touch device driver. This output is
- translated to complex interaction gestures and passed to the
- application that is using the architecture.}
+ \basicdiagram{A diagram showing the position of the architecture
+ relative to the device driver and a multi-touch application. The input
+ of the architecture is given by a touch device driver. This input is
+ translated to complex interaction gestures and passed to the
+ application that is using the architecture.}

\section{Supporting multiple drivers}
\label{sec:driver-support}

- The TUIO protocol \cite{TUIO} is an example of a touch driver that can be
- used by multi-touch devices. TUIO uses ALIVE- and SET-messages to communicate
+ The TUIO protocol \cite{TUIO} is an example of a driver that can be used by
+ multi-touch devices. TUIO uses ALIVE- and SET-messages to communicate
low-level touch events (see appendix \ref{app:tuio} for more details).
- These messages are specific to the API of the TUIO protocol. Other touch
- drivers may use very different messages types. To support more than
- one driver in the architecture, there must be some translation from
- driver-specific messages to a common format for primitive touch events.
- After all, the gesture detection logic in a ``generic'' architecture should
- not be implemented based on driver-specific messages. The event types in
- this format should be chosen so that multiple drivers can trigger the same
+ These messages are specific to the API of the TUIO protocol. Other drivers
+ may use very different message types. To support more than one driver in
+ the architecture, there must be some translation from driver-specific
+ messages to a common format for primitive touch events. After all, the
+ gesture detection logic in a ``generic'' architecture should not be
+ implemented based on driver-specific messages. The event types in this
+ format should be chosen so that multiple drivers can trigger the same
events. If each supported driver were to add its own set of event types to
the common format, the purpose of being ``common'' would be defeated.

@@ -232,16 +215,18 @@ goal is to test the effectiveness of the design and detect its shortcomings.
components in the architecture for translation to gestures. This
communication flow is illustrated in figure \ref{fig:driverdiagram}.

- Support for a touch device driver can be added by adding an event driver
+ \driverdiagram
+
+ Support for a touch driver can be added by creating a new event driver
implementation. The choice of event driver implementation that is used in an
application is dependent on the driver support of the touch device being
used.

- \driverdiagram{Extension of the diagram from figure \ref{fig:basicdiagram},
- showing the position of the event driver in the architecture. The event
- driver translates driver-specific to a common set of events, which are
- delegated to analysis components that will interpret them as more complex
- gestures.}
+ Because driver implementations have a common output format in the form of
+ events, multiple event drivers can run at the same time (see figure
+ \ref{fig:multipledrivers}).
+
+ \multipledriversdiagram

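+ To make the translation step concrete, the following sketch shows how an
+ event driver could map TUIO-style SET- and ALIVE-messages to common
+ $point\_down$, $point\_move$ and $point\_up$ events. This is only an
+ illustration in Python: the class names, the \texttt{Event} type and the
+ delegate callback are assumptions made for this sketch, not part of an
+ existing driver API.
+
+ \begin{verbatim}
+ # Sketch of an event driver (hypothetical names).
+ class Event(object):
+     def __init__(self, name, x=0.0, y=0.0):
+         self.name, self.x, self.y = name, x, y
+
+ class TUIOEventDriver(object):
+     """Translates TUIO-style SET/ALIVE messages to common events."""
+     def __init__(self, delegate):
+         self.delegate = delegate  # callback receiving common events
+         self.points = {}          # session id -> last known position
+
+     def on_set(self, sid, x, y):
+         name = "point_move" if sid in self.points else "point_down"
+         self.points[sid] = (x, y)
+         self.delegate(Event(name, x, y))
+
+     def on_alive(self, alive_ids):
+         # Sessions that are no longer alive have been released.
+         for sid in list(self.points):
+             if sid not in alive_ids:
+                 x, y = self.points.pop(sid)
+                 self.delegate(Event("point_up", x, y))
+ \end{verbatim}
+
+ An event driver for another protocol would only have to deliver the same
+ kind of events to the delegate callback.
+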
\section{Restricting events to a screen area}
\label{sec:restricting-gestures}
@@ -294,8 +279,8 @@ goal is to test the effectiveness of the design and detect its shortcomings.
leads to the concept of an \emph{area}, which represents an area on the
touch surface in which events should be grouped before being delegated to a
form of gesture detection. Examples of simple area implementations are
- rectangles and circles. However, area's could be made to represent more
- complex shapes.
+ rectangles and circles. However, areas could also be made to represent
+ more complex shapes.

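+ As an illustration of the area concept, a rectangular area could be
+ sketched as follows. The class and method names are chosen for this
+ example only and do not refer to an existing implementation.
+
+ \begin{verbatim}
+ # Sketch of a rectangular area implementation (illustrative names).
+ class RectangularArea(object):
+     def __init__(self, x, y, width, height):
+         self.x, self.y = x, y
+         self.width, self.height = width, height
+
+     def contains(self, event):
+         # An event is grouped by this area if its coordinates lie
+         # within the rectangle.
+         return (self.x <= event.x <= self.x + self.width
+                 and self.y <= event.y <= self.y + self.height)
+ \end{verbatim}
+
+ A circular area would only differ in its containment test.
+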
An area groups events and assigns them to some piece of gesture detection
logic. This possibly triggers a gesture, which must be handled by the
@@ -314,6 +299,10 @@ goal is to test the effectiveness of the design and detect its shortcomings.
to a gesture detection component that triggers gestures. The area then calls
the handler that is bound to the gesture type by the application.}

+ An area can be seen as an independent subset of a touch surface. Therefore,
+ the parameters (coordinates) of events and gestures within an area should
+ be relative to the area.
+
Note that the boundaries of an area are only used to group events, not
gestures. A gesture could occur outside the area that contains its
originating events, as illustrated by the example in figure \ref{fig:ex2}.
@@ -337,122 +326,127 @@ goal is to test the effectiveness of the design and detect its shortcomings.
concept suffices. Section \ref{sec:eventfilter} explores the possibility of
areas to be replaced with event filters.

- \subsection*{Reserving an event for a gesture}
+ \subsection{Area tree}
+ \label{sec:tree}

The most simple implementation of areas in the architecture is a list of
areas. When the event driver delegates an event, it is delegated to gesture
- detection by each area that contains the event coordinates. A problem
- occurs when areas overlap, as shown by figure \ref{fig:ex3}. When the
- white rectangle is rotated, the gray square should keep its current
- orientation. This means that events that are used for rotation of the white
- square, should not be used for rotation of the gray square. To achieve
- this, there must be some communication between the rotation detection
- components of the two squares.
-
- \examplefigurethree
-
- a
-
- --------
-
- % the simplest approach is a list of areas; if an event fits inside one,
- % delegate it. problem (illustrate with an example of nested widgets that
- % both listen for a tap): when areas overlap, you want to reserve certain
- % events for certain pieces of detection logic
-
- % solution: store areas in a tree structure and use event propagation
- % -> an area inside a parent area can propagate events to that parent,
- % detection logic can stop propagation. to propagate upwards in the tree,
- % the event must first arrive at the leaf, so first delegate it down
- % to the lowest leaf node that contains the event.
-
- % special case: overlapping areas in the same layer of the tree. in that
- % case, the area that was added later (the right sibling) is assumed to
- % lie on top of the sibling to its left, and thus receives the event first.
- % If propagation is stopped in the upper (right) area, the lower (left)
- % sibling does not receive it anymore either
-
- % an additional advantage of the tree structure: easy to integrate with
- % e.g. GTK, which uses a tree structure for its widgets -> create an area
- % for every widget that has touch events
-
- %For example, a button tap\footnote{A ``tap'' gesture is triggered when a
- %touch object releases a touch surface within a certain time and distance
- %from the point where it initially touched the surface.} should only occur
- %on the button itself, and not in any other area of the screen. A solution
- %to this problem is the use of \emph{widgets}. The button from the example
- %can be represented as a rectangular widget with a position and size. The
- %position and size are compared with event coordinates to determine whether
- %an event should occur within the button.
-
- \subsection*{Area tree}
-
- A problem occurs when widgets overlap. If a button in placed over a
- container and an event occurs occurs inside the button, should the
- button handle the event first? And, should the container receive the
- event at all or should it be reserved for the button?.
-
- The solution to this problem is to save widgets in a tree structure.
- There is one root widget, whose size is limited by the size of the
- touch screen. Being the leaf widget, and thus the widget that is
- actually touched when an object touches the device, the button widget
- should receive an event before its container does. However, events
- occur on a screen-wide level and thus at the root level of the widget
- tree. Therefore, an event is delegated in the tree before any analysis
- is performed. Delegation stops at the ``lowest'' widget in the three
- containing the event coordinates. That widget then performs some
- analysis of the event, after which the event is released back to the
- parent widget for analysis. This release of an event to a parent widget
- is called \emph{propagation}. To be able to reserve an event to some
- widget or analysis, the propagation of an event can be stopped during
- analysis.
- % TODO: inspired by JavaScript DOM
-
- Many GUI frameworks, like GTK \cite{GTK}, also use a tree structure to
- manage their widgets. This makes it easy to connect the architecture to
- such a framework. For example, the programmer can define a
- \texttt{GtkTouchWidget} that synchronises the position of a touch
- widget with that of a GTK widget, using GTK signals.
+ detection by each area that contains the event coordinates.
+
+ If the architecture were to be used in combination with an application
+ framework like GTK \cite{GTK}, each GTK widget that must receive gestures
+ should have a mirroring area that synchronizes its position with that of
+ the widget. Consider a panel with five buttons that all listen to a
+ ``tap'' gesture. If the panel is moved as a result of movement of the
+ application window, the position of each button has to be updated.
+
+ This process is simplified by the arrangement of areas in a tree structure.
+ A root area represents the panel, containing five subareas that are
+ positioned relative to the root area. The relative positions do not need to
+ be updated when the panel area changes its position. GUI frameworks, like
+ GTK, use this kind of tree structure to manage widgets. A recommended first
+ step when developing an application is to create an area subclass that
+ automatically synchronizes its position with that of a widget from the GUI
+ framework.

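+ A brief sketch of how such a tree with relative positions could look is
+ given below. The \texttt{Area} class and its methods are hypothetical and
+ only illustrate the concept of child areas that move along with their
+ parent.
+
+ \begin{verbatim}
+ # Sketch of areas arranged in a tree with relative positions.
+ class Area(object):
+     def __init__(self, x, y, width, height):
+         self.x, self.y = x, y          # position relative to the parent
+         self.width, self.height = width, height
+         self.parent = None
+         self.children = []
+
+     def add_child(self, child):
+         child.parent = self
+         self.children.append(child)
+
+     def absolute_position(self):
+         # Only the root area has an absolute position of its own.
+         if self.parent is None:
+             return self.x, self.y
+         px, py = self.parent.absolute_position()
+         return px + self.x, py + self.y
+
+ # A panel containing a button: moving the panel moves the button with it.
+ panel = Area(100, 100, 400, 50)
+ button = Area(10, 10, 60, 30)
+ panel.add_child(button)
+ panel.x += 25                          # the button follows automatically
+ \end{verbatim}
+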
\section{Detecting gestures from events}
\label{sec:gesture-detection}

The events that are grouped by areas must be translated to complex gestures
- in some way. This analysis is specific to the type of gesture being
- detected. E.g. the detection of a ``tap'' gesture is very different from
- detection of a ``rotate'' gesture. The architecture has adopted the
- \emph{gesture tracker}-based design described by \cite{win7touch}, which
- separates the detection of different gestures into different \emph{gesture
- trackers}. This keeps the different pieces of gesture detection code
- manageable and extendable. A single gesture tracker detects a specific set
- of gesture types, given a set of primitive events. An example of a possible
- gesture tracker implementation is a ``transformation tracker'' that detects
- rotation, scaling and translation gestures.
-
- % TODO: a formal definition of gestures would perhaps be better, but
- % it is not given in this thesis (it is discussed in future work)
-
- \subsection*{Assignment of a gesture tracker to an area}
-
- As explained in section \ref{sec:callbacks}, events are delegated from
- a widget to some event analysis. The analysis component of a widget
- consists of a list of gesture trackers, each tracking a specific set of
- gestures. No two trackers in the list should be tracking the same
- gesture type.
-
- When a handler for a gesture is ``bound'' to a widget, the widget
- asserts that it has a tracker that is tracking this gesture. Thus, the
- programmer does not create gesture trackers manually. Figure
- \ref{fig:trackerdiagram} shows the position of gesture trackers in the
- architecture.
-
- \trackerdiagram{Extension of the diagram from figure
- \ref{fig:widgetdiagram}, showing the position of gesture trackers in
- the architecture.}
+ in some way. Gestures such as a button tap or the dragging of an object
+ using one finger are easy to detect by comparing the positions of
+ sequential $point\_down$ and $point\_move$ events.
+
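+ As a minimal illustration of this kind of comparison, the sketch below
+ detects a one-finger drag by comparing the position of each
+ $point\_move$ event to the previous position of the touch point. The
+ detector interface is hypothetical.
+
+ \begin{verbatim}
+ # Sketch of drag detection by comparing sequential event positions.
+ class DragDetector(object):
+     def __init__(self, on_drag):
+         self.on_drag = on_drag   # callback for detected drag gestures
+         self.last = None         # last known position of the touch point
+
+     def on_event(self, event):
+         if event.name == "point_down":
+             self.last = (event.x, event.y)
+         elif event.name == "point_move" and self.last is not None:
+             dx, dy = event.x - self.last[0], event.y - self.last[1]
+             self.last = (event.x, event.y)
+             self.on_drag(dx, dy)
+         elif event.name == "point_up":
+             self.last = None
+ \end{verbatim}
+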
+ A way to detect more complex gestures based on a sequence of input
+ features is with the use of machine learning methods, such as Hidden Markov
+ Models\footnote{A Hidden Markov Model (HMM) is a statistical model without
+ a memory: it can be used to detect gestures based on the current input
+ state alone.} \cite{conf/gw/RigollKE97}. A sequence of input states can be
+ mapped to a feature vector that is recognized as a particular gesture with
+ some probability. This type of gesture recognition is often used in video
+ processing, where large sets of data have to be processed. Using an
+ imperative programming style to recognize each possible sign in sign
+ language detection is near impossible, and certainly not desirable.
+
+ Sequences of events that are triggered by a multi-touch surface are
+ often of manageable complexity. An imperative programming style is
+ sufficient to detect many common gestures. The imperative programming style
+ is also familiar and understandable to a wide range of application
+ developers. Therefore, the aim is to use this programming style in the
+ architecture implementation that is developed during this project.
+
+ However, the architecture should not be limited to multi-touch surfaces
+ alone. For example, the architecture should also be suitable for use in an
+ application that detects hand gestures from video input.
+
+ A problem with the imperative programming style is that the detection of
+ different gestures requires different pieces of detection code. If this is
+ not managed well, the detection logic is prone to become chaotic and
+ over-complex.
+
+ To manage complexity and support multiple methods of gesture detection, the
+ architecture has adopted the tracker-based design as described by
+ \cite{win7touch}. Different detection components are wrapped in separate
+ gesture tracking units, or \emph{gesture trackers}. The input of a gesture
+ tracker is provided by an area in the form of events. When a gesture
+ tracker detects a gesture, this gesture is triggered in the corresponding
+ area. The area then calls the callbacks which are bound to the gesture
+ type by the application. Figure \ref{fig:trackerdiagram} shows the position
+ of gesture trackers in the architecture.
+
+ \trackerdiagram{Extension of the diagram from figure
+ \ref{fig:areadiagram}, showing the position of gesture trackers in the
+ architecture.}
+
+ The use of gesture trackers as small detection units provides extendability
+ of the architecture. A developer can write a custom gesture tracker and
+ register it in the architecture. The tracker can use any type of detection
+ logic internally, as long as it translates events to gestures.
+
+ An example of a possible gesture tracker implementation is a
+ ``transformation tracker'' that detects rotation, scaling and translation
+ gestures.
+
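+ The sketch below indicates what a small gesture tracker could look like,
+ using a tap gesture as an example. The tracker interface, the
+ \texttt{trigger} call on the area and the event attributes are assumptions
+ made for this illustration; they are not taken from an existing API.
+
+ \begin{verbatim}
+ # Sketch of a simple gesture tracker (hypothetical interface).
+ class TapTracker(object):
+     """Detects a tap: a point released close to where it went down."""
+     def __init__(self, area, max_distance=10):
+         self.area = area
+         self.max_distance = max_distance
+         self.down = {}   # touch point id -> position of the point_down
+
+     def on_event(self, event):
+         if event.name == "point_down":
+             self.down[event.point_id] = (event.x, event.y)
+         elif event.name == "point_up" and event.point_id in self.down:
+             x, y = self.down.pop(event.point_id)
+             if (abs(event.x - x) <= self.max_distance and
+                     abs(event.y - y) <= self.max_distance):
+                 # Trigger the gesture in the corresponding area, which
+                 # calls the handlers bound to the "tap" gesture type.
+                 self.area.trigger("tap", event.x, event.y)
+ \end{verbatim}
+
+ A transformation tracker would follow the same interface, but would keep
+ track of multiple touch points to compute rotation, scaling and
+ translation.
+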
+ \section{Reserving an event for a gesture}
+ \label{sec:reserve-event}
+
+ A problem occurs when areas overlap, as shown by figure
+ \ref{fig:eventpropagation}. When the white square is rotated, the gray
+ square should keep its current orientation. This means that events that are
+ used for rotation of the white square should not be used for rotation of
+ the gray square. To achieve this, there must be some communication between
+ the gesture trackers of the two squares: an event that is used for rotation
+ of the white square must be \emph{reserved} for that gesture. In order to
+ reserve an event, the event needs to be handled by the rotation tracker of
+ the white square before the rotation tracker of the gray square receives
+ it. Otherwise, the gray square has already triggered a rotation gesture and
+ it is too late to reserve the event for rotation of the white square.
+
+ When an object touches the touch surface, the event that is triggered
+ should be delegated according to the order in which its corresponding areas
+ are positioned over each other. The tree structure in which areas are
+ arranged (see section \ref{sec:tree}) is an ideal tool to determine the
+ order in which an event is delegated to different areas. Areas in the tree
+ are positioned on top of their parent. An object touching the screen is
+ essentially touching the deepest area in the tree that contains the
+ triggered event. That area should be the first to delegate the event to its
+ gesture trackers, and then move the event up in the tree to its ancestors.
+ The movement of an event up in the area tree will be called \emph{event
+ propagation}. To reserve an event for a particular gesture, a gesture
+ tracker can stop its propagation. When propagation of an event is stopped,
+ it will not be passed on to the ancestor areas, thus reserving the event.
+ Figure \ref{fig:eventpropagation} illustrates the use of event propagation,
+ applied to the example of the white and gray squares.
+
+ \eventpropagationfigure

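+ The following sketch outlines how delegation and propagation through the
+ area tree could be implemented. The method names and the convention that a
+ tracker reserves an event by returning a stop value are assumptions for
+ this illustration.
+
+ \begin{verbatim}
+ # Sketch of event delegation and propagation in the area tree.
+ STOP_PROPAGATION = object()
+
+ class Area(object):
+     def __init__(self):
+         self.parent = None
+         self.children = []
+         self.trackers = []
+
+     def contains(self, event):
+         raise NotImplementedError  # e.g. a rectangle test
+
+     def delegate(self, event):
+         # Walk down to the deepest area containing the event. Later
+         # siblings are assumed to lie on top, so they are tried first.
+         for child in reversed(self.children):
+             if child.contains(event):
+                 return child.delegate(event)
+         self.propagate(event)
+
+     def propagate(self, event):
+         # Let the trackers of this area analyse the event; a tracker can
+         # reserve the event by stopping its propagation.
+         for tracker in self.trackers:
+             if tracker.handle(event) is STOP_PROPAGATION:
+                 return
+         if self.parent is not None:
+             self.parent.propagate(event)
+ \end{verbatim}
+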
\section{Serving multiple applications}
+ \label{sec:multiple-applications}

% TODO
+ \emph{TODO}

\section{Example usage}
\label{sec:example}
@@ -467,16 +461,11 @@ goal is to test the effectiveness of the design and detect its shortcomings.
\begin{verbatim}
initialize GUI, creating a window

- # Add widgets representing the application window and button
- rootwidget = new rectangular Widget object
- set rootwidget position and size to that of the application window
-
- buttonwidget = new rectangular Widget object
- set buttonwidget position and size to that of the GUI button
+ create a root area with the position and size of the application window
+ create an area with the position and size of the button

- # Create an event server that will be started later
- server = new EventServer object
- set rootwidget as root widget for server
+ create a new event server
+ set the root area as root area for the event server

# Define handlers and bind them to corresponding widgets
begin function resize_handler(gesture)
@@ -593,6 +582,21 @@ client application, as stated by the online specification
\chapter{Experimental program}
\label{app:experiment}

+ % TODO: This is not really 'related', move it to somewhere else
+ \section{Processing implementation of simple gestures in Android}
+
+ An implementation of a detection architecture for some simple multi-touch
+ gestures (tap, double tap, rotation, pinch and drag) using
+ Processing\footnote{Processing is a Java-based development environment with
+ an export possibility for Android. See also \url{http://processing.org/}.}
+ can be found in a forum on the Processing website \cite{processingMT}. The
+ implementation is fairly simple, but it yields some very appealing results.
+ The detection logic of all gestures is combined in a single class. This
+ does not allow for extendability, because the complexity of this class
+ would increase to an undesirable level (as predicted by the GART article
+ \cite{GART}). However, the detection logic itself is partially re-used in
+ the reference implementation of the generic gesture detection architecture.
+
% TODO: rewrite intro
When designing a software library, its API should be understandable and easy to
use for programmers. To find out the basic requirements of the API to be