@@ -18,20 +18,19 @@

% Title page

\maketitle

\begin{abstract}
- Applications that use
- complex gesture-based interaction need to translate primitive messages from
- low-level device drivers to complex, high-level gestures, and map these
- gestures to elements in an application. This report presents a generic
- architecture for the detection of complex gestures in an application. The
- architecture translates device driver messages to a common set of
- ``events''. The events are then delegated to a tree of ``areas'', which are
- used to separate groups of events and assign these groups to an element in
- the application. Gesture detection is performed on a group of events
- assigned to an area, using detection units called ``gesture tackers''. An
- implementation of the architecture as a daemon process would be capable of
- serving gestures to multiple applications at the same time. A reference
- implementation and two test case applications have been created to test the
- effectiveness of the architecture design.
+ Applications that use complex gesture-based interaction need to translate
+ primitive messages from low-level device drivers to complex, high-level
+ gestures, and map these gestures to elements in an application. This report
+ presents a generic architecture for the detection of complex gestures in an
+ application. The architecture translates device driver messages to a common
+ set of ``events''. The events are then delegated to a tree of ``areas'',
+ which are used to separate groups of events and assign these groups to an
+ element in the application. Gesture detection is performed on a group of
+ events assigned to an area, using detection units called ``gesture
+ trackers''. An implementation of the architecture as a daemon process would
+ be capable of serving gestures to multiple applications at the same time. A
+ reference implementation and two test case applications have been created
+ to test the effectiveness of the architecture design.
\end{abstract}

% Set paragraph indentation
@@ -243,26 +242,25 @@ goal is to test the effectiveness of the design and detect its shortcomings.

% TODO: in introduction: gestures are composed of multiple primitives

Touch input devices are unaware of the graphical input
widgets\footnote{``Widget'' is a name commonly used to identify an element
- of a graphical user interface (GUI).} of an application, and therefore
- generate events that simply identify the screen location at which an event
- takes place. User interfaces of applications that do not run in full screen
- modus are contained in a window. Events which occur outside the application
- window should not be handled by the program in most cases. What's more,
- widget within the application window itself should be able to respond to
- different gestures. E.g. a button widget may respond to a ``tap'' gesture
- to be activated, whereas the application window responds to a ``pinch''
- gesture to be resized. In order to be able to direct a gesture to a
- particular widget in an application, a gesture must be restricted to the
- area of the screen covered by that widget. An important question is if the
- architecture should offer a solution to this problem, or leave the task of
- assigning gestures to application widgets to the application developer.
+ of a graphical user interface (GUI).} rendered by an application, and
+ therefore generate events that simply identify the screen location at which
+ an event takes place. User interfaces of applications that do not run in
+ full-screen mode are contained in a window. Events which occur outside the
+ application window should, in most cases, not be handled by the program.
+ What's more, widgets within the application window itself should be able to
+ respond to different gestures. For example, a button widget may respond to a
+ ``tap'' gesture to be activated, whereas the application window responds to
+ a ``pinch'' gesture to be resized. To direct a gesture to a particular
+ widget in an application, the gesture must be restricted to the area of the
+ screen covered by that widget. An important question is whether the
+ architecture should offer a solution to this problem, or leave the task of
+ assigning gestures to application widgets to the application developer.
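The restriction of a gesture to the screen area covered by a widget can be sketched as a simple point-in-rectangle test. This is only an illustration; the `Event` and `RectangularArea` classes below are hypothetical and not part of the architecture's actual API.

```python
# Sketch: keep only the events that fall inside the rectangle covered by a
# widget, before handing them to gesture detection. Class names are
# illustrative assumptions, not the architecture's real interface.

class Event:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class RectangularArea:
    def __init__(self, x, y, width, height):
        self.x, self.y = x, y
        self.width, self.height = width, height

    def contains(self, event):
        # An event belongs to this area if its coordinates lie inside the
        # rectangle covered by the widget.
        return (self.x <= event.x < self.x + self.width
                and self.y <= event.y < self.y + self.height)

    def filter_events(self, events):
        return [e for e in events if self.contains(e)]

button = RectangularArea(10, 10, 100, 40)
events = [Event(20, 20), Event(500, 300), Event(50, 45)]
inside = button.filter_events(events)
print(len(inside))  # -> 2
```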

If the architecture does not provide a solution, the ``Event analysis''
component in figure \ref{fig:multipledrivers} receives all events that
occur on the screen surface. The gesture detection logic thus uses all
events as input to detect a gesture. This leaves no possibility for a
- gesture to occur at multiple screen positions at the same time, unless the
- gesture detection logic incorporates event cluster detection. The problem
+ gesture to occur at multiple screen positions at the same time. The problem
is illustrated in figure \ref{fig:ex1}, where two widgets on the screen can
be rotated independently. The rotation detection component that detects
rotation gestures receives all four fingers as input. If the two groups of
@@ -274,11 +272,12 @@ goal is to test the effectiveness of the design and detect its shortcomings.

A gesture detection component could perform a heuristic way of cluster
detection based on the distance between events. However, this method cannot
guarantee that a cluster of events corresponds with a particular
- application widget. In short, gesture detection is difficult to implement
- without awareness of the location of application widgets. Moreover, the
- application developer still needs to direct gestures to a particular widget
- manually. This requires geometric calculations in the application logic,
- which is a tedious and error-prone task for the developer.
+ application widget. In short, a gesture detection component is difficult to
+ implement without awareness of the location of application widgets.
+ Moreover, the application developer still needs to direct gestures to a
+ particular widget manually. This requires geometric calculations in the
+ application logic, which is a tedious and error-prone task for the
+ developer.

A better solution is to group events that occur inside the area covered by
a widget, before passing them on to a gesture detection component.
@@ -301,9 +300,9 @@ goal is to test the effectiveness of the design and detect its shortcomings.

application is a ``callback'' mechanism: the application developer binds a
function to an event, that is called when the event occurs. Because of the
familiarity of this concept with developers, the architecture uses a
- callback mechanism to handle gestures in an application. Since an area
- controls the grouping of events and thus the occurrence of gestures in an
- area, gesture handlers for a specific gesture type are bound to an area.
+ callback mechanism to handle gestures in an application. Callback handlers
+ are bound to event areas, since event areas control the grouping of
+ events and thus the occurrence of gestures in an area of the screen.
Figure \ref{fig:areadiagram} shows the position of areas in the
architecture.
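The callback mechanism bound to event areas could look roughly as follows. The `Area` class and its `bind`/`trigger` methods are illustrative assumptions for this sketch, not the architecture's actual API.

```python
# Sketch of binding gesture handlers to an event area, assuming a
# hypothetical Area interface. Handlers are stored per gesture type and
# invoked when a gesture tracker triggers that gesture on the area.

class Area:
    def __init__(self):
        self.handlers = {}

    def bind(self, gesture_type, handler):
        # Bind a callback to a gesture type, as an application developer would.
        self.handlers.setdefault(gesture_type, []).append(handler)

    def trigger(self, gesture_type, gesture):
        # Called by a gesture tracker when it detects a gesture in this area.
        for handler in self.handlers.get(gesture_type, []):
            handler(gesture)

log = []
window = Area()
window.bind('pinch', lambda g: log.append(('resize', g)))
window.trigger('pinch', {'scale': 1.5})
print(log)  # -> [('resize', {'scale': 1.5})]
```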
@@ -324,13 +323,16 @@ goal is to test the effectiveness of the design and detect its shortcomings.

complex touch objects can have additional parameters, such as rotational
orientation or color. An even more generic concept is the \emph{event
filter}, which detects whether an event should be assigned to a particular
- gesture detection component based on all available parameters. This level of
- abstraction allows for constraints like ``Use all blue objects within a
- widget for rotation, and green objects for dragging.''. As mentioned in the
- introduction chapter [\ref{chapter:introduction}], the scope of this thesis
- is limited to multi-touch surface based devices, for which the \emph{event
- area} concept suffices. Section \ref{sec:eventfilter} explores the
- possibility of event areas to be replaced with event filters.
+ gesture detection component based on all available parameters. This level
+ of abstraction provides additional methods of interaction. For example, a
+ camera-based multi-touch surface could distinguish between gestures
+ performed with a blue-gloved hand and gestures performed with a green-gloved
+ hand.
+
+ As mentioned in the introduction chapter [\ref{chapter:introduction}], the
+ scope of this thesis is limited to multi-touch surface based devices, for
+ which the \emph{event area} concept suffices. Section \ref{sec:eventfilter}
+ explores the possibility of event areas to be replaced with event filters.
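The event filter concept can be sketched as a predicate over event parameters; the `color` attribute and the filter factory below are hypothetical, used only to illustrate the color-based distinction mentioned above.

```python
# Sketch: an event filter decides whether an event is assigned to a gesture
# detection component based on arbitrary event parameters. Here a
# hypothetical 'color' parameter routes blue events to rotation and green
# events to dragging.

def make_color_filter(color):
    # Returns a predicate that accepts only events of the given color.
    return lambda event: event.get('color') == color

rotation_filter = make_color_filter('blue')
drag_filter = make_color_filter('green')

events = [{'x': 1, 'y': 2, 'color': 'blue'},
          {'x': 3, 'y': 4, 'color': 'green'}]

rotation_events = [e for e in events if rotation_filter(e)]
drag_events = [e for e in events if drag_filter(e)]
print(len(rotation_events), len(drag_events))  # -> 1 1
```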

\subsection{Area tree}
\label{sec:tree}
@@ -362,7 +364,7 @@ goal is to test the effectiveness of the design and detect its shortcomings.

\subsection{Event propagation}
\label{sec:eventpropagation}

- A problem occurs when event areas overlap, as shown by figure
+ Another problem occurs when event areas overlap, as shown by figure
\ref{fig:eventpropagation}. When the white square is rotated, the gray
square should keep its current orientation. This means that events that are
used for rotation of the white square, should not be used for rotation of
@@ -416,14 +418,15 @@ goal is to test the effectiveness of the design and detect its shortcomings.

complex gestures, like the writing of a character from the alphabet,
require more advanced detection algorithms.

- A way to detect complex gestures based on a sequence of input features
- is with the use of machine learning methods, such as Hidden Markov Models
- \footnote{A Hidden Markov Model (HMM) is a statistical model without a
- memory, it can be used to detect gestures based on the current input state
- alone.} \cite{conf/gw/RigollKE97}. A sequence of input states can be mapped
- to a feature vector that is recognized as a particular gesture with a
- certain probability. An advantage of using machine learning with respect to
- an imperative programming style is that complex gestures can be described
+ A way to detect these complex gestures based on a sequence of input events
+ is with the use of machine learning methods, such as the Hidden Markov
+ Models\footnote{A Hidden Markov Model (HMM) is a statistical model without
+ a memory; it can be used to detect gestures based on the current input
+ state alone.} used for sign language detection by
+ \cite{conf/gw/RigollKE97}. A sequence of input states can be mapped to a
+ feature vector that is recognized as a particular gesture with a certain
+ probability. An advantage of using machine learning with respect to an
+ imperative programming style is that complex gestures can be described
without the use of explicit detection logic. For example, the detection of
the character `A' being written on the screen is difficult to implement
using an imperative programming style, while a trained machine learning
@@ -434,25 +437,31 @@ goal is to test the effectiveness of the design and detect its shortcomings.

sufficient to detect many common gestures, like rotation and dragging. The
imperative programming style is also familiar and understandable for a wide
range of application developers. Therefore, the architecture should support
- an imperative style of gesture detection.
-
- A problem with the imperative programming style is that the explicit
- detection of different gestures requires different gesture detection
- components. If these components is not managed well, the detection logic is
- prone to become chaotic and over-complex.
-
- To manage complexity and support multiple methods of gesture detection, the
- architecture has adopted the tracker-based design as described by
- \cite{win7touch}. Different detection components are wrapped in separate
+ an imperative style of gesture detection. A problem with an imperative
+ programming style is that the explicit detection of different gestures
+ requires different gesture detection components. If these components are
+ not managed well, the detection logic is prone to become chaotic and
+ over-complex.
+
+ To manage complexity and support multiple styles of gesture detection
+ logic, the architecture has adopted the tracker-based design as described
+ by \cite{win7touch}. Different detection components are wrapped in separate
gesture tracking units, or \emph{gesture trackers}. The input of a gesture
- tracker is provided by an event area in the form of events. When a gesture
- tracker detects a gesture, this gesture is triggered in the corresponding
- event area. The event area then calls the callbacks which are bound to the
- gesture type by the application. Figure \ref{fig:trackerdiagram} shows the
- position of gesture trackers in the architecture.
+ tracker is provided by an event area in the form of events. Each gesture
+ detection component is wrapped in a gesture tracker with a fixed type of
+ input and output. Internally, the gesture tracker can adopt any programming
+ style. A character recognition component can use an HMM, whereas a tap
+ detection component defines a simple function that compares event
+ coordinates.

\trackerdiagram

+ When a gesture tracker detects a gesture, this gesture is triggered in the
+ corresponding event area. The event area then calls the callbacks which are
+ bound to the gesture type by the application. Figure
+ \ref{fig:trackerdiagram} shows the position of gesture trackers in the
+ architecture.
+
The use of gesture trackers as small detection units provides extendability
of the architecture. A developer can write a custom gesture tracker and
register it in the architecture. The tracker can use any type of detection