Commit 294c0773 authored by Taddeüs Kroes

Some final tweaks to 'Design' chapter.

parent 89d950e4
......@@ -140,7 +140,7 @@
}
\def\trackerdiagram{
\begin{figure}[h]
\begin{figure}[h!]
\centering
\architecture{
\node[block, below of=driver] (eventdriver) {Event driver}
......
......@@ -18,20 +18,19 @@
% Title page
\maketitle
\begin{abstract}
Applications that use
complex gesture-based interaction need to translate primitive messages from
low-level device drivers to complex, high-level gestures, and map these
gestures to elements in an application. This report presents a generic
architecture for the detection of complex gestures in an application. The
architecture translates device driver messages to a common set of
``events''. The events are then delegated to a tree of ``areas'', which are
used to separate groups of events and assign these groups to an element in
the application. Gesture detection is performed on a group of events
assigned to an area, using detection units called ``gesture trackers''. An
implementation of the architecture as a daemon process would be capable of
serving gestures to multiple applications at the same time. A reference
implementation and two test case applications have been created to test the
effectiveness of the architecture design.
Applications that use complex gesture-based interaction need to translate
primitive messages from low-level device drivers to complex, high-level
gestures, and map these gestures to elements in an application. This report
presents a generic architecture for the detection of complex gestures in an
application. The architecture translates device driver messages to a common
set of ``events''. The events are then delegated to a tree of ``areas'',
which are used to separate groups of events and assign these groups to an
element in the application. Gesture detection is performed on a group of
events assigned to an area, using detection units called ``gesture
trackers''. An implementation of the architecture as a daemon process would
be capable of serving gestures to multiple applications at the same time. A
reference implementation and two test case applications have been created
to test the effectiveness of the architecture design.
\end{abstract}
% Set paragraph indentation
......@@ -243,26 +242,25 @@ goal is to test the effectiveness of the design and detect its shortcomings.
% TODO: in introduction: gestures are composed of multiple primitives
Touch input devices are unaware of the graphical input
widgets\footnote{``Widget'' is a name commonly used to identify an element
of a graphical user interface (GUI).} of an application, and therefore
generate events that simply identify the screen location at which an event
takes place. User interfaces of applications that do not run in full-screen
mode are contained in a window. Events which occur outside the application
window should not be handled by the program in most cases. What's more,
widgets within the application window itself should be able to respond to
different gestures. E.g. a button widget may respond to a ``tap'' gesture
to be activated, whereas the application window responds to a ``pinch''
gesture to be resized. In order to be able to direct a gesture to a
particular widget in an application, a gesture must be restricted to the
area of the screen covered by that widget. An important question is if the
architecture should offer a solution to this problem, or leave the task of
assigning gestures to application widgets to the application developer.
of a graphical user interface (GUI).} rendered by an application, and
therefore generate events that simply identify the screen location at which
an event takes place. User interfaces of applications that do not run in
full-screen mode are contained in a window. Events which occur outside the
application window should not be handled by the program in most cases.
What's more, widgets within the application window itself should be able to
respond to different gestures. E.g. a button widget may respond to a
``tap'' gesture to be activated, whereas the application window responds to
a ``pinch'' gesture to be resized. In order to be able to direct a gesture
to a particular widget in an application, a gesture must be restricted to
the area of the screen covered by that widget. An important question is if
the architecture should offer a solution to this problem, or leave the task
of assigning gestures to application widgets to the application developer.
If the architecture does not provide a solution, the ``Event analysis''
component in figure \ref{fig:multipledrivers} receives all events that
occur on the screen surface. The gesture detection logic thus uses all
events as input to detect a gesture. This leaves no possibility for a
gesture to occur at multiple screen positions at the same time, unless the
gesture detection logic incorporates event cluster detection. The problem
gesture to occur at multiple screen positions at the same time. The problem
is illustrated in figure \ref{fig:ex1}, where two widgets on the screen can
be rotated independently. The rotation detection component that detects
rotation gestures receives all four fingers as input. If the two groups of
......@@ -274,11 +272,12 @@ goal is to test the effectiveness of the design and detect its shortcomings.
A gesture detection component could apply heuristic cluster detection
based on the distance between events. However, this method cannot
guarantee that a cluster of events corresponds with a particular
application widget. In short, gesture detection is difficult to implement
without awareness of the location of application widgets. Moreover, the
application developer still needs to direct gestures to a particular widget
manually. This requires geometric calculations in the application logic,
which is a tedious and error-prone task for the developer.
application widget. In short, a gesture detection component is difficult to
implement without awareness of the location of application widgets.
Secondly, the application developer still needs to direct gestures to a
particular widget manually. This requires geometric calculations in the
application logic, which is a tedious and error-prone task for the
developer.
A better solution is to group events that occur inside the area covered by
a widget, before passing them on to a gesture detection component.
......@@ -301,9 +300,9 @@ goal is to test the effectiveness of the design and detect its shortcomings.
application is a ``callback'' mechanism: the application developer binds a
function to an event, which is called when the event occurs. Because
developers are familiar with this concept, the architecture uses a
callback mechanism to handle gestures in an application. Since an area
controls the grouping of events and thus the occurrence of gestures in an
area, gesture handlers for a specific gesture type are bound to an area.
callback mechanism to handle gestures in an application. Callback handlers
are bound to event areas, since event areas control the grouping of
events and thus the occurrence of gestures in an area of the screen.
Figure \ref{fig:areadiagram} shows the position of areas in the
architecture.
......@@ -324,13 +323,16 @@ goal is to test the effectiveness of the design and detect its shortcomings.
complex touch objects can have additional parameters, such as rotational
orientation or color. An even more generic concept is the \emph{event
filter}, which detects whether an event should be assigned to a particular
gesture detection component based on all available parameters. This level of
abstraction allows for constraints like ``Use all blue objects within a
widget for rotation, and green objects for dragging.'' As mentioned in the
introduction chapter [\ref{chapter:introduction}], the scope of this thesis
is limited to multi-touch surface based devices, for which the \emph{event
area} concept suffices. Section \ref{sec:eventfilter} explores the
possibility of replacing event areas with event filters.
gesture detection component based on all available parameters. This level
of abstraction provides additional methods of interaction. For example, a
camera-based multi-touch surface could make a distinction between gestures
performed with a blue gloved hand, and gestures performed with a green
gloved hand.
As mentioned in the introduction chapter [\ref{chapter:introduction}], the
scope of this thesis is limited to multi-touch surface based devices, for
which the \emph{event area} concept suffices. Section \ref{sec:eventfilter}
explores the possibility of replacing event areas with event filters.
\subsection{Area tree}
\label{sec:tree}
......@@ -362,7 +364,7 @@ goal is to test the effectiveness of the design and detect its shortcomings.
\subsection{Event propagation}
\label{sec:eventpropagation}
A problem occurs when event areas overlap, as shown by figure
Another problem occurs when event areas overlap, as shown by figure
\ref{fig:eventpropagation}. When the white square is rotated, the gray
square should keep its current orientation. This means that events that are
used for rotation of the white square, should not be used for rotation of
......@@ -416,14 +418,15 @@ goal is to test the effectiveness of the design and detect its shortcomings.
complex gestures, like the writing of a character from the alphabet,
require more advanced detection algorithms.
A way to detect complex gestures based on a sequence of input features
is with the use of machine learning methods, such as Hidden Markov Models
\footnote{A Hidden Markov Model (HMM) is a statistical model without a
memory; it can be used to detect gestures based on the current input state
alone.} \cite{conf/gw/RigollKE97}. A sequence of input states can be mapped
to a feature vector that is recognized as a particular gesture with a
certain probability. An advantage of using machine learning with respect to
an imperative programming style is that complex gestures can be described
One way to detect these complex gestures from a sequence of input events
is to use machine learning methods, such as the Hidden Markov
Models\footnote{A Hidden Markov Model (HMM) is a statistical model without
a memory; it can be used to detect gestures based on the current input
state alone.} used for sign language detection by
\cite{conf/gw/RigollKE97}. A sequence of input states can be mapped to a
feature vector that is recognized as a particular gesture with a certain
probability. An advantage of using machine learning with respect to an
imperative programming style is that complex gestures can be described
without the use of explicit detection logic. For example, the detection of
the character `A' being written on the screen is difficult to implement
using an imperative programming style, while a trained machine learning
......@@ -434,25 +437,31 @@ goal is to test the effectiveness of the design and detect its shortcomings.
sufficient to detect many common gestures, like rotation and dragging. The
imperative programming style is also familiar and understandable for a wide
range of application developers. Therefore, the architecture should support
an imperative style of gesture detection.
A problem with the imperative programming style is that the explicit
detection of different gestures requires different gesture detection
components. If these components are not managed well, the detection logic is
prone to become chaotic and over-complex.
To manage complexity and support multiple methods of gesture detection, the
architecture has adopted the tracker-based design as described by
\cite{win7touch}. Different detection components are wrapped in separate
an imperative style of gesture detection. A problem with an imperative
programming style is that the explicit detection of different gestures
requires different gesture detection components. If these components are
not managed well, the detection logic is prone to become chaotic and
over-complex.
To manage complexity and support multiple styles of gesture detection
logic, the architecture has adopted the tracker-based design as described
by \cite{win7touch}. Different detection components are wrapped in separate
gesture tracking units, or \emph{gesture trackers}. The input of a gesture
tracker is provided by an event area in the form of events. When a gesture
tracker detects a gesture, this gesture is triggered in the corresponding
event area. The event area then calls the callbacks which are bound to the
gesture type by the application. Figure \ref{fig:trackerdiagram} shows the
position of gesture trackers in the architecture.
tracker is provided by an event area in the form of events. Each gesture
detection component is wrapped in a gesture tracker with a fixed type of
input and output. Internally, the gesture tracker can adopt any programming
style. A character recognition component can use an HMM, whereas a tap
detection component defines a simple function that compares event
coordinates.
\trackerdiagram
When a gesture tracker detects a gesture, this gesture is triggered in the
corresponding event area. The event area then calls the callbacks which are
bound to the gesture type by the application. Figure
\ref{fig:trackerdiagram} shows the position of gesture trackers in the
architecture.
The use of gesture trackers as small detection units provides extendability
of the architecture. A developer can write a custom gesture tracker and
register it in the architecture. The tracker can use any type of detection
......