multitouch / Commits / 5252a330

Commit 5252a330, authored Jun 20, 2012 by Taddeüs Kroes
Tried to bring more structure into the Related Work section.
Parent: 5b2f6525

Showing 2 changed files with 130 additions and 75 deletions
docs/report.bib  +45  -0
docs/report.tex  +85  -75
docs/report.bib
@@ -185,3 +185,48 @@
   year = "2002"
 }
+
+@misc{kivy,
+  author       = "Mathieu Virbel and Thomas Hansen and Christopher Denter and Gabriel Pettier and Akshay Arora",
+  title        = "{Kivy}",
+  howpublished = "\url{http://kivy.org/}",
+  year         = "2011"
+}
+
+@inproceedings{VRPN,
+  author    = "Taylor, Russell M. and Hudson, Thomas C. and Seeger, Adam and Weber, Hans and Juliano, Jeffrey and Helser, Aron T.",
+  title     = "{VRPN: a device-independent, network-transparent VR peripheral system}",
+  booktitle = "{VRST}",
+  pages     = "55--61",
+  year      = "2001",
+  url       = "http://doi.acm.org/10.1145/505008.505019"
+}
+
+@unpublished{kivygesture,
+  author = "{Kivy organization}",
+  title  = "{Gesture recognition in Kivy}",
+  note   = "\url{http://kivy.org/docs/api-kivy.gesture.html}"
+}
+
+@manual{OpenNI2010,
+  author       = "{OpenNI organization}",
+  title        = "{OpenNI User Guide}",
+  organization = "OpenNI organization",
+  month        = "November",
+  year         = "2010",
+  url          = "http://www.openni.org/documentation"
+}
docs/report.tex
@@ -84,7 +84,7 @@ detection for every new gesture-based application.
 multi-touch surface devices. It presents a design for a generic gesture
 detection architecture for use in multi-touch based applications. A
 reference implementation of this design is used in some test case
-applications, whose goal is to test the effectiveness of the design and
+applications, whose purpose is to test the effectiveness of the design and
 detect its shortcomings.
 
 Chapter \ref{chapter:related} describes related work that inspired a design
@@ -102,72 +102,82 @@ detection for every new gesture-based application.
 
 \chapter{Related work}
 \label{chapter:related}
 
-% TODO: restructure
-\section{Existing application frameworks}
-Application frameworks for surface-touch devices, such as Nokia's Qt
-\cite{qt}, do already include the detection of commonly used gestures like
-\emph{pinch} gestures. However, this detection logic is dependent on the
-application framework. Consequently, an application developer who wants to
-use multi-touch interaction in an application is forced to use an
-application framework that includes support for multi-touch gestures.
-Moreover, the set of supported gestures is limited by the application
-framework of choice. To incorporate a custom event in an application, the
-application developer needs to extend the framework. This requires
-extensive knowledge of the framework's architecture. Also, if the same
-gesture is needed in another application that is based on another
-framework, the detection logic has to be translated for use in that
-framework.
-
-\section{Gesture and Activity Recognition Toolkit}
-The Gesture and Activity Recognition Toolkit (GART) \cite{GART} is a
-toolkit for the development of gesture-based applications. The toolkit
-states that the best way to classify gestures is to use machine learning.
-The programmer trains a program to recognize using the machine learning
-library from the toolkit. The toolkit contains a callback mechanism that
-the programmer uses to execute custom code when a gesture is recognized.
-Though multi-touch input is not directly supported by the toolkit, the
-level of abstraction does allow for it to be implemented in the form of a
-``touch'' sensor.
-
-The reason to use machine learning is the statement that gesture detection
-``is likely to become increasingly complex and unmanageable'' when using a
-set of predefined rules to detect whether some sensor input can be seen as
-a specific gesture. This statement is not necessarily true. If the
-programmer is given a way to separate the detection of different types of
-gestures and flexibility in rule definitions, over-complexity can be
-avoided.
-
-\section{Gesture recognition implementation for Windows 7}
-The online article \cite{win7touch} presents a Windows 7 application,
-written in Microsofts .NET. The application shows detected gestures in a
-canvas. Gesture trackers keep track of stylus locations to detect specific
-gestures. The event types required to track a touch stylus are ``stylus
-down'', ``stylus move'' and ``stylus up'' events. A
-\texttt{GestureTrackerManager} object dispatches these events to gesture
-trackers. The application supports a limited number of pre-defined
-gestures.
-
-An important observation in this application is that different gestures are
-detected by different gesture trackers, thus separating gesture detection
-code into maintainable parts.
+Applications that use gesture-based interaction need a graphical user
+interface (GUI) on which gestures can be performed. The creation of a GUI
+is a platform-specific task. For instance, Windows and Linux support
+different window managers. To create a window in a platform-independent
+application, the application would need to include separate functionalities
+for supported platforms. For this reason, GUI-based applications are often
+built on top of an application framework that abstracts platform-specific
+tasks. Frameworks often include a set of tools and events that help the
+developer to easily build advanced GUI widgets.
+
+% Existing frameworks (and why they're not good enough)
+Some frameworks, such as Nokia's Qt \cite{qt}, provide support for basic
+multi-touch gestures like tapping, rotation or pinching. However, the
+detection of gestures is embedded in the framework code in an inseparable
+way. Consequently, an application developer who wants to use multi-touch
+interaction in an application is forced to use an application framework
+that includes support for those multi-touch gestures that are required by
+the application. Kivy \cite{kivy} is a GUI framework for Python
+applications, with support for multi-touch gestures. It uses a basic
+gesture detection algorithm that allows developers to define custom
+gestures to some degree \cite{kivygesture} using a set of touch point
+coordinates. However, these frameworks do not provide support for extension
+with custom complex gestures.
+
+Many frameworks are also device-specific, meaning that they are developed
+for use on either a tablet, smartphone, PC or other device. OpenNI
+\cite{OpenNI2010}, for example, provides APIs only for natural interaction
+(NI) devices such as webcams and microphones. The concept of complex
+gesture-based interaction, however, is applicable to a much wider set of
+devices. VRPN \cite{VRPN} provides a software library that abstracts the
+output of devices, which enables it to support a wide set of devices used
+in Virtual Reality (VR) interaction. The framework makes the low-level
+events of these devices accessible in a client application using network
+communication. Gesture detection is not included in VRPN.
+
+% Methods of gesture detection
+The detection of high-level gestures from low-level events can be
+approached in several ways. GART \cite{GART} is a toolkit for the
+development of gesture-based applications, which states that the best way
+to classify gestures is to use machine learning. The programmer trains an
+application to recognize gestures using a machine learning library from the
+toolkit. Though multi-touch input is not directly supported by the toolkit,
+the level of abstraction does allow for it to be implemented in the form of
+a ``touch'' sensor. The reason to use machine learning is that gesture
+detection ``is likely to become increasingly complex and unmanageable''
+when using a predefined set of rules to detect whether some sensor input
+can be classified as a specific gesture.
+
+The alternative to machine learning is to use a predefined set of rules
+for each gesture. Manoj Kumar \cite{win7touch} presents a Windows 7
+application, written in Microsoft's .NET, which detects a set of basic
+directional gestures based on the movement of a stylus. The complexity of
+the code is managed by the separation of different gesture types into
+different detection units called ``gesture trackers''. The application
+shows that predefined gesture detection rules do not necessarily produce
+unmanageable code.
 
 \section{Analysis of related work}
-The simple Processing implementation of multi-touch events provides most of
-the functionality that can be found in existing multi-touch applications.
-In fact, many applications for mobile phones and tablets only use tap and
-scroll events. For this category of applications, using machine learning
-seems excessive. Though the representation of a gesture using a feature
-vector in a machine learning algorithm is a generic and formal way to
-define a gesture, a programmer-friendly architecture should also support
-simple, ``hard-coded'' detection code. A way to separate different pieces
-of gesture detection code, thus keeping a code library manageable and
-extendable, is to user different gesture trackers.
+Implementations for the support of complex gesture-based interaction do
+already exist. However, gesture detection in these implementations is
+device-specific (Nokia Qt and OpenNI) or limited to use within an
+application framework (Kivy).
+
+An abstraction of device output allows VRPN and GART to support multiple
+devices. However, VRPN does not incorporate gesture detection. GART does,
+but only in the form of machine learning algorithms. Many applications for
+mobile phones and tablets only use simple gestures such as taps. For this
+category of applications, machine learning is an excessively complex method
+of gesture detection. Manoj Kumar shows that when managed well, a
+predefined set of gesture detection rules is sufficient to detect simple
+gestures.
+
+This thesis explores the possibility of creating an architecture that
+combines support for multiple input devices with different methods of
+gesture detection.
 
 \chapter{Design}
 \label{chapter:design}
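To make the ``gesture tracker'' idea in the restructured text above concrete, the sketch below shows how rule-based detection can be split into small per-gesture units that all receive the same low-level point events. It only illustrates the concept taken from GART and Manoj Kumar's application; the class and method names (TapTracker, TrackerDispatcher, on_point_down and so on) are invented here and are not taken from the thesis's reference implementation.

# Minimal sketch of rule-based, per-gesture detection units (illustrative only).
import time


class TapTracker:
    """Detects a 'tap': a point that goes down and up quickly without moving far."""

    def __init__(self, callback, max_duration=0.3, max_distance=20):
        self.callback = callback          # called as callback(x, y) on a tap
        self.max_duration = max_duration  # max seconds between down and up
        self.max_distance = max_distance  # max pixels the point may travel
        self._down = {}                   # point id -> (x, y, timestamp)

    def on_point_down(self, point_id, x, y):
        self._down[point_id] = (x, y, time.time())

    def on_point_move(self, point_id, x, y):
        pass  # a plain tap ignores movement; a drag tracker would use it

    def on_point_up(self, point_id, x, y):
        if point_id not in self._down:
            return
        x0, y0, t0 = self._down.pop(point_id)
        if (time.time() - t0 <= self.max_duration
                and abs(x - x0) + abs(y - y0) <= self.max_distance):
            self.callback(x, y)


class TrackerDispatcher:
    """Forwards each low-level point event to every registered gesture tracker."""

    def __init__(self):
        self.trackers = []

    def add_tracker(self, tracker):
        self.trackers.append(tracker)

    def point_down(self, point_id, x, y):
        for t in self.trackers:
            t.on_point_down(point_id, x, y)

    def point_move(self, point_id, x, y):
        for t in self.trackers:
            t.on_point_move(point_id, x, y)

    def point_up(self, point_id, x, y):
        for t in self.trackers:
            t.on_point_up(point_id, x, y)

Adding support for a new gesture then amounts to adding another tracker class without touching the existing ones, which is the maintainability argument made in the text above.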
@@ -180,17 +190,17 @@ detection for every new gesture-based application.
 Application frameworks are a necessity when it comes to fast,
 cross-platform development. A generic architecture design should aim to be
 compatible with existing frameworks, and provide a way to detect and extend
-gestures independent of the framework. An application framework is written
-in a specific programming language. To support multiple frameworks and
-programming languages, the architecture should be accessible for
-applications using a language-independent method of communication. This
-intention leads towards the concept of a dedicated gesture detection
-application that serves gestures to multiple applications at the same time.
+gestures independent of the framework. Since an application framework is
+written in a specific programming language, the architecture should be
+accessible for applications using a language-independent method of
+communication. This intention leads towards the concept of a dedicated
+gesture detection application that serves gestures to multiple applications
+at the same time.
 
 This chapter describes a design for such an architecture. The architecture
 is represented as a diagram of relations between different components.
 Sections \ref{sec:multipledrivers} to \ref{sec:daemon} define requirements
-for the architecture, and extend the diagram with components that meet
+for the architecture, and extend this diagram with components that meet
 these requirements. Section \ref{sec:example} describes an example usage of
 the architecture in an application.
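The hunk above introduces the idea of a dedicated gesture detection application that serves gestures to clients over a language-independent channel. As a purely hypothetical illustration of that concept (the thesis does not define a concrete protocol at this point, and the reference implementation omits networking entirely), a client could subscribe to such a daemon over a plain socket and receive one JSON-encoded gesture per line; the port number, message format and gesture names below are invented for the example.

# Hypothetical client of a gesture detection daemon (all protocol details assumed).
import json
import socket

HOST, PORT = "localhost", 7777  # assumed address of the gesture daemon


def listen_for_gestures(handler):
    """Connect to the daemon and call handler(gesture_dict) for each message."""
    with socket.create_connection((HOST, PORT)) as conn:
        buf = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            buf += chunk
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                handler(json.loads(line))


# Example: react to 'tap' gestures served by the daemon.
# listen_for_gestures(lambda g: print("tap at", g["x"], g["y"])
#                     if g.get("name") == "tap" else None)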
@@ -255,7 +265,6 @@ detection for every new gesture-based application.
 
 \section{Restricting events to a screen area}
 \label{sec:areas}
-% TODO: in introduction: gestures are composed of multiple primitives
 Touch input devices are unaware of the graphical input
 widgets\footnote{``Widget'' is a name commonly used to identify an element
 of a graphical user interface (GUI).} rendered by an application, and
@@ -581,18 +590,20 @@ The reference implementation is written in Python and available at
 \item Basic tracker, supports $point\_down,~point\_move,~point\_up$ gestures.
 \item Tap tracker, supports $tap,~single\_tap,~double\_tap$ gestures.
 \item Transformation tracker, supports $rotate,~pinch,~drag$ gestures.
 \item Hand tracker, supports $hand\_down,~hand\_up$ gestures.
 \end{itemize}
 
 \textbf{Event areas}
 \begin{itemize}
 \item Circular area
 \item Rectangular area
 \item Polygon area
 \item Full screen area
 \end{itemize}
 
 The implementation does not include a network protocol to support the daemon
 setup as described in section \ref{sec:daemon}. Therefore, it is only usable in
-Python programs. Thus, the two test programs are also written in Python.
+Python programs. The two test programs are also written in Python.
 
 The event area implementations contain some geometric functions to determine
 whether an event should be delegated to an event area. All gesture trackers
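The geometric test mentioned above, deciding whether an event should be delegated to an event area, can be pictured with a small sketch. The classes below are illustrative only and do not reproduce the reference implementation's actual API; they merely show the kind of containment check a rectangular or circular area needs before its attached trackers are fed an event.

# Illustrative containment checks for event areas (not the actual API).


class RectangularArea:
    def __init__(self, x, y, width, height):
        self.x, self.y = x, y
        self.width, self.height = width, height

    def contains(self, px, py):
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


class CircularArea:
    def __init__(self, cx, cy, radius):
        self.cx, self.cy, self.radius = cx, cy, radius

    def contains(self, px, py):
        return (px - self.cx) ** 2 + (py - self.cy) ** 2 <= self.radius ** 2


def delegate_event(areas, px, py, event):
    """Deliver an event only to areas whose geometry contains the point."""
    for area, trackers in areas:
        if area.contains(px, py):
            for tracker in trackers:
                tracker.handle(event)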
@@ -819,8 +830,7 @@ complex objects such as fiducials, arguments like rotational position and
 acceleration are also included.
 
 ALIVE and SET messages can be combined to create ``point down'', ``point move''
-and ``point up'' events (as used by the Windows 7 implementation
-\cite{win7touch}).
+and ``point up'' events.
 
 TUIO coordinates range from $0.0$ to $1.0$, with $(0.0, 0.0)$ being the left
 top corner of the screen and $(1.0, 1.0)$ the right bottom corner. To focus
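A simplified sketch of the translation described above: comparing the session ids of consecutive ALIVE messages yields ``point down'' and ``point up'' events, SET messages become ``point move'' events, and the normalised TUIO coordinates are scaled to pixels. The screen size and the dispatch callback are assumptions made for the example; this is not the actual driver code from the implementation.

# Simplified TUIO-to-point-event translation (illustrative only).

SCREEN_W, SCREEN_H = 1920, 1080  # assumed screen size used for scaling


class TuioTranslator:
    def __init__(self, dispatch):
        self.dispatch = dispatch  # e.g. dispatch("point_down", sid, x, y)
        self.alive = set()        # session ids listed in the last ALIVE message
        self.positions = {}       # sid -> last known pixel position

    def on_set(self, sid, nx, ny):
        # Scale TUIO's (0.0, 0.0)-(1.0, 1.0) range to pixel coordinates,
        # with (0.0, 0.0) the top left corner of the screen.
        x, y = nx * SCREEN_W, ny * SCREEN_H
        self.positions[sid] = (x, y)
        if sid in self.alive:
            self.dispatch("point_move", sid, x, y)

    def on_alive(self, session_ids):
        current = set(session_ids)
        for sid in current - self.alive:              # newly appeared points
            x, y = self.positions.get(sid, (0, 0))
            self.dispatch("point_down", sid, x, y)
        for sid in self.alive - current:              # points that disappeared
            x, y = self.positions.pop(sid, (0, 0))
            self.dispatch("point_up", sid, x, y)
        self.alive = current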