Exploring Interaction Modes for Image Retrieval
Corey Engelman¹, Rui Li¹, Jeff Pelz², Pengcheng Shi¹, Anne Haake¹

¹ B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology, 1 Lomb Memorial Drive, Rochester, NY 14623-5603; {cde7825, rxl5604, spcast, arhics}@rit.edu
² College of Imaging Arts and Sciences, Rochester Institute of Technology, 1 Lomb Memorial Drive, Rochester, NY 14623-5603; jbppph@rit.edu

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
NGCA '11, May 26-27, 2011, Karlskrona, Sweden.
Copyright 2011 ACM 978-1-4503-0680-5/11/05…$10.00.

ABSTRACT
The number of digital images in use is growing at an increasing rate across a wide array of application domains. Consequently, there is an ever-growing need for innovative ways to help end users gain access to these images quickly and effectively. Moreover, it is becoming increasingly difficult to manually annotate these images, for example with text labels, to generate useful metadata. One method for helping users gain access to digital images is content-based image retrieval (CBIR). Practical use of CBIR systems has been limited by several “gaps”, including the well-known semantic gap and usability gaps [1]. Innovative designs are needed to bring end users into the loop to bridge these gaps. Our human-centered approaches integrate human perception and multimodal interaction to facilitate more usable and effective image retrieval. Here we show that multi-touch interaction is more usable than gaze-based interaction for explicit image region selection.

Categories and Subject Descriptors
H.5.2 [Information Interfaces and Presentation]: User Interfaces – graphical user interfaces, input devices and strategies, prototyping, user-centered design, voice I/O, interaction styles.

General Terms
Measurement, Performance, Design, Experimentation, Human Factors

Keywords
Multimodal, eye tracking, image retrieval, human-centered computing
1. INTRODUCTION
Research in CBIR has shown that image content is more expressive of users’ perception than is textual annotation. A semantic gap occurs, however, when low-level image features, such as color or texture, are insufficient to represent an image in a way that reflects human perception. One possible way to bridge the semantic gap is to take a “human-centered” approach to system design. This is particularly important in knowledge-rich domains, such as biomedical applications, where information about the images can be extracted from experts and utilized. Major questions remain as to how best to bring users “into the loop” [2,3].

Multimodal user interfaces are promising as the interactive component of CBIR systems because different modes are best suited to expressing different kinds of information. Recent research efforts have focused on developing and studying usability for multimodal interaction [4,5,6]. Designing natural, usable interaction will require an understanding of which user interactions should be explicit and which implicit. Consider query by example (QBE), which requires users to select a representative image and often a region of that image. It is the usual paradigm in CBIR, but users have difficulty forming such queries, so innovative new methods are needed to support QBE. Beyond QBE, more effective methods are needed for gaining input from the user for relevance feedback to refine the results of a search. This could be done explicitly, by having the user directly specify which images were close to what they were looking for, or implicitly, by simply noting which images they looked at with interest (e.g., via gaze). Finally, good organization of the images returned from a query is as important as the underlying retrieval system itself, in that it allows the user to quickly scan the results and find what they are looking for.

Our approach to overcoming the interactivity challenges of CBIR is largely based on bringing the user into the process by combining traditional modes of input, such as the keyboard and mouse, with interaction styles that may be more natural, such as gaze input (eye tracking), voice recognition, and multi-touch interaction. A software framework for such a system was developed using existing graphical user interface (GUI) libraries, along with several subcomponents that allow for interaction via the new methods within a GUI. With this basic framework for multimodal interface design in place, it is now possible to quickly develop and test prototypes for different interface layouts, and even for different modes of interaction, using one or more of the input modes (mouse, keyboard, gaze, voice, touch).

A series of studies will be performed to determine which of these prototypes are most efficient and usable across a range of image types and among varied end-user groups. The first of these, described here, studies modes of interaction for performing QBE through explicit region-of-interest selection. The main goal is to compare the efficiency of the different interaction methods, as well as user preference, ease-of-use, and ease-of-learning.

2. Methods
2.1 Design and Implementation
The best approach to developing a multimodal user interface such as the one described here is an evolutionary one: the overall goal of building a multimodal user interface is broken into smaller, obtainable goals, and these smaller portions are designed, implemented, tested, and integrated in turn. In this way, the developer can ensure that separate components are not dependent on one another, because one builds stand-alone subsystems and then integrates them.
2.1.1 Eye Tracking
A SensoMotoric Instruments (SMI) RED 250 Hz eye-tracking device was used to track the position of the user’s gaze on the monitor. SMI’s iView X software ran the eye tracker during use, and SMI’s Experiment Center was used to perform a calibration prior to use. Our custom software, written in Java, communicates with the device using the User Datagram Protocol (UDP) to send signals to the eye tracker to start and stop recording. Once the eye tracker receives the start signal, it begins streaming screen coordinates to the program. A separate program thread can then repeatedly read the new coordinates and update the variables corresponding to the user’s gaze. Because the human eye is naturally jittery, it is necessary to implement an algorithm for smoothing/filtering the data coming from the eye tracker. Because the system is developed in an object-oriented programming (OOP) language, implementing such functionality is as simple as creating an abstract Filter class and then creating several concrete implementations of it, which allows multiple different filtering algorithms to be added easily. Even this basic functionality affords a vast array of possibilities for how the eye input data can be used for interaction. For example, eye tracking could be used to replace mouse/keyboard scrolling and panning [7].
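As an illustration of this design, the sketch below pairs a receiver thread with the Filter abstraction. The plain-text "x y" packet format, the class names, and the moving-average strategy are assumptions for the example; the tracker's actual wire format and the study's code may differ.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.ArrayDeque;
import java.util.Deque;

// Abstract base class: each concrete subclass implements one smoothing strategy,
// so different filtering algorithms can be swapped in without touching the reader thread.
abstract class Filter {
    abstract void add(int x, int y);  // feed one raw gaze sample
    abstract int[] current();         // smoothed (x, y) screen coordinates
}

// Illustrative strategy: average the last N samples to damp natural eye jitter.
class MovingAverageFilter extends Filter {
    private final Deque<int[]> window = new ArrayDeque<>();
    private final int size;

    MovingAverageFilter(int size) { this.size = size; }

    @Override synchronized void add(int x, int y) {
        window.addLast(new int[] { x, y });
        if (window.size() > size) window.removeFirst();
    }

    @Override synchronized int[] current() {
        int sx = 0, sy = 0;
        for (int[] p : window) { sx += p[0]; sy += p[1]; }
        int n = Math.max(window.size(), 1);
        return new int[] { sx / n, sy / n };
    }
}

// Separate thread that receives the streamed coordinates and updates the filter.
class GazeReceiver extends Thread {
    private final Filter filter;
    private final DatagramSocket socket;

    GazeReceiver(Filter filter, int port) throws java.net.SocketException {
        this.filter = filter;
        this.socket = new DatagramSocket(port);
        setDaemon(true);
    }

    @Override public void run() {
        byte[] buf = new byte[256];
        while (!isInterrupted()) {
            try {
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                socket.receive(packet); // blocks until the tracker sends a sample
                String[] xy = new String(packet.getData(), 0, packet.getLength()).trim().split("\\s+");
                filter.add(Integer.parseInt(xy[0]), Integer.parseInt(xy[1]));
            } catch (Exception e) {
                break; // socket closed or malformed packet: stop listening
            }
        }
    }
}
```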
2.1.2 Voice Recognition
Java defines the Java Speech Application Programming Interface (JSAPI), which is implemented by several open-source libraries. Any implementation of the JSAPI is a suitable choice, as they all provide the functionality specified by Java. For our system, we chose the Cloud Garden JSAPI (http://www.cloudgarden.com). Beyond a library that implements the JSAPI, a speech recognition engine is required on the computer running the multimodal system. We used Windows Speech Recognition, because it is included in the Windows 7 operating system. A custom “grammar” can be written to specify which commands the system will accept; a simple controller can then be implemented to receive commands, interpret them, and pass them on to the proper event handler. Voice recognition has the potential to greatly increase the efficiency of interaction between system and user. Furthermore, it is simple to include basic functions such as a speech lock, so that the user can easily turn voice recognition on and off.
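To make the grammar-plus-controller pattern concrete, the following is a minimal sketch using the JSAPI 1.0 classes. The grammar covers the spoken commands used in this study ("set anchor", "start eye tracking", "stop eye tracking"); the grammar name and the dispatch method are hypothetical, and the actual controller may differ.

```java
import java.io.StringReader;
import javax.speech.Central;
import javax.speech.recognition.*;

public class VoiceController {
    // JSGF grammar listing the spoken commands the system will accept.
    static final String GRAMMAR =
        "#JSGF V1.0;\n" +
        "grammar commands;\n" +
        "public <command> = set anchor | start eye tracking | stop eye tracking;";

    public static void main(String[] args) throws Exception {
        // null selects a default engine (e.g., Windows Speech Recognition via Cloud Garden).
        Recognizer recognizer = Central.createRecognizer(null);
        recognizer.allocate();
        RuleGrammar grammar = recognizer.loadJSGF(new StringReader(GRAMMAR));
        grammar.setEnabled(true);
        recognizer.addResultListener(new ResultAdapter() {
            @Override public void resultAccepted(ResultEvent e) {
                Result result = (Result) e.getSource();
                StringBuilder spoken = new StringBuilder();
                for (ResultToken t : result.getBestTokens())
                    spoken.append(t.getSpokenText()).append(' ');
                dispatch(spoken.toString().trim());
            }
        });
        recognizer.commitChanges();
        recognizer.requestFocus();
        recognizer.resume();
    }

    // Placeholder controller: interprets a command and passes it to the proper handler.
    static void dispatch(String command) {
        System.out.println("Recognized command: " + command);
    }
}
```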
2.1.3 Multi-Touch Interaction
For multi-touch, an open-source library called MT4J (http://www.mt4j.org) was used. This library allows the Windows 7 touch-screen commands to be used within a Java application. From here, it is possible to implement custom gesture processors or to use a number of predefined processors. Touch interaction can be applied to QBE and to a number of other interactions with the user. Beyond this, the library allows creation of custom multi-touch user interface components. Another benefit is that it is simple to create stand-alone multi-touch applications and then embed them in the system. This follows the evolutionary prototyping methodology mentioned earlier, because it allows simple standalone prototypes to be developed and then integrated into the existing system. For our experiment, a Dell SX2210T touch-screen monitor was used.
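For readers unfamiliar with MT4J, a stand-alone prototype typically subclasses MTApplication and registers gesture processors on a scene's components. The sketch below follows that pattern from MT4J's published examples as we recall them; the package paths, the DragProcessor choice, and the scene contents are assumptions for illustration, not the selection component used in the study.

```java
import org.mt4j.MTApplication;
import org.mt4j.components.visibleComponents.shapes.MTEllipse;
import org.mt4j.input.inputProcessors.IGestureEventListener;
import org.mt4j.input.inputProcessors.MTGestureEvent;
import org.mt4j.input.inputProcessors.componentProcessors.dragProcessor.DragProcessor;
import org.mt4j.sceneManagement.AbstractScene;
import org.mt4j.util.math.Vector3D;

public class TouchPrototype extends MTApplication {
    public static void main(String[] args) {
        initialize(); // starts the MT4J application and eventually calls startUp()
    }

    @Override
    public void startUp() {
        addScene(new SelectionScene(this, "selection scene"));
    }

    private class SelectionScene extends AbstractScene {
        SelectionScene(MTApplication app, String name) {
            super(app, name);
            // A circle standing in for a selection widget.
            MTEllipse circle = new MTEllipse(app, new Vector3D(400, 300), 50, 50);
            // Register a predefined gesture processor and listen for its events.
            circle.registerInputProcessor(new DragProcessor(app));
            circle.addGestureListener(DragProcessor.class, new IGestureEventListener() {
                public boolean processGestureEvent(MTGestureEvent ge) {
                    // A real component would move itself or update a selection model here.
                    return false;
                }
            });
            getCanvas().addChild(circle);
        }
    }
}
```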
2.1.4 Traditional GUI Components
Because the subcomponents of the multimodal user interface were developed in Java, the Swing GUI libraries can be used to create traditional visual components and handle input from the mouse and keyboard. This also makes developing the basic framework for the user interface (i.e., windowing and layout structure) very simple, because Java’s Swing library includes classes for a UI window (JFrame) and the LayoutManager class for managing placement of components within the window. Furthermore, a system for rapid prototyping of UI layouts can be put in place to facilitate development. This involves creating an abstract class called PrototypeUI that inherits from Java’s JFrame class. Any number of prototype UI layouts can then be created and tested without changing the code for the core functionality of the system or for the previously mentioned subcomponents that handle the different modes of input.
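A minimal sketch of the PrototypeUI idea follows; only the class name and its JFrame parent come from the text above, while the template method and the example subclass are our illustrative assumptions.

```java
import java.awt.BorderLayout;
import javax.swing.JFrame;
import javax.swing.JPanel;

// Base class for rapid prototyping of UI layouts: each concrete prototype supplies
// only its layout, while the input subcomponents (gaze, voice, touch) stay unchanged.
public abstract class PrototypeUI extends JFrame {
    public PrototypeUI(String title) {
        super(title);
        setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        buildLayout(); // concrete prototypes arrange their components here
        pack();
    }

    /** Overridden by each prototype layout to place its components. */
    protected abstract void buildLayout();
}

// Example prototype: image display area in the center, interaction controls on the right.
class SideBySidePrototype extends PrototypeUI {
    SideBySidePrototype() {
        super("Side-by-side prototype");
    }

    @Override
    protected void buildLayout() {
        setLayout(new BorderLayout());
        add(new JPanel(), BorderLayout.CENTER); // image display area
        add(new JPanel(), BorderLayout.EAST);   // controls (sliders, buttons, etc.)
    }
}
```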
2.2 Experimental Design
To evaluate prototype interaction styles for QBE, we recruited 9 undergraduate and graduate students at Rochester Institute of Technology as study participants. Participants were given an explanation of the CBIR paradigm and of QBE, and then a brief tutorial on each prototype mode they would be using. For the study, they were shown a set of ten images, four separate times, in randomized order. Each of the four times they were shown the ten images, their task was to perform QBE by explicit region-of-interest selection using one of the four prototype methods of interaction. Because this study is concerned not with regions of interest within objects but rather with whether the user can effectively select an object, we instructed the user to select a specific object from each image (e.g., select the eight ball from an image of billiard balls on a pool table; see Figure 1C).

2.2.1 Image Selection
When choosing the images to use for the study, there were two main considerations. First, because we specified what to select, the images had to contain obvious, discrete objects, to eliminate ambiguity. Second, we wanted to test our four prototypes across a variety of images, and so we defined categories of images. These categories (simple, intermediate, and complex) were based on the complexity of the object the user was to select. For the simple category, we photographed billiard balls in different configurations. This covers both criteria, because the shape is simply a circle, and it allows us to instruct the user to select the eight ball. For the intermediate category, we used dice, which allowed us to construct a number of shapes of intermediate complexity. We considered them intermediate because the edges are always straight, and in a 2D image the shapes formed by the dice are essentially polygons. Finally, for the complex category, we chose images of horses. A horse is an obviously more complex shape than the previous examples, and it still allows for easy instruction of what to select, because each of the images contained a brown pony and a larger whitish/greyish horse.

2.2.2 Prototype Interaction Methods
2.2.2.1 The Anchor Method
The anchor method combines gaze, voice, and either the mouse or the touch screen. The user looks at the center of the object they want to select, then says the command “set anchor”. This places a small selection circle on screen where the user was looking. Next to this selection circle is a slider, which can slide left to decrease the radius of the selection circle or right to increase it. The slider can be adjusted using either mouse or touch, depending on the user’s preference.

2.2.2.2 Gaze Interaction
Unlike the anchor method, this method uses eye tracking almost exclusively. The user finds the object to select, then clicks a button using either mouse or touch screen to begin eye tracking. Once tracking is on, the program paints over the area the user glances over, to provide feedback. When finished, the user presses the same button to stop the eye tracker. Alternatively, eye tracking can be started by saying the command “start eye tracking” and stopped by saying “stop eye tracking”. While painting, saccades are not drawn; rather, fixations are visualized by placing translucent circles on the screen. The radius of each circle is determined by the fixation duration (i.e., a longer fixation means a larger radius).
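The fixation-painting feedback might be rendered with Java2D as in the sketch below; the color, opacity, and radius scaling constant are assumptions for illustration, as the text only states that the radius grows with fixation duration.

```java
import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Graphics2D;

// Paints translucent feedback circles for fixations: longer fixations give larger radii.
// Saccades are simply never drawn, since only fixation events reach this method.
class FixationPainter {
    private static final double PIXELS_PER_MS = 0.05; // illustrative scaling constant

    void paintFixation(Graphics2D g, int x, int y, long durationMs) {
        int radius = (int) (durationMs * PIXELS_PER_MS);
        g.setComposite(AlphaComposite.getInstance(AlphaComposite.SRC_OVER, 0.3f)); // translucency
        g.setColor(Color.GREEN);
        g.fillOval(x - radius, y - radius, 2 * radius, 2 * radius);
    }
}
```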
2.2.2.3 Mouse Selection
For this method, the user finds the object of interest and then presses and holds the mouse button to begin drawing a selection window. The selection auto-completes by always drawing a straight line from the point of the initial click to where the mouse is currently located. When the user finishes the selection, they simply release the mouse button.

2.2.2.4 Touch Selection
This method works similarly to mouse selection, except that rather than pointing and clicking with the mouse, the user traces the object with a finger to form the selection window. The window auto-completes in the same fashion as for mouse selection.
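The auto-completing selection window shared by the mouse and touch methods can be sketched with Java2D's GeneralPath, which closes the traced outline back to its first point; the class and method names below are illustrative, not the study's code.

```java
import java.awt.geom.GeneralPath;

// Builds a selection window from a traced outline (mouse drag or finger trace).
// The displayed shape always closes back to the starting point, matching the
// auto-complete behavior described above.
class TracedSelection {
    private final GeneralPath path = new GeneralPath();
    private boolean started = false;

    void addPoint(double x, double y) {
        if (!started) { path.moveTo(x, y); started = true; }
        else { path.lineTo(x, y); }
    }

    /** Shape to draw while tracing: a copy closed back to the first point. */
    GeneralPath currentOutline() {
        GeneralPath closed = new GeneralPath(path);
        if (started) closed.closePath(); // straight line back to the initial point
        return closed;
    }
}
```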


Figure 1. From left to right, images from the intermediate, complex, and simple categories. The first shows a selection made using the touch screen, the second uses gaze interaction, and the third uses the anchor method.
2.2.3 Metrics
To evaluate the usability attributes of efficiency and usefulness for each style of interaction, we defined several metrics. Accuracy was measured by calculating the area of the object in the image (in pixels) prior to selection, using the GNU Image Manipulation Program (GIMP), and then calculating the area of the object covered by a given selection, to determine the percentage of the object the user missed. Precision was determined by calculating how much of the user’s selection fell outside the object: the amount of excess selection (in pixels) was divided by the total selection (in pixels) to give the relative excess of the user’s selection. Efficiency of the different modes was determined by measuring the time (in seconds) to complete a selection. We also asked the users to rate each of the prototypes in three categories on a scale from one to five: ease-of-use, ease-of-learning, and how natural the method felt. Finally, we counted the number of times the user had to use the undo function. These last measurements reflect the usability of a prototype rather than its efficiency and accuracy.
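Concretely, given the pixel counts measured in GIMP, the two ratio metrics reduce to a few lines; the method and parameter names in this sketch are ours, not the study's.

```java
// Computes the study's ratio metrics from pixel counts measured in GIMP.
class SelectionMetrics {
    /** Percentage of the object the user missed (accuracy). */
    static double percentMissed(int objectArea, int selectedInsideObject) {
        return 100.0 * (objectArea - selectedInsideObject) / objectArea;
    }

    /** Excess selection as a percentage of the user's total selection (precision). */
    static double percentExcess(int selectedInsideObject, int selectedOutsideObject) {
        int totalSelection = selectedInsideObject + selectedOutsideObject;
        return 100.0 * selectedOutsideObject / totalSelection;
    }
}
```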
3. Data Analysis
3.1 Data Collection
Camtasia Studio (TechSmith) was used to record the screen during the study. Data were extracted from images captured from the video, which showed each participant’s selections for the ten images four separate times (once for each method). The data extracted included the area (in pixels) selected within the object and the area that was excess selection; again, the values were measured using GIMP. Viewing the data suggested that the most effective way to compare the four prototypes would be to show a measure of accuracy, displaying the percentage of the object that the participants missed; a measure of precision, showing excess selection as the percentage of the user’s total selection that was not the object; and a measure of efficiency, showing the time to complete each image.

3.2 Efficiency of Interaction Methods
Descriptive statistical analysis of the data was performed to determine the efficiency of the different prototypes in terms of accuracy, precision, and time to complete. Box plots were constructed to compare the different prototypes.

Figures 2.a-2.c. Box plots of the data collected from the nine participants on all four interaction methods for one of the images of horses: 2.a shows the percentage of the selection that was excess, 2.b shows the percentage of the object missed by the user, and 2.c shows the time taken to complete the selection.

In all three plots, the touch screen method has the most consistent results (the smallest box). The touch screen also has the lowest median value for percentage of the object missed and for time taken to complete. For percentage of excess selection, the mouse has the lowest median, but the touch screen still has a more consistent set of values, the bulk of which are lower than those from the mouse.
Table 1. Average values of excess selection, percentage of the object missed, and time taken for all four prototype methods.

            Anchor   Touch   Mouse   Gaze
Excess      48.4%    17.7%   17.1%   49.4%
Missed       9.0%     4.7%    9.8%    7.6%
Time (s)    17.6     13.9    16.3    20.8
3.3 User Preference
Table 2. Average user ratings (scale of one to five) and average undo usage for all four prototypes.

                    Anchor   Touch   Mouse   Gaze
Ease-of-use          2.9      4.5     4.7     3.3
Ease-of-learning     3.5      4.8     4.4     3.8
Natural              2.6      4.7     4.0     2.4
Undo usage           8        1       1       1
Table 2 clearly shows that the mouse and touch screen received higher ratings than the two methods using eye tracking. In general, the users were in agreement about the different prototypes, with the standard deviation being below one on average (SD ≈ 0.86). Undo usage was fairly low, with the average user pressing undo just once per ten images when using touch, mouse, or gaze; the anchor method, however, had significantly higher undo usage, with relatively high variance (SD ≈ 10.2). This is likely caused by a combination of the method’s steep learning curve, since it requires the user to coordinate three input modes, and the inaccuracy of the eye tracker (plus or minus two visual degrees), which plays a more significant role here: unlike the gaze method, where the user can see where they are painting and adjust their eyes, with the anchor method the user only discovers that the tracker is off after the anchor is placed, and must then click undo.
4. Conclusions
4.1.1 Eye Tracking Interaction Methods
This study clearly shows that using eye tracking for explicit user interaction in a task that requires the user to be precise and accurate is not effective. This is not surprising, since people have difficulty with the smooth pursuit that drawing or tracing activities might require when objects are stationary [8]. This, in combination with some inaccuracy of the eye tracker, does not allow enough accuracy with the interaction styles implemented for this study. It is more likely that implicit interaction, i.e., selection based on more natural gaze behavior as a user browses or examines an image, such as in [5,9], will be effective for QBE.
4.1.2 Touch Screen and Mouse Interaction Methods
For the user group studied here, touch screen and mouse show similar results for a tracing/selection task, with the touch screen generally slightly more efficient than the mouse. For the complexly shaped images, however, the difference is more pronounced: the touch screen is clearly more efficient than the mouse. This is likely because the touch screen is more natural than the mouse, even for technically savvy, college-age participants, as it is closer to humans’ natural way of interacting. In contrast, the mouse only somewhat mimics a natural interaction, and it requires the user to coordinate hand and eye without the hand being in their field of vision. Furthermore, the average user prefers to use a mouse or touch screen for this type of task.

4.1.3 Individual Differences
Finally, our study metrics show that interaction with the mouse and touch screen is generally consistent across participants, whereas there is greater variability with eye tracking. This probably occurs because using one’s eyes to select or trace something is not natural, and so while some people may learn the method very quickly, others will not.

4.1.4 Future Studies
Studies are ongoing to prototype and test additional interaction styles that may be useful for image retrieval. For example, a study of the efficiency of the different modes in a search-related task, such as scrolling, selecting an entire image from a set, or using gestures (see [10]), would be useful; it may be that for these types of tasks, the mouse and touch screen are not the most efficient. We are also engaged in using gaze for implicit interaction, as in [5,9], toward our long-term goal of creating adaptive, multimodal systems for image retrieval.

5. ACKNOWLEDGMENTS
This work is supported by NSF grant IIS-0941452. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

6. REFERENCES
[1] Deserno, T.M., Antani, S., and Long, R. Ontology of gaps in content-based image retrieval. J Digit Imaging, 22(2):202-215, 2009.
[2] Lew, M.S., Sebe, N., Djeraba, C., and Jain, R. Content-based multimedia information retrieval: State of the art and challenges. ACM Transactions on Multimedia Computing, Communications and Applications, 2(1):1-19, 2006.
[3] Müller, H., Michoux, N., Bandon, D., and Geissbuhler, A. A review of content-based image retrieval systems in medical applications: clinical benefits and future directions. Int J Med Inform, 73(1):1-23, 2004.
[4] Qvarfordt, P. and Zhai, S. Conversing with the user based on eye-gaze patterns. In Proc. CHI 2005, ACM, 221-230.
[5] Sadeghi, M., Tien, G., Hamarneh, G., and Atkins, M.S. Hands-free interactive image segmentation using eyegaze. In SPIE Medical Imaging, 2009.
[6] Ren, J., Zhao, R., Feng, D.D., and Siu, W.C. Multimodal interface techniques in content-based multimedia retrieval. In Proc. ICMI 2000, 634-641.
[7] Kumar, M. and Winograd, T. Gaze-enhanced scrolling techniques. In UIST: Symposium on User Interface Software and Technology, Newport, RI, 2007.
[8] Krauzlis, R.J. The control of voluntary eye movements: new perspectives. The Neuroscientist, 11(2):124-137, 2005. PMID 15746381.
[9] Santella, A., Agrawala, M., DeCarlo, D., Salesin, D., and Cohen, M. Gaze-based interaction for semi-automatic photo cropping. In Proc. CHI 2006.
[10] Heikkilä, H. and Räihä, K.-J. Speed and accuracy of gaze gestures. Journal of Eye Movement Research, 2009.
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Último (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Exploring Interaction Modes for Image Retrieval

General Terms
Measurement, Performance, Design, Experimentation, Human Factors

Keywords
Multimodal, eye tracking, image retrieval, human-centered computing

1. INTRODUCTION
Research in CBIR has shown that image content is more expressive of users' perception than is textual annotation. A semantic gap occurs, however, when low-level image features, such as color or texture, are insufficient to completely represent an image in a way that reflects human perception. One possible way to bridge the semantic gap is to take a "human-centered" approach in system design. This is particularly important in knowledge-rich domains, such as biomedical applications, where information about the images can be extracted from experts and utilized.

Our approach to overcoming the interactivity challenges of CBIR is largely based on bringing the user into the process by combining traditional modes of input, such as the keyboard and mouse, with interaction styles that may be more natural, such as gaze input (eye tracking), voice recognition, and multi-touch interaction. A software framework was developed for such a system by using existing graphical user interface (GUI) libraries and then designing several subcomponents that allow for interaction via the new methods within a GUI.
With the implementation of this basic framework for multimodal interface design, it is now possible to quickly develop and test prototypes for different interface layouts, and even prototypes for different modes of interaction, using one or more of the input modes (mouse, keyboard, gaze, voice, touch).

A series of studies will be performed to determine which of these prototypes are most efficient and usable across a range of image types and among varied end user groups. The first of these, described here, involves study of modes of interaction for performing QBE through explicit region of interest selection. The main goal is to effectively compare the efficiency of different interaction methods, as well as user preference, ease-of-use, and ease-of-learning.

1 B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology, 1 Lomb Memorial Drive, Rochester, NY 14623-5603; {cde7825, rxl5604, spcast, arhics}@rit.edu
2 College of Imaging Arts and Sciences, Rochester Institute of Technology, 1 Lomb Memorial Drive, Rochester, NY 14623-5603; {jbppph}@rit.edu

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
NGCA '11, May 26-27, 2011, Karlskrona, Sweden. Copyright 2011 ACM 978-1-4503-0680-5/11/05…$10.00.

2. Methods
2.1 Design and Implementation
The best approach to developing a multimodal user interface such as the one described here is an evolutionary approach. This means breaking the overall large goal of building a multimodal user interface into smaller obtainable goals, and designing, implementing, testing, and integrating these smaller portions. In this way, the developer can ensure that separate components are not dependent on one another, because one builds stand-alone subsystems and then integrates them.
2.1.1 Eye Tracking
The Sensomotoric Instruments (SMI) RED 250 Hz eye-tracking device was used to track the position of the user's gaze on the monitor. SMI's iView X software was used to run the eye tracker during use, and SMI's Experiment Center was used to perform a calibration prior to use. Our custom software, written in Java, communicates with the device using the User Datagram Protocol (UDP) to send signals to the eye tracker to start and stop recording. Once the eye tracker receives the start signal, it begins streaming screen coordinates to the program. A separate program thread can then repeatedly get the new coordinates and update the respective variables corresponding to the user's gaze. Because the human eye is naturally jittery, it is necessary to implement an algorithm for smoothing/filtering the data coming from the eye tracker. Because the system is developed in an object-oriented programming language, implementing such functionality is as simple as creating an abstract Filter class and then creating several instances of that abstract Filter. This allows multiple different filtering algorithms to be created easily. Even this functionality affords a vast array of possibilities for how the eye input data can be used for interaction. For example, eye tracking could be used to replace mouse/keyboard scrolling and panning [7].
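To make the Filter idea concrete, the following is a minimal sketch of a receiver thread paired with one possible smoothing strategy, assuming a plain-text "x y" UDP payload on an arbitrary port. The class names, port number, and message format are illustrative assumptions, not SMI's actual iView X protocol or the authors' code.

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.util.ArrayDeque;
    import java.util.Deque;

    // Abstract base for gaze smoothing, mirroring the abstract Filter class
    // described in Section 2.1.1.
    abstract class GazeFilter {
        abstract double[] apply(double x, double y);
    }

    // One possible concrete filter: a moving average over the last N samples.
    class MovingAverageFilter extends GazeFilter {
        private final Deque<double[]> window = new ArrayDeque<>();
        private final int size;

        MovingAverageFilter(int size) { this.size = size; }

        @Override
        double[] apply(double x, double y) {
            window.addLast(new double[] { x, y });
            if (window.size() > size) window.removeFirst();
            double sx = 0, sy = 0;
            for (double[] p : window) { sx += p[0]; sy += p[1]; }
            return new double[] { sx / window.size(), sy / window.size() };
        }
    }

    // Background thread that receives gaze samples over UDP and keeps the
    // latest smoothed screen coordinate for the UI thread to read.
    class GazeReceiver implements Runnable {
        private final GazeFilter filter;
        private volatile double gazeX, gazeY;

        GazeReceiver(GazeFilter filter) { this.filter = filter; }

        double[] latest() { return new double[] { gazeX, gazeY }; }

        @Override
        public void run() {
            // Port 4444 and the "x y" payload are assumptions for illustration.
            try (DatagramSocket socket = new DatagramSocket(4444)) {
                byte[] buf = new byte[256];
                while (!Thread.currentThread().isInterrupted()) {
                    DatagramPacket packet = new DatagramPacket(buf, buf.length);
                    socket.receive(packet);
                    String[] parts = new String(packet.getData(), 0,
                            packet.getLength()).trim().split("\\s+");
                    double[] smoothed = filter.apply(Double.parseDouble(parts[0]),
                                                     Double.parseDouble(parts[1]));
                    gazeX = smoothed[0];
                    gazeY = smoothed[1];
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

A UI thread could start this with new Thread(new GazeReceiver(new MovingAverageFilter(25))).start() and poll latest() on each repaint; swapping in a different GazeFilter subclass changes the smoothing algorithm without touching the receiver, which is the flexibility the abstract class is meant to buy.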
2.1.2 Voice Recognition
Java defines the Java Speech Application Programming Interface (JSAPI), implemented by several open source libraries. Any implementation of the JSAPI is a suitable choice, as they all perform the functionality specified by Java. For our system, we chose the Cloud Garden JSAPI (http://www.cloudgarden.com). Beyond a suitable library that implements the JSAPI, a speech recognition engine is required on the computer running the multimodal system. For our system, we used Windows Speech Recognition, because it is included in the Windows operating system (Windows 7). A custom "grammar" can be written to specify which commands the system will accept. Then a simple controller can be implemented to receive commands, interpret them, and pass them on to the proper event handler. Voice recognition has the potential to greatly increase the efficiency of interaction between system and user. Furthermore, it is simple to include basic functions such as a speech lock, so that the user can easily turn voice recognition on and off.
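As a sketch of the grammar-plus-controller pattern, the following shows how a JSGF grammar covering the prototypes' spoken commands might be loaded through the JSAPI. The grammar text, command set, and class name are our assumptions for illustration, not the authors' implementation.

    import java.io.StringReader;
    import javax.speech.Central;
    import javax.speech.recognition.FinalRuleResult;
    import javax.speech.recognition.Recognizer;
    import javax.speech.recognition.ResultAdapter;
    import javax.speech.recognition.ResultEvent;
    import javax.speech.recognition.ResultToken;
    import javax.speech.recognition.RuleGrammar;

    public class VoiceCommandController {
        // JSGF grammar listing the spoken commands the prototypes accept
        // (assumed here from the commands described in Section 2.2.2).
        private static final String GRAMMAR =
            "#JSGF V1.0;\n" +
            "grammar commands;\n" +
            "public <command> = set anchor | start eye tracking" +
            " | stop eye tracking | undo;";

        public static void main(String[] args) throws Exception {
            Recognizer recognizer = Central.createRecognizer(null); // default engine
            recognizer.allocate();
            RuleGrammar grammar = recognizer.loadJSGF(new StringReader(GRAMMAR));
            grammar.setEnabled(true);
            recognizer.addResultListener(new ResultAdapter() {
                @Override
                public void resultAccepted(ResultEvent e) {
                    FinalRuleResult result = (FinalRuleResult) e.getSource();
                    StringBuilder spoken = new StringBuilder();
                    for (ResultToken t : result.getBestTokens()) {
                        spoken.append(t.getSpokenText()).append(' ');
                    }
                    // A real controller would dispatch to the matching handler here.
                    System.out.println("Recognized: " + spoken.toString().trim());
                }
            });
            recognizer.commitChanges();
            recognizer.requestFocus();
            recognizer.resume();
        }
    }

Because the grammar enumerates the accepted phrases, anything outside it is rejected by the engine, which keeps the controller's dispatching logic trivial.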
2.1.3 Multi-Touch Interaction
For multi-touch, an open source library called MT4J (http://www.mt4j.org) was used. This library allows the Windows 7 touch screen commands to be used within a Java application. From here, it is possible to implement custom gesture processors or use a number of predefined processors. Touch interaction can be applied to QBE and a number of other interactions with the user. Beyond this, the library allows creation of custom multi-touch user interface components. Another benefit is that it is simple to create stand-alone multi-touch applications and then embed them in the system. This follows the previously mentioned evolutionary prototyping methodology, because it easily allows simple standalone prototypes to be developed and then integrated into the existing system. For our experiment, a Dell SX2210T Touch Screen Monitor was used.

2.1.4 Traditional GUI Components
Because the subcomponents of the multimodal user interface were developed in Java, the Swing GUI libraries can be used to create traditional visual components and handle input from the mouse and keyboard. This also makes developing the basic framework for the user interface (i.e. windowing and layout structure) very simple, because Java's Swing library includes classes for a UI window (JFrame) and the LayoutManager class for managing placement of components within the window. Furthermore, a system for allowing rapid prototyping of UI layouts can be put in place to facilitate development. This involves creating an abstract class called PrototypeUI that inherits from Java's JFrame class. Any number of prototype UI layouts can be created and tested without changing the code for the core functionality of the system or for the previously mentioned subcomponents that handle the different modes of input.
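The paper names the PrototypeUI class but does not show it; below is a minimal sketch of how such a base class might look, with a hypothetical buildLayout() hook and an example subclass of our own invention.

    import java.awt.BorderLayout;
    import javax.swing.JButton;
    import javax.swing.JFrame;
    import javax.swing.JLabel;
    import javax.swing.SwingUtilities;

    // Base class for interchangeable UI prototypes (Section 2.1.4). Subclasses
    // only arrange components; the input-handling subsystems stay untouched.
    abstract class PrototypeUI extends JFrame {
        PrototypeUI(String title) {
            super(title);
            setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            setLayout(new BorderLayout());
            buildLayout();   // each prototype supplies its own arrangement
            pack();
        }

        // Hook method (hypothetical name): add and arrange components here.
        protected abstract void buildLayout();
    }

    // Example layout: an image area plus a start/stop eye-tracking button.
    class GazeSelectionUI extends PrototypeUI {
        GazeSelectionUI() { super("Gaze selection prototype"); }

        @Override
        protected void buildLayout() {
            add(new JLabel("image goes here"), BorderLayout.CENTER);
            add(new JButton("Start eye tracking"), BorderLayout.SOUTH);
        }

        public static void main(String[] args) {
            SwingUtilities.invokeLater(() -> new GazeSelectionUI().setVisible(true));
        }
    }

Each experimental condition then becomes one small subclass, which is what lets layouts be swapped without disturbing the core system.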
2.2 Experimental Design
To evaluate prototype interaction styles for QBE, we recruited 9 undergraduate and graduate students at Rochester Institute of Technology as study participants. Participants were given an explanation of the CBIR paradigm and of QBE, and then were given a brief tutorial on each prototype mode they would be using. For the study they were shown a set of ten images, four separate times, in randomized order. Each of the four times they were shown the ten images, their task was to perform QBE by explicit region of interest selection using one of the four prototype methods of interaction. Because in this study we are concerned not with regions of interest within objects but with whether the user can effectively select an object, we instructed the user to select a specific object from each image (e.g. select the eight ball from an image of billiard balls on a pool table; see Figure 1C).

2.2.1 Image Selection
When choosing the images to use for the study, there were two main considerations. First, because we specified what to select, there was a requirement for obvious, discrete objects in the image, to eliminate ambiguity. Next, we wanted to test our four prototypes across a variety of images, and so we defined categories of images. These categories (simple, intermediate, and complex) were based on the complexity of the object the user was to select. For the simple category, we photographed billiard balls in different configurations. This covers both criteria, because the shape is simply a circle, and it allows us to instruct the user to select the eight ball. For the intermediate category, we used dice. This allowed us to construct a number of shapes of intermediate complexity. We considered them intermediate because the edges were always straight, and in a 2D image the shapes formed by the dice are essentially polygons. Finally, for the complex images, we chose images of horses. This is obviously a more complex shape than the previous examples, and it still allows for easy instruction of what to select, because each of the images contained a brown pony and a larger whitish/greyish horse.

2.2.2 Prototype Interaction Methods

2.2.2.1 The Anchor Method
The anchor method combines the interaction styles of gaze, voice, and either the mouse or the touch screen. The user looks at the center of the object they want to select, then says the command "set anchor". This places a small selection circle on screen where the user was looking. Next to this selection circle is a slider, which can slide left to decrease the radius of the selection circle or right to increase it. The slider can be adjusted using either mouse or touch, depending on the user's preference.

2.2.2.2 Gaze Interaction
Unlike the anchor method, this method uses eye tracking almost exclusively. The user finds the object to select, then clicks a button using either mouse or touch screen to begin eye tracking. Once it is turned on, the program begins painting over the area to provide feedback as the user glances over the object. When finished, the user presses the same button to stop the eye tracker. Alternatively, eye tracking can be started by saying the command "start eye tracking" and stopped by saying "stop eye tracking". While painting, saccades are not drawn; rather, fixations are visualized by placing translucent circles on the screen. The radius of the circle is determined by the fixation duration (i.e. a longer fixation duration means a larger radius).

2.2.2.3 Mouse Selection
For this method, the user finds the object of interest and then presses and holds the mouse button to begin drawing a selection window. The selection auto-completes by always drawing a straight line from the point of the initial click to where the mouse is currently located. When the user finishes their selection, they simply release the mouse button.

2.2.2.4 Touch Selection
This method works similarly to mouse selection, except that rather than pointing and clicking with the mouse, the user traces the object with a finger to form the selection window. The window auto-completes in the same fashion as for mouse selection.

Figure 1. From left to right, images from the intermediate, complex and simple categories. The first is a selection made using the touch screen, the second uses gaze interaction, and the third uses the anchor method.
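The auto-completing selection window of Sections 2.2.2.3 and 2.2.2.4 can be illustrated with a small Swing panel. This is a reconstruction under our own assumptions (class name, rendering choices), not the authors' code.

    import java.awt.Color;
    import java.awt.Graphics;
    import java.awt.Graphics2D;
    import java.awt.Point;
    import java.awt.Polygon;
    import java.awt.event.MouseAdapter;
    import java.awt.event.MouseEvent;
    import java.util.ArrayList;
    import java.util.List;
    import javax.swing.JFrame;
    import javax.swing.JPanel;
    import javax.swing.SwingUtilities;

    // Sketch of an auto-completing freehand selection window.
    class FreehandSelectionPanel extends JPanel {
        private final List<Point> trace = new ArrayList<>();
        private boolean closed = false;

        FreehandSelectionPanel() {
            MouseAdapter handler = new MouseAdapter() {
                @Override public void mousePressed(MouseEvent e) {
                    trace.clear(); closed = false; trace.add(e.getPoint());
                }
                @Override public void mouseDragged(MouseEvent e) {
                    trace.add(e.getPoint()); repaint();
                }
                @Override public void mouseReleased(MouseEvent e) {
                    closed = true; repaint();   // releasing finalizes the selection
                }
            };
            addMouseListener(handler);
            addMouseMotionListener(handler);
        }

        @Override protected void paintComponent(Graphics g) {
            super.paintComponent(g);
            if (trace.size() < 2) return;
            Graphics2D g2 = (Graphics2D) g;
            g2.setColor(Color.RED);
            for (int i = 1; i < trace.size(); i++) {
                g2.drawLine(trace.get(i - 1).x, trace.get(i - 1).y,
                            trace.get(i).x, trace.get(i).y);
            }
            // Auto-complete: always draw a straight edge from the current
            // point back to the initial press point.
            Point first = trace.get(0), last = trace.get(trace.size() - 1);
            g2.drawLine(last.x, last.y, first.x, first.y);
            if (closed) {                        // fill the finished selection
                Polygon poly = new Polygon();
                for (Point p : trace) poly.addPoint(p.x, p.y);
                g2.setColor(new Color(255, 0, 0, 40));
                g2.fillPolygon(poly);
            }
        }

        public static void main(String[] args) {
            SwingUtilities.invokeLater(() -> {
                JFrame f = new JFrame("Selection demo");
                f.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                f.add(new FreehandSelectionPanel());
                f.setSize(640, 480);
                f.setVisible(true);
            });
        }
    }

Drawing the closing edge on every repaint means the selection is a valid polygon at all times, which is what lets the user release the button anywhere without leaving an open contour.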
2.2.3 Metrics
To evaluate the usability attributes of efficiency and usefulness for each style of interaction, we defined several metrics. Accuracy was measured by calculating the area of the object in the image (in pixels) prior to selection, using the GNU Image Manipulation Program (GIMP), and then calculating the area of the object in a given selection, to determine the percentage of the object the user missed. Precision was determined by calculating how much of the user's selection was outside of the object: the amount of excess selection (in pixels) was divided by the total selection (in pixels) to calculate a relative excess value of the user's selection. Efficiency of the different modes was determined by measuring the time (in seconds) to complete a selection. We also asked the users to rate each of the prototypes in three categories on a scale from one to five: ease-of-use, ease-of-learning, and how natural the method felt. In addition, we counted the number of times the user had to use the undo function. These measurements reflect the usability of a prototype rather than its efficiency and accuracy.

3. Data Analysis

3.1 Data Collection
Camtasia Studio (TechSmith) was used to record the screen during the study. Data were extracted from the images captured from the video. These images showed the participants' selections for each of the ten images four separate times (one for each method). The data extracted included the area (in pixels) that they selected within the object, and the area that was excess selection. Again, the values were measured using GIMP. Viewing of the data suggested that the best way to effectively show the comparison of the four prototypes would be to show a measure of accuracy by displaying the percentage of the object that the participants missed, a measure of precision by showing excess selection as the percentage of the user's total selection that was not the object, and a measure of efficiency by showing the time to complete the image.

3.2 Efficiency of Interaction Methods
Descriptive statistical analysis of the data was performed to determine the efficiency of the different prototypes in terms of accuracy, precision, and time to complete. Box plots were constructed to compare the different prototypes.

Figures 2.a-2.c show box plots of the data collected from the nine participants on all four interaction methods for one of the images of horses: 2.a shows the percentage of the selection that was excess, 2.b shows the percentage of the object missed by the user, and 2.c shows the time taken to complete the selection.

In all three of the plots, the touch screen method has the most consistent results (smallest box). The touch screen also has the lowest median value for percentage of the object missed and for time taken to complete. For percentage of excess selection, the mouse has the lowest median, but the touch screen still had a more consistent set of values, in which the bulk of the values were lower than those from the mouse.
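The missed and excess measures of Section 2.2.3 reduce to simple pixel counting. A sketch is shown below, assuming binary pixel masks for the object and the selection; the mask representation and class name are our assumptions, and in the study itself these areas were measured manually with GIMP rather than computed in code.

    // Sketch of the Section 2.2.3 measures computed from pixel masks.
    final class SelectionMetrics {

        // Accuracy: percentage of the object's pixels not covered by the selection.
        static double percentMissed(boolean[][] object, boolean[][] selection) {
            long objectArea = 0, covered = 0;
            for (int y = 0; y < object.length; y++) {
                for (int x = 0; x < object[y].length; x++) {
                    if (object[y][x]) {
                        objectArea++;
                        if (selection[y][x]) covered++;
                    }
                }
            }
            return 100.0 * (objectArea - covered) / objectArea;
        }

        // Precision: percentage of the selection's pixels lying outside the object.
        static double percentExcess(boolean[][] object, boolean[][] selection) {
            long selectionArea = 0, excess = 0;
            for (int y = 0; y < selection.length; y++) {
                for (int x = 0; x < selection[y].length; x++) {
                    if (selection[y][x]) {
                        selectionArea++;
                        if (!object[y][x]) excess++;
                    }
                }
            }
            return 100.0 * excess / selectionArea;
        }
    }

Note that the two measures have different denominators: missed is relative to the object's area, while excess is relative to the user's total selection, matching the definitions above.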
Table 1. Average values of excess selection, percentage of the object missed, and time taken for all four prototype methods.

                      Anchor Method   Touch   Mouse   Gaze
    Excess            48.4%           17.7%   17.1%   49.4%
    Missed             9.0%            4.7%    9.8%    7.6%
    Time (s)          17.6            13.9    16.3    20.8

3.3 User Preference

Table 2. Average values of user preference (scale of one to five) and average undo usage for all four prototypes.

                      Anchor   Touch   Mouse   Gaze
    Ease-of-Use        2.9      4.5     4.7     3.3
    Ease-of-Learning   3.5      4.8     4.4     3.8
    Natural            2.6      4.7     4.0     2.4
    Undo Usage         8        1       1       1

Table 2 clearly shows that the mouse and touch screen received higher ratings than the two methods using eye tracking. In general, the users were in agreement about the different prototypes, with the standard deviation on average being below one (SD ≈ 0.86). Undo usage was fairly low, with the average user pressing undo just once per ten images when using touch, mouse, or gaze. However, the anchor method had a significantly higher undo usage, and the variance in undo usage for the anchor method is relatively high (SD ≈ 10.2). This variance is likely caused in part by the high learning curve of the method, which requires the user to coordinate use of three input methods. Furthermore, the inaccuracy of the eye tracker, plus or minus two visual degrees, plays a more significant role here: unlike the gaze method, where the user can see where they are painting and adjust their eyes, with the anchor method an off-target tracker only becomes apparent after the anchor is placed, and the user must then click undo.
4. Conclusions

4.1.1 Eye Tracking Interaction Methods
This study shows clearly that using eye tracking for explicit user interaction is not effective in a task that requires the user to be precise and accurate. This is not surprising, since people have difficulty with smooth pursuit, which might be required for drawing or tracing activities, when objects are stationary [8]. This, in combination with some inaccuracy of the eye tracker, does not allow enough accuracy using the interaction styles implemented for this study. It is more likely that implicit interaction, i.e. selection based on more natural gaze behavior as a user is browsing or examining an image, such as in [5,9], will be effective for QBE.

4.1.2 Touch Screen and Mouse Interaction Methods
For the user group studied here, touch screen and mouse show similar results for a task such as tracing/selecting. The general case is that the touch screen is slightly more efficient than the mouse. When we consider the images from the complex shape category, however, the trend is no longer slight: there the touch screen is clearly more efficient than the mouse. This is likely because the touch screen is more natural than the mouse, even for technically savvy, college-age participants, since it is closer to the human's natural interaction process. In contrast, the mouse somewhat mimics a natural interaction but requires the user to coordinate between their hand and eye without the hand being in their field of vision. Furthermore, the average user prefers to use a mouse or touch screen for this type of task.

4.1.3 Individual Differences
Finally, our study metrics show that interaction with the mouse and touch screen is generally consistent across participants, whereas there is greater variability with eye tracking. This probably occurs because using one's eyes to select or trace something is not natural, and so while some people may learn the method very quickly, others will not.

4.1.4 Future Studies
Studies are ongoing to prototype and test additional interaction styles which may be useful for image retrieval. For example, a study of the efficiency of different modes in a search-related task, like scrolling, selection of an entire image from a set, or using gestures (see [10]), would be useful. This would be interesting because it might be the case that in these types of tasks the mouse and touch screen are not the most efficient. We are also engaged in using gaze for implicit interaction, such as in [5,9], towards our long-term goals of creating adaptive, multimodal systems for image retrieval.

5. ACKNOWLEDGMENTS
This work is supported by NSF grant IIS-0941452. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

6. REFERENCES
[1] Deserno, T.M., Antani, S., and Long, R. Ontology of gaps in content-based image retrieval. J Digit Imaging, 22(2):202-215, 2009.
[2] Lew, M.S., Sebe, N., Djeraba, C., and Jain, R. Content-based multimedia information retrieval: State of the art and challenges. ACM Transactions on Multimedia Computing, Communications and Applications, 2(1):1-19, 2006.
[3] Müller, H., Michoux, N., Bandon, D., and Geissbuhler, A. A review of content-based image retrieval systems in medical applications: clinical benefits and future directions. Int J Med Inform, 73(1):1-23, 2004.
[4] Qvarfordt, P. and Zhai, S. Conversing with the User Based on Eye-Gaze Patterns. In Proc. CHI 2005, ACM, 221-230.
[5] Sadeghi, M., Tien, G., Hamarneh, G., and Atkins, M.S. Hands-free Interactive Image Segmentation Using Eyegaze. In SPIE Medical Imaging, 2009.
[6] Ren, J., Zhao, R., Feng, D.D., and Siu, W. Multimodal Interface Techniques in Content-Based Multimedia Retrieval. In Proceedings of ICMI 2000, 634-641.
[7] Kumar, M. and Winograd, T. Gaze-enhanced Scrolling Techniques. In UIST: Symposium on User Interface Software and Technology, Newport, RI, 2007.
[8] Krauzlis, R.J. The control of voluntary eye movements: new perspectives. The Neuroscientist, 11(2):124-137, 2005. PMID 15746381.
[9] Santella, A., Agrawala, M., DeCarlo, D., Salesin, D., and Cohen, M. Gaze-Based Interaction for Semi-Automatic Photo Cropping. In Proc. CHI 2006: Collecting and Editing Photos.
[10] Heikkilä, H. and Räihä, K.-J. Speed and Accuracy of Gaze Gestures. Journal of Eye Movement Research, 2009.