SlideShare uma empresa Scribd logo
1 de 4
Baixar para ler offline
Eye Movement as an Interaction Mechanism for Relevance Feed-
                  back in a Content-Based Image Retrieval System
                Yun Zhang*1,2                                                                                                                   ¶
                                                                                 †                ‡
                                                                   Hong Fu 2, Zhen Liang 2, Zheru Chi 2
                                                                                                                §
                                                                                                                                   Dagan Feng 2,3,
       1                                                                                                              3
         School of Computer Science                                    2
                                                                         Centre for Multimedia Signal                     School of Information Technologies
      Northwestern Polytechnical Uni-                               Processing Department of Electronic                       The University of Sydney
       versity, Xi’an, Shaanxi, China                                   and Information Engineering                               Sydney, Australia
                                                                   The Hong Kong Polytechnic University
                                                                             Hong Kong, China
                                                                                                  ver, the subjective nature of human annotation adds another
Abstract
                                                                                                  dimension of difficulty in managing image database.
Relevance feedback (RF) mechanisms are widely adopted in                                          CBIR is an alternative solution to retrieve images. However,
Content-Based Image Retrieval (CBIR) systems to improve                                           after years of rapid growth since 1990s [Flickner et al.1995], the
image retrieval performance. However, there exist some intrinsic                                  gaps between low level features and semantic contents of images
problems: (1) the semantic gap between high-level concepts and                                    holds back the progress and has entered a plateau phase. Such
low-level features and (2) the subjectivity of human perception                                   gaps can be concretely outlined into three aspects: (1) image
of visual contents. The primary focus of this paper is to evaluate                                representation (2) similarity measure (3) user’s interaction. Most
the possibility of inferring the relevance of images based on eye                                 of the image representations are based on intuitiveness of the
movement data. In total, 882 images from 101 categories are                                       researchers and the fulfillment of mathematics, instead of hu-
viewed by 10 subjects to test the usefulness of implicit RF,                                      man’s eye behavior. Do the features extracted reflect humans’
where the relevance of each image is known beforehand. A set                                      understanding of the image’s content? There is no clear answer
of measures based on fixations are thoroughly evaluated which                                     to this question. Similarity measure is highly dependent on the
include fixation duration, fixation count, and the number of revi-                                features and structures used in image representation. Moreover,
sits. Finally, the paper proposes a decision tree to predict the                                  developing better distance descriptors and refining similarity
user’s input during the image searching tasks. The prediction                                     measures are also very challenging. User interaction can be a
precision of the decision tree is over 87%, which spreads light                                   feasible approach to answer the question and to improve the
on a promising integration of natural eye movement into CBIR                                      image retrieval performance. In the Relevance Feedback (RF)
systems in the future.                                                                            process, the user is asked to refine the searching by providing
CR Categories: H.3.3 [Information Storage and Retrieval]:                                         explicit RF, such as selecting Areas-of-Interest (AOIs) from the
Information Search and Retrieval—Relevance feedback, Search                                       query image, or to tick positive and negative samples from re-
Process; H.5.2 [Information Interfaces and Representation]:                                       trieves. In the past few years, many articles reported that RF can
User interfaces                                                                                   help to establish the association between the low-level features
Keywords: Eye Tracking, Relevance Feedback (RF), Content-                                         and the semantics of images and to improve the retrieval per-
Based Image Retrieval (CBIR), Visual Perception                                                   formance [Liu et al.2006; Dacheng Tao et al.2008].

1       Introduction                                                                              However, the explicit feedback is laborious for the user and
                                                                                                  limited in complexity. In this paper, we propose an eye move-
Numerous digital images are being produced everyday from                                          ment based implicit feedback as a rich and natural source to
digital cameras, medical devices, security monitors, and other                                    replace the time-consuming and expensive explicit feedback. As
image capturing apparatus. It has become more and more diffi-                                     far as we know, there are just a few preliminary studies on im-
cult to retrieve a desired picture even from a photo album on a                                   plementing some general eye movement features in image re-
home computer because of the exponential increase in the num-                                     trieval. One is from Oyekoya and Setntiford’s work [Oyekoya
ber of images. Most traditional and common methods of im-                                         and Stentiford.2004; Oyekoya and Stentiford.2006]. They made
age retrieval based on metadata, such as textual annotations or                                   an investigation into the fixation duration and found that they
user-specified tags, have become the industry standard for re-                                    are different on images with/without a clear AOI. The other
trieval from large image collections. However, manual image                                       work was reported by Klami et al. [Klami et al. 2008]. They
annotation is time-consuming, laborious and expensive. Moreo-                                     proposed nine-feature vectors from different forms of fixations
    *email: tvsunny@gmail.comemail:
    †
                                                                                                  and saccades and used a classifier to predict one relevant image
      email:zhenliang@eie.polyu.edu.hk                                                            from four candidates.
    ‡
      email:enhongfu@inet.polyu.edu.hk
    ‖
      email:enzheru@inet.polyu.edu.hk                                                             Different from the previous work, the study reported in this pa-
    §
      email: feng@it.usyd.edu.au                                                                  per attempts to simulate a more real and complex image retrieval
                                                                                                  situation and to quantitatively analyze the correlation between
                                                                                                  users’ eye behavior and target images (positive images). In our
Copyright © 2010 by the Association for Computing Machinery, Inc.
                                                                                                  experiments, the images come from a wide variety of web
Permission to make digital or hard copies of part or all of this work for personal or             sources, and in each task, the query image and the numbers of
classroom use is granted without fee provided that copies are not made or distributed             positive images are varied from time to time. We evaluated the
for commercial advantage and that copies bear this notice and the full citation on the            significance of fixation durations, fixation counts, and the num-
first page. Copyrights for components of this work owned by others than ACM must be
honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on
                                                                                                  ber of revisits to provide a systematic interoperation of the us-
servers, or to redistribute to lists, requires prior specific permission and/or a fee.            er’s attention and effort allocation in eye movements, laying a
Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail
permissions@acm.org.
ETRA 2010, Austin, TX, March 22 – 24, 2010.
© 2010 ACM 978-1-60558-994-7/10/0003 $10.00

                                                                                             37
concrete and substantial foundation to involve natural eye                  Ten participants took part in the study, four females and six
  movement as a robust RF source [Zhou and Huang. 2003].                      males in an age range from 20 to 32 all with an academic back-
                                                                              ground. All of them are proficient computer users, and half of
  The rest of the paper is organized as follows. Section 2 introduc-          them have had experience of using an eye tracking system. Their
  es experimental design and setting for relevance feedback tasks             visions are either normal or correct-to-normal. The participants
  and the corresponding eye movement data collecting. In Section              were asked to complete two sets of the above mentioned image
  3, we report our thorough investigation on using fixation dura-             searching tasks and the gaze data are recorded with a 60 Hz
  tion, fixation count and the numbers of revisits for the prediction         sampling rate. Afterwards the participants were asked to indicate
  of relevant images. These factors are performed with the ANO-               which images they have chosen as positive images to ensure the
  VA test to reveal their significances and interconnections. Sec-            accuracy of a further analysis on their eye movement data. The
  tion 4 proposes a decision tree model to predict the user’s input           eye tracker is non-intrusive and allows a 300x220x300 mm free
  during the images searching tasks. Finally, we conclude the                 head movement space. Different candidate images and the loca-
  results and propose the future work.                                        tions of positive images are ensured in and between each set of
                                                                              the task. In other words, no two images are the same and no two
  2         Design of Experiments                                             stimuli have the same positive image locations. This is to reduce
                                                                              the memory effects and to simulate the natural relevance feed-
  2.1              Task Setup                                                 back situation.
  We study an image searching task which reflects kinds of activi-
  ties occurring in a complete CBIR system. In total, 882 images              3     Analysis of Gaze Data in Image Searching
  are randomly selected from 101 object categories. The image set
  is obtained by collecting images through the Google image                   Raw gaze data are preprocessed by finding the fixations with the
  search enginee [Li 2005]. The design and examples of the                    built-in filter provided by Tobii Technology. The filter maps a
  searching task interface is shown in Fig. 1. On the top left is the         series of raw coordinates to a single fixation if the coordinates
  query image. Twenty candidate images are arranged as a 4x5                  stay sufficiently long within a sphere of a given radius. We used
  grid display. All of the images are from 101 categories such as             an interval threshold of 150 ms and a radius of 1 º visual angle.
  landscapes, animals, buildings, human faces, and home ap-                   3.1       Fixation Duration and Fixation Count
  pliances. The red blocks in Fig. 1(a) denotes the locations of
  positive images in Fig. 1(b) (Class No. 22: Pyramid). The others            The main features used in eye tracking related information re-
  are negative images and their image classes are different from              trieval are fixations and saccades [Jacob and Karn.2003]. Two
  each other. That is to say, apart from the query image’s category,          groups of derived metrics stem from the fixation: fixation dura-
  no two images in the grid are from the same category. The can-              tion and fixation count are thoroughly studied to support the
  didate images in one searching stimulus are randomly arranged.              possibility of inferring the relevance of images based on eye
                                                                              movements [Goldberg et al.2002; Gołofit 2008]. Suppose that
                                                                              FDP(m) and FDN(m) are the fixation durations on the positive
Query   Class No   Class No   Class No   Class No   Class No
Image      01         22         22         75         64                     and the negative images observed by subject m, respectively;
        Negative   Positive   Positive   Negative   Negative
                                                                              FCP(m) and FCN(m) are the fixation counts on the positive and
        Class No
           56
                   Class No
                      38
                              Class No
                                 17
                                         Class No
                                           100
                                                    Class No
                                                       12
                                                                              the negative images observed by subject m, respectively; Then
        Negative   Negative   Negative   Negative   Negative
                                                                              in our searching task, FDP(m) and FDN(m) are defined as
        Class No   Class No   Class No   Class No   Class No
           45         22         06         77         91
        Negative   Positive   Negative   Negative   Negative                                              ∑,       ,   FD                          sgn               ,
                                                                                          FDP         =
(a)     Class No   Class No   Class No   Class No   Class No
                                                               (b)                                                      ∑,          ,    sgn           ,                          (1)
           13         69         22         22         28
        Negative   Negative   Positive   Positive   Negative                                          ∑,       ,   FD                          1           sgn               ,
                                                                                         FDN      =
                                                                                                                       ∑,       ,       1     sgn           1,
  Figure 1. Image searching stimulus. (a) the layout of the search-
  ing stimulus with 5 positive images; (b) an example.                        where       0,1, … ,20 denotes the image candidate in each
                                                                              searching stimulus interface;      1,2, … ,21 denotes the stimulus
                                                                              in each searching task (it also represents the numbers of positive
  Such a simulated relevance feedback task asks each participant
                                                                              images in the current stimulus);          1,2 denotes the task set,
  to use his eye to locate the positive image on each stimulus. On
                                                                                    1,2, … ,10 represents the subject and sgn(x) is the signum
  locating the positive image, the participants select the target by
                                                                              function. Consequently, FD           is the fixation duration on the
  fixating on it for a short period of time with the eye. A set of the
                                                                              i-th image candidate of the j-th stimulus of the k-th task from
  task are composed of 21 such stimulus whose positive image
                                                                              subject m, and
  number are varied from 0 to 20. Thus, the set of task contains
  21x21 = 441 images and the total number of the negative images                              1 if subject                  regards cadidate image as positive
  and positive images are equal (210 images each).                                  ,
                                                                                              0 if subject                  regards cadidate image as negative
  2.2              Apparatus and Procedure                                    In the similar manner, FCP(m) and FCN(m) are defined as
  Eye tracking data is collected by the Tobii X120 eye tracker,                                       ∑,       ,       FC                      sgn               ,
  whose accuracy is α 0.5° and drift β 0.3°. Each candidate                              FCP      =
                                                                                                                       ∑,       ,       sgn        ,
  image has a resolution of 300 x 300 pixels and thus an image
                                                                                                     ∑,    ,       FC                          1           sgn           ,       (2)
  stimulus has 1800 x 1200 pixels. Each of stimuli is displayed on                      FCN      =
  the screen with a viewing distance of 600 mm and the screen’s                                                    ∑,       ,           1     sgn           ,
  resolution is 1920x1280 pixels and the pixel pitch is h = 0.264             where FC           is the fixation counts on the i-th image candi-
  mm. Hence the output uncertainty is just R tan α β /h =                     date of the j-th stimulus of the k-th task from subject m. The two
  30 pixels, which has ensured the error of gaze data no larger               pairs of fixation-related variables were monitored and recorded
  than 1% area of each candidate image.



                                                                         38
during the experiment. The average value and standard deviation              ing task. We can see that (1) some of the candidate images are
of ten participants are summarized in Table 1.                               never visited, which indicates the use of pre-attentive vision at
                                                                             the very beginning of the visual search [Salojärvi et al. 2004].
Table 1 Statistics on the fixation duration and fixation count on            During the pre-attentive process, all the candidate images have
                  positive and negative images                               been examined to decide the successive fixation locations; and
Sub.        FDP(m)           FDN(m)          FCP(m)        FCN(m)            (2) in our experiments, revisits happen both on positive images
          1.410±1.081      0.415±0.481       2.5±1.9       1.3±1.3           and negative images. The majority of them have just been vi-
  1
                                                                             sited once, while some of them are revisited during the image
  2       1.332±0.394      0.283±0.247       2.7±1.4       1.2±0.9           searching.
  3       2.582±1.277      0.418±0.430       5.6±3.3       1.7±1.5
  4       0.805±0.414      0.356±0.328       2.4±1.2       1.5±1.2                                 The Number of Visits ‐‐ Histogram
                                                                                 2500                2149
  5       1.154±0.484      0.388±0.284       2.6±1.4       1.5±1.0
                                                                                 2000
  6       1.880±0.926      0.402±0.338       3.0±1.9       1.4±1.0
                                                                                 1500
  7       0.987±0.397      0.166±0.283       1.7±0.8       0.6±0.7                                           878
  8       0.704±0.377      0.358±0.254       2.2±1.1       1.3±0.9               1000
                                                                                             403                     306
  9       1.125±0.674      0.329±0.403       3.0±2.0       1.4±1.5                500                                        119       65    80
 10       1.101±0.444      0.392±0.235       2.7±1.3       1.5±0.8                 0
AVG.      1.308±0.891      0.351±0.345        2.8±2.0       1.3±1.1                      No Visit 1 times 2 times 3 times 4 times 5 times > 6 times

                                                                             Figure 2 The total revisit histogram. The X-axis denotes the
Analysis of variance (ANOVA) tests are performed to find out                 number of re-fixation and Y-axis is the corresponding count
whether there are discriminating visual behaviors between the                (unit: millisecond).
observation of positive and negative images. Given the individu-
al difference in eye movements, we designed two groups of two-                   Table 3 Overall revisits on positive and negative images
way ANOVA among three factors: test subject, fixation duration               A1          1          2        3       4        5         6         >7
and fixation count. The results are shown in Table 2.
                                                                             A2         549        196      88       55      34         13        27
 Table 2 ANOVA test results among three factors: test subject,               A3         329        110       31      10       3         2         1
            fixation duration and fixation count.
                                                                             A4         878        306      119      65      37         15        28
                               GROUP I
    Factor            Levels                    Test result                  A5     63%            64%      74%    85%      92%        87%    100%
(A) Test             10 levels                                               A1 = the number of revisits on an image candidate; A2 = revisit
                                          F(9,9) = 1.26, p < 0.37
Subjects           (10 subjects)                                             counts on positive images; A3 = revisit counts on negative im-
(B) Fixation          2 levels                                               ages; A4 = the total number of revisits; A5 = the percentage of
                                        F(1,9) = 32.84, p < 0.0003
Duration          (FDP & FDN)                                                the total revisits occurring to positive images.
                               GROUP II
    Factor            Levels                    Test result                  To compare with Oyekoya and Setntiford’s work [2006], we
(A) Test             10 levels                                               investigate whether the variance of revisit counts has a different
                                          F(9,9) = 2.03, p < 0.15            effect between positive and negative image candidates over all
Subjects           (10 subjects)
(B) Fixation          2 levels                                               the participants (as shown in Table 3). When revisits counts ≥ 3
                                        F(1,9) = 28.28, p < 0.0005
Count             (FCP & FCN)                                                times, the result of one-way ANOVA is significant with F(1,8)
                                                                             = 5.73, p < 0.044. That is to say, the probability of revisits on a
As illustrated in Table 2, both fixation duration and fixation               positive image is increased with revisits counts. For example,
count revealed significant effects to positive and negative im-              when an image is revisited more than three times, it has a very
ages during simulated relevance feedback tasks. Concretely                   high probability (over 74%) to be a positive image candidate. As
speaking, the fixation durations on each positive image from all             a result, the number of revisit is also a feasible implicit relev-
the subjects (1.30 seconds) are longer than those on negative                ance feedback to drive an image retrieval engine.
image (0.35seconds). Correspondingly, the analysis of fixation
count produces similar results that subjects visit more times on a           4     Feature Extraction and Results
positive image (2.8) than on a negative one (1.3). On the other              The primary focus of this paper is on evaluating the possibility
hand, the variations of different subjects have no significant               of inferring the relevance of images based on eye movement
effects on both groups. (In GROUP I, 0.37 > α = 0.05; in GROUP II,           data. The features such as fixation duration, fixation count and
0.15 > α = 0.05).                                                            the number of revisit have shown discriminating power between
                                                                             positive and negative images. Consequently, we composed a
3.2     Number of Revisits                                                   simple set of 11 features                    , ,…,      , an eye
A revisit is defined as the re-fixation on an AOI previously fix-            movement’s vector to predict the positive images from each
ated. Much human computer interaction and usability research                 returned 4x5 image candidates set in the simulated relevance
shows that re-fixation or revisit on a target may be an indication           feedback task, where        1,2, … ,20 denotes the numbers of
of special interest on the target. Therefore, the analysis of revisit        positive images in the current stimulus;                1,2, … ,10
during the relevance feedback process may reveal the correlation             represents the subject , , … ,       are listed in Table 4, where
between the eye movement pattern and positive image candi-                       1, … ,20 and FL     FD /FC .
dates.                                                                       Table 4 Features used in relevance feedback to predict positive
                                                                                                        images
Figure 2 shows a general status of the overall visit frequency (no.
of revisits = no. of visits - 1) throughout the whole image search-



                                                                        39
NO.       Features                        Description                           color, texture, shape, and spatial information, to human attention,
                                                                                 such as AOIs. As a result, eye tracking data can be a rich and
                         Fixation duration on i-th image inside 4x5 image
             FD
                                      candidate set interface
                                                                                 new source for improving image representation [Lei Wu et al.
                          Fixation Count on i-th image inside 4x5 image
                                                                                 2009]. Our future work is to develop an eye tracking based
             FC                                                                  CBIR system in which human beings’ natural eye movements
                                      candidate set interface
                                          FL     FD /FC                          will be effectively exploited and used in the modules of image
             FL          Fixation Length on i-th image inside 4x5 image          representation, similarity measurement and relevance feedback.
                                      candidate set interface
              R
                          Revisit numbers happened on i-th image inside          Acknowledgments
                                 4x5 image candidate set interface
                                                                                 The work reported in this paper is substantially supported by the
Different from Klami et al.’s work [Klami et al. 2008], we use a                 Research Grants Council of the Hong Kong Special Administra-
decision tree (DT) as a classifier to automatically learn the pre-               tive Region, China (Project code: PolyU 5141/07E) and the
diction rules. The data set mentioned in Section 2 is divided into               PolyU Grant (Project code: 1-BBZ9).
a training and a testing sets to evaluate the prediction accuracy.
Two different methods are used to train the DT, which are illu-                  References
strated in Table 5 (prediction precisions are 87.3% and 93.5%,
respectively), and an example of predicted positive image from                   DACHENG TAO, XIAOOU TANG AND XUELONG LI. 2008. Which
4x5 candidates set is shown in Figure 3.                                         Components are Important for Interactive Image Searching Circuits and
                                                                                 Systems for Video Technology, IEEE Transactions on 18, 3-11. .
 Table 5 Training methods and testing results of decision trees
                                                                                 FLICKNER, M., SAWHNEY, H., NIBLACK, W., ASHLEY, J., HUANG,
              Method                                     I
                                                                                 Q., DOM, B., GORKANI, M., HAFNER, J., LEE, D., PETKOVIC, D.,
         Training Data Set                                    1,2, … 5           STEELE, D. AND YANKER, P. 1995. Query by Image and Video
          Testing Data Set                                    5,6, … 10          Content: The QBIC System. Computer 28, 23-32. .
        Prediction Precision                          87.3%                      GOLDBERG, J.H., STIMSON, M.J., LEWENSTEIN, M., SCOTT, N.
              Method                                     II                      AND WICHANSKY, A.M. 2002. Eye tracking in web search tasks:
                                                                                 design implications. In ETRA '02: Proceedings of the 2002 symposium
         Training Data Set                                   1,3,5 … 19
                                                                                 on Eye tracking research & applications, New Orleans, Louisiana,
          Testing Data Set                                   2,4,6 … 20          Anonymous ACM, New York, NY, USA, 51-58.
        Prediction Precision                          93.5%
                                                                                 GOŁOFIT, K. 2008. Click Passwords Under Investigation. Computer
                                                                                 Security - ESORICS 2007 343-358. .
                                                                                 JACOB, R. AND KARN, K. 2003. Eye Tracking in Human-Computer
                                                                                 Interaction and Usability Research: Ready to Deliver the Promises. In
                                                                                 The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Re-
                                                                                 search, HYONA, RADACH AND DEUBEL, Eds. Elsevier Science,
                                                                                 Oxford, England.
                                                                                 KLAMI, A., SAUNDERS, C., DE CAMPOS, T.E. AND KASKI, S. 2008.
                                                                                 Can relevance of images be inferred from eye movements? In MIR '08:
                                                                                 Proceeding of the 1st ACM international conference on Multimedia
                                                                                 information retrieval, Vancouver, British Columbia, Canada, Anonym-
                                                                                 ous ACM, New York, NY, USA, 134-140.

Figure 3 An example of predicted positive images from 4x5                        LEI WU, YANG HU, MINGJING LI, NENGHAI YU AND XIAN-
candidates set in the simulated relevance feedback task. The                     SHENG HUA. 2009. Scale-Invariant Visual Language Modeling for
                                                                                 Object Categorization. Multimedia, IEEE Transactions on 11, 286-294. .
query image is “hedgehog”, and DT model returned 8 predicted
positive images (in red frames) based on the 11 features vector                  FEIFEI. LI ,Visual recognition: computational models and human psy-
with 100% accuracy.                                                              chophysics, Phd Thesis, California Institute of Technology, 2005.
                                                                                 LIU, D., HUA, K., VU, K. AND YU, N. 2006. Fast Query Point Move-
5      Conclusion and Further Work                                               ment Techniques with Relevance Feedback for Content-Based Image
                                                                                 Retrieval. Advances in Database Technology - EDBT 2006 700-717. .
An eye tracking system can be possibly integrated into a CBIR
system as a more efficient input mechanism for implementing                      OYEKOYA, O. AND STENTIFORD, F. 2004. Exploring Human Eye
the user’s relevance feedback process. In this paper, we mainly                  Behaviour using a Model of Visual Attention. 17th International Confe-
concentrate on a group of fixation- related measurements which                   rence on (ICPR'04) Volume 4, IEEE Computer Society, Washington, DC,
                                                                                 USA, 945-948.
shows static eye movement patterns. In fact, the dynamic cha-
racteristics can also manifest human organizational behavior and                 OYEKOYA, O. AND STENTIFORD, F. 2006. Perceptual Image Retriev-
decision processes, such as saccades and scan path, which reveal                 al Using Eye Movements. Advances in Machine Vision, Image
the pre-attention and cognition process of a human being while                   Processing, and Pattern Analysis 281-289. .
viewing an image. In our further work, we will continue to de-                   SALOJÄRVI, J., PUOLAMÄKI, K. AND KASKI, S. 2004. Relevance
velop a more comprehensive study which includes both the stat-                   feedback from eye movements for proactive information retrieval. In
ic and dynamic features of eye movements. Originally, it is a                    Workshop on Processing Sensory Information for Proactive Systems
unity of human’s conscious and unconscious visual cognition                      (PSIPS 2004, Anonymous , 14-15.
behavior, which can not only be used in relevance feedback, but
also a new source of image representation. Human’s image                         ZHOU, X.S. AND HUANG, T.S. 2003. Relevance feedback in image
viewing automatically bridge the low level features, such as                     retrieval: A comprehensive review. Multimedia Systems 8, 536-544. .




                                                                            40

Mais conteúdo relacionado

Mais procurados

Neural network based numerical digits recognization using nnt in matlab
Neural network based numerical digits recognization using nnt in matlabNeural network based numerical digits recognization using nnt in matlab
Neural network based numerical digits recognization using nnt in matlabijcses
 
MultiModal Identification System in Monozygotic Twins
MultiModal Identification System in Monozygotic TwinsMultiModal Identification System in Monozygotic Twins
MultiModal Identification System in Monozygotic TwinsCSCJournals
 
Deep re-id: 关于行人重识别的深度学习方法
Deep re-id: 关于行人重识别的深度学习方法Deep re-id: 关于行人重识别的深度学习方法
Deep re-id: 关于行人重识别的深度学习方法哲东 郑
 
Multilabel Image Annotation using Multimodal Analysis
Multilabel Image Annotation using Multimodal AnalysisMultilabel Image Annotation using Multimodal Analysis
Multilabel Image Annotation using Multimodal Analysisijtsrd
 
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...IJCSEIT Journal
 
Hardoon Image Ranking With Implicit Feedback From Eye Movements
Hardoon Image Ranking With Implicit Feedback From Eye MovementsHardoon Image Ranking With Implicit Feedback From Eye Movements
Hardoon Image Ranking With Implicit Feedback From Eye MovementsKalle
 
Project report - Bengali digit recongnition using SVM
Project report - Bengali digit recongnition using SVMProject report - Bengali digit recongnition using SVM
Project report - Bengali digit recongnition using SVMMohammad Saiful Islam
 
Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...
Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...
Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...IJECEIAES
 
Rule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognitionRule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognitionRanda Elanwar
 
Deep Neural NEtwork
Deep Neural NEtworkDeep Neural NEtwork
Deep Neural NEtworkVai Jayanthi
 
IRJET- Identification of Missing Person in the Crowd using Pretrained Neu...
IRJET-  	  Identification of Missing Person in the Crowd using Pretrained Neu...IRJET-  	  Identification of Missing Person in the Crowd using Pretrained Neu...
IRJET- Identification of Missing Person in the Crowd using Pretrained Neu...IRJET Journal
 
Optimized Biometric System Based on Combination of Face Images and Log Transf...
Optimized Biometric System Based on Combination of Face Images and Log Transf...Optimized Biometric System Based on Combination of Face Images and Log Transf...
Optimized Biometric System Based on Combination of Face Images and Log Transf...sipij
 
Image Classification using Deep Learning
Image Classification using Deep LearningImage Classification using Deep Learning
Image Classification using Deep Learningijtsrd
 
Development of video-based emotion recognition using deep learning with Googl...
Development of video-based emotion recognition using deep learning with Googl...Development of video-based emotion recognition using deep learning with Googl...
Development of video-based emotion recognition using deep learning with Googl...TELKOMNIKA JOURNAL
 

Mais procurados (19)

Neural network based numerical digits recognization using nnt in matlab
Neural network based numerical digits recognization using nnt in matlabNeural network based numerical digits recognization using nnt in matlab
Neural network based numerical digits recognization using nnt in matlab
 
MultiModal Identification System in Monozygotic Twins
MultiModal Identification System in Monozygotic TwinsMultiModal Identification System in Monozygotic Twins
MultiModal Identification System in Monozygotic Twins
 
237 240
237 240237 240
237 240
 
Deep re-id: 关于行人重识别的深度学习方法
Deep re-id: 关于行人重识别的深度学习方法Deep re-id: 关于行人重识别的深度学习方法
Deep re-id: 关于行人重识别的深度学习方法
 
Multilabel Image Annotation using Multimodal Analysis
Multilabel Image Annotation using Multimodal AnalysisMultilabel Image Annotation using Multimodal Analysis
Multilabel Image Annotation using Multimodal Analysis
 
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...
 
Hardoon Image Ranking With Implicit Feedback From Eye Movements
Hardoon Image Ranking With Implicit Feedback From Eye MovementsHardoon Image Ranking With Implicit Feedback From Eye Movements
Hardoon Image Ranking With Implicit Feedback From Eye Movements
 
Project report - Bengali digit recongnition using SVM
Project report - Bengali digit recongnition using SVMProject report - Bengali digit recongnition using SVM
Project report - Bengali digit recongnition using SVM
 
Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...
Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...
Pattern Approximation Based Generalized Image Noise Reduction Using Adaptive ...
 
Seminar5
Seminar5Seminar5
Seminar5
 
Ijcatr04051013
Ijcatr04051013Ijcatr04051013
Ijcatr04051013
 
Rule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognitionRule based algorithm for handwritten characters recognition
Rule based algorithm for handwritten characters recognition
 
Deep Neural NEtwork
Deep Neural NEtworkDeep Neural NEtwork
Deep Neural NEtwork
 
IRJET- Identification of Missing Person in the Crowd using Pretrained Neu...
IRJET-  	  Identification of Missing Person in the Crowd using Pretrained Neu...IRJET-  	  Identification of Missing Person in the Crowd using Pretrained Neu...
IRJET- Identification of Missing Person in the Crowd using Pretrained Neu...
 
33 102-1-pb
33 102-1-pb33 102-1-pb
33 102-1-pb
 
Optimized Biometric System Based on Combination of Face Images and Log Transf...
Optimized Biometric System Based on Combination of Face Images and Log Transf...Optimized Biometric System Based on Combination of Face Images and Log Transf...
Optimized Biometric System Based on Combination of Face Images and Log Transf...
 
Image Classification using Deep Learning
Image Classification using Deep LearningImage Classification using Deep Learning
Image Classification using Deep Learning
 
[IJET-V1I5P9] Author: Prutha Gandhi, Dhanashri Dalvi, Pallavi Gaikwad, Shubha...
[IJET-V1I5P9] Author: Prutha Gandhi, Dhanashri Dalvi, Pallavi Gaikwad, Shubha...[IJET-V1I5P9] Author: Prutha Gandhi, Dhanashri Dalvi, Pallavi Gaikwad, Shubha...
[IJET-V1I5P9] Author: Prutha Gandhi, Dhanashri Dalvi, Pallavi Gaikwad, Shubha...
 
Development of video-based emotion recognition using deep learning with Googl...
Development of video-based emotion recognition using deep learning with Googl...Development of video-based emotion recognition using deep learning with Googl...
Development of video-based emotion recognition using deep learning with Googl...
 

Destaque

Plataforma anged 2013
Plataforma anged 2013Plataforma anged 2013
Plataforma anged 2013oscargaliza
 
Рекламируйтесь с нами
Рекламируйтесь с намиРекламируйтесь с нами
Рекламируйтесь с намиAeroSvit Airlines
 
Acta asamblea congresual parador baiona
Acta asamblea congresual parador baionaActa asamblea congresual parador baiona
Acta asamblea congresual parador baionaoscargaliza
 
nextNY Online Marketing School Intro to SEO
nextNY Online Marketing School Intro to SEOnextNY Online Marketing School Intro to SEO
nextNY Online Marketing School Intro to SEOnextNY
 
ParaEmpezarSeasonsandWeather
ParaEmpezarSeasonsandWeatherParaEmpezarSeasonsandWeather
ParaEmpezarSeasonsandWeatherSenoraAmandaWhite
 
MobileConf 2013 - Aerogear Android
MobileConf 2013 - Aerogear AndroidMobileConf 2013 - Aerogear Android
MobileConf 2013 - Aerogear AndroidDaniel Passos
 
Clase 2 preparación y administracion de medicamentos y sueros
Clase 2 preparación y administracion de medicamentos y suerosClase 2 preparación y administracion de medicamentos y sueros
Clase 2 preparación y administracion de medicamentos y suerosMANUEL RIVERA
 
שיעור שתיים מעבד התמלילים
שיעור שתיים   מעבד התמליליםשיעור שתיים   מעבד התמלילים
שיעור שתיים מעבד התמליליםhaimkarel
 
Acta mediterranea de catering pontevedra
Acta mediterranea de catering pontevedraActa mediterranea de catering pontevedra
Acta mediterranea de catering pontevedraoscargaliza
 
Inspeccion ourense
Inspeccion ourenseInspeccion ourense
Inspeccion ourenseoscargaliza
 
SelfRJ - Aerogear iOS
SelfRJ - Aerogear iOSSelfRJ - Aerogear iOS
SelfRJ - Aerogear iOSDaniel Passos
 
Manual de preparación y administración de medicamentos inyectables utilizados...
Manual de preparación y administración de medicamentos inyectables utilizados...Manual de preparación y administración de medicamentos inyectables utilizados...
Manual de preparación y administración de medicamentos inyectables utilizados...MANUEL RIVERA
 

Destaque (20)

Plataforma anged 2013
Plataforma anged 2013Plataforma anged 2013
Plataforma anged 2013
 
Social Networking Security Workshop
Social Networking Security WorkshopSocial Networking Security Workshop
Social Networking Security Workshop
 
Tactical Assassins
Tactical AssassinsTactical Assassins
Tactical Assassins
 
Power point
Power pointPower point
Power point
 
Рекламируйтесь с нами
Рекламируйтесь с намиРекламируйтесь с нами
Рекламируйтесь с нами
 
Acta asamblea congresual parador baiona
Acta asamblea congresual parador baionaActa asamblea congresual parador baiona
Acta asamblea congresual parador baiona
 
Outlook Express
Outlook ExpressOutlook Express
Outlook Express
 
nextNY Online Marketing School Intro to SEO
nextNY Online Marketing School Intro to SEOnextNY Online Marketing School Intro to SEO
nextNY Online Marketing School Intro to SEO
 
Vpn
VpnVpn
Vpn
 
TEMA2AVocabulary
TEMA2AVocabularyTEMA2AVocabulary
TEMA2AVocabulary
 
Statby school 2553_m3_1057012007
Statby school 2553_m3_1057012007Statby school 2553_m3_1057012007
Statby school 2553_m3_1057012007
 
ParaEmpezarSeasonsandWeather
ParaEmpezarSeasonsandWeatherParaEmpezarSeasonsandWeather
ParaEmpezarSeasonsandWeather
 
MobileConf 2013 - Aerogear Android
MobileConf 2013 - Aerogear AndroidMobileConf 2013 - Aerogear Android
MobileConf 2013 - Aerogear Android
 
Clase 2 preparación y administracion de medicamentos y sueros
Clase 2 preparación y administracion de medicamentos y suerosClase 2 preparación y administracion de medicamentos y sueros
Clase 2 preparación y administracion de medicamentos y sueros
 
שיעור שתיים מעבד התמלילים
שיעור שתיים   מעבד התמליליםשיעור שתיים   מעבד התמלילים
שיעור שתיים מעבד התמלילים
 
Truman
TrumanTruman
Truman
 
Acta mediterranea de catering pontevedra
Acta mediterranea de catering pontevedraActa mediterranea de catering pontevedra
Acta mediterranea de catering pontevedra
 
Inspeccion ourense
Inspeccion ourenseInspeccion ourense
Inspeccion ourense
 
SelfRJ - Aerogear iOS
SelfRJ - Aerogear iOSSelfRJ - Aerogear iOS
SelfRJ - Aerogear iOS
 
Manual de preparación y administración de medicamentos inyectables utilizados...
Manual de preparación y administración de medicamentos inyectables utilizados...Manual de preparación y administración de medicamentos inyectables utilizados...
Manual de preparación y administración de medicamentos inyectables utilizados...
 

Semelhante a Eye Tracking Predicts Image Relevance in CBIR Systems

Liang Content Based Image Retrieval Using A Combination Of Visual Features An...
Liang Content Based Image Retrieval Using A Combination Of Visual Features An...Liang Content Based Image Retrieval Using A Combination Of Visual Features An...
Liang Content Based Image Retrieval Using A Combination Of Visual Features An...Kalle
 
Modelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object RecognitionModelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object RecognitionIJERA Editor
 
Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...
Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...
Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...Kalle
 
Paper id 25201471
Paper id 25201471Paper id 25201471
Paper id 25201471IJRAT
 
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVALMETA-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVALIJCSEIT Journal
 
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...Kalle
 
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural TasksRyan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural TasksKalle
 
Design and Development of an Algorithm for Image Clustering In Textile Image ...
Design and Development of an Algorithm for Image Clustering In Textile Image ...Design and Development of an Algorithm for Image Clustering In Textile Image ...
Design and Development of an Algorithm for Image Clustering In Textile Image ...IJCSEA Journal
 
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-IdentificationMulti-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identificationgerogepatton
 
AUTOMATION OF ATTENDANCE USING DEEP LEARNING
AUTOMATION OF ATTENDANCE USING DEEP LEARNINGAUTOMATION OF ATTENDANCE USING DEEP LEARNING
AUTOMATION OF ATTENDANCE USING DEEP LEARNINGIRJET Journal
 
An Impact on Content Based Image Retrival A Perspective View
An Impact on Content Based Image Retrival A Perspective ViewAn Impact on Content Based Image Retrival A Perspective View
An Impact on Content Based Image Retrival A Perspective Viewijtsrd
 
Age and gender detection using deep learning
Age and gender detection using deep learningAge and gender detection using deep learning
Age and gender detection using deep learningIRJET Journal
 
10.1.1.432.9149.pdf
10.1.1.432.9149.pdf10.1.1.432.9149.pdf
10.1.1.432.9149.pdfmoemi1
 
https://uii.io/0hIB
https://uii.io/0hIBhttps://uii.io/0hIB
https://uii.io/0hIBmoemi1
 
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...AnuragVijayAgrawal
 
Survey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNNSurvey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNNIRJET Journal
 
Content based image retrieval project
Content based image retrieval projectContent based image retrieval project
Content based image retrieval projectaliaKhan71
 

Semelhante a Eye Tracking Predicts Image Relevance in CBIR Systems (20)

Liang Content Based Image Retrieval Using A Combination Of Visual Features An...
Liang Content Based Image Retrieval Using A Combination Of Visual Features An...Liang Content Based Image Retrieval Using A Combination Of Visual Features An...
Liang Content Based Image Retrieval Using A Combination Of Visual Features An...
 
Modelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object RecognitionModelling Framework of a Neural Object Recognition
Modelling Framework of a Neural Object Recognition
 
Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...
Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...
Faro Visual Attention For Implicit Relevance Feedback In A Content Based Imag...
 
Paper id 25201471
Paper id 25201471Paper id 25201471
Paper id 25201471
 
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVALMETA-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL
 
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks...
 
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural TasksRyan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
 
Sub1547
Sub1547Sub1547
Sub1547
 
Design and Development of an Algorithm for Image Clustering In Textile Image ...
Design and Development of an Algorithm for Image Clustering In Textile Image ...Design and Development of an Algorithm for Image Clustering In Textile Image ...
Design and Development of an Algorithm for Image Clustering In Textile Image ...
 
Ts2 c topic
Ts2 c topicTs2 c topic
Ts2 c topic
 
Ts2 c topic (1)
Ts2 c topic (1)Ts2 c topic (1)
Ts2 c topic (1)
 
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-IdentificationMulti-Level Feature Fusion Based Transfer Learning for Person Re-Identification
Multi-Level Feature Fusion Based Transfer Learning for Person Re-Identification
 
AUTOMATION OF ATTENDANCE USING DEEP LEARNING
AUTOMATION OF ATTENDANCE USING DEEP LEARNINGAUTOMATION OF ATTENDANCE USING DEEP LEARNING
AUTOMATION OF ATTENDANCE USING DEEP LEARNING
 
An Impact on Content Based Image Retrival A Perspective View
An Impact on Content Based Image Retrival A Perspective ViewAn Impact on Content Based Image Retrival A Perspective View
An Impact on Content Based Image Retrival A Perspective View
 
Age and gender detection using deep learning
Age and gender detection using deep learningAge and gender detection using deep learning
Age and gender detection using deep learning
 
10.1.1.432.9149.pdf
10.1.1.432.9149.pdf10.1.1.432.9149.pdf
10.1.1.432.9149.pdf
 
https://uii.io/0hIB
https://uii.io/0hIBhttps://uii.io/0hIB
https://uii.io/0hIB
 
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
A Comprehensive Analysis on Co-Saliency Detection on Learning Approaches in 3...
 
Survey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNNSurvey on Human Behavior Recognition using CNN
Survey on Human Behavior Recognition using CNN
 
Content based image retrieval project
Content based image retrieval projectContent based image retrieval project
Content based image retrieval project
 

Mais de Kalle

Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of HeatmapsBlignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of HeatmapsKalle
 
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...Kalle
 
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...Kalle
 
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...Kalle
 
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze ControlUrbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze ControlKalle
 
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...Kalle
 
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic TrainingTien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic TrainingKalle
 
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...Kalle
 
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser OphthalmoscopeStevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser OphthalmoscopeKalle
 
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...Kalle
 
Skovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze AloneSkovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze AloneKalle
 
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze TrackerSan Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze TrackerKalle
 
Rosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem SolvingRosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem SolvingKalle
 
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual SearchQvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual SearchKalle
 
Prats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement StudyPrats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement StudyKalle
 
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...Kalle
 
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Kalle
 
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...Kalle
 
Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...
Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...
Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...Kalle
 
Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...
Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...
Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...Kalle
 

Mais de Kalle (20)

Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of HeatmapsBlignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
 
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
 
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
 
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
 
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze ControlUrbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
 
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
 
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic TrainingTien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
 
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
 
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser OphthalmoscopeStevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
 
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
 
Skovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze AloneSkovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze Alone
 
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze TrackerSan Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
 
Rosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem SolvingRosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem Solving
 
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual SearchQvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
 
Prats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement StudyPrats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement Study
 
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
 
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
 
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
 
Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...
Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...
Palinko Estimating Cognitive Load Using Remote Eye Tracking In A Driving Simu...
 
Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...
Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...
Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizont...
 

Eye Tracking Predicts Image Relevance in CBIR Systems

  • 1. Eye Movement as an Interaction Mechanism for Relevance Feed- back in a Content-Based Image Retrieval System Yun Zhang*1,2 ¶ † ‡ Hong Fu 2, Zhen Liang 2, Zheru Chi 2 § Dagan Feng 2,3, 1 3 School of Computer Science 2 Centre for Multimedia Signal School of Information Technologies Northwestern Polytechnical Uni- Processing Department of Electronic The University of Sydney versity, Xi’an, Shaanxi, China and Information Engineering Sydney, Australia The Hong Kong Polytechnic University Hong Kong, China ver, the subjective nature of human annotation adds another Abstract dimension of difficulty in managing image database. Relevance feedback (RF) mechanisms are widely adopted in CBIR is an alternative solution to retrieve images. However, Content-Based Image Retrieval (CBIR) systems to improve after years of rapid growth since 1990s [Flickner et al.1995], the image retrieval performance. However, there exist some intrinsic gaps between low level features and semantic contents of images problems: (1) the semantic gap between high-level concepts and holds back the progress and has entered a plateau phase. Such low-level features and (2) the subjectivity of human perception gaps can be concretely outlined into three aspects: (1) image of visual contents. The primary focus of this paper is to evaluate representation (2) similarity measure (3) user’s interaction. Most the possibility of inferring the relevance of images based on eye of the image representations are based on intuitiveness of the movement data. In total, 882 images from 101 categories are researchers and the fulfillment of mathematics, instead of hu- viewed by 10 subjects to test the usefulness of implicit RF, man’s eye behavior. Do the features extracted reflect humans’ where the relevance of each image is known beforehand. A set understanding of the image’s content? There is no clear answer of measures based on fixations are thoroughly evaluated which to this question. Similarity measure is highly dependent on the include fixation duration, fixation count, and the number of revi- features and structures used in image representation. Moreover, sits. Finally, the paper proposes a decision tree to predict the developing better distance descriptors and refining similarity user’s input during the image searching tasks. The prediction measures are also very challenging. User interaction can be a precision of the decision tree is over 87%, which spreads light feasible approach to answer the question and to improve the on a promising integration of natural eye movement into CBIR image retrieval performance. In the Relevance Feedback (RF) systems in the future. process, the user is asked to refine the searching by providing CR Categories: H.3.3 [Information Storage and Retrieval]: explicit RF, such as selecting Areas-of-Interest (AOIs) from the Information Search and Retrieval—Relevance feedback, Search query image, or to tick positive and negative samples from re- Process; H.5.2 [Information Interfaces and Representation]: trieves. In the past few years, many articles reported that RF can User interfaces help to establish the association between the low-level features Keywords: Eye Tracking, Relevance Feedback (RF), Content- and the semantics of images and to improve the retrieval per- Based Image Retrieval (CBIR), Visual Perception formance [Liu et al.2006; Dacheng Tao et al.2008]. 1 Introduction However, the explicit feedback is laborious for the user and limited in complexity. In this paper, we propose an eye move- Numerous digital images are being produced everyday from ment based implicit feedback as a rich and natural source to digital cameras, medical devices, security monitors, and other replace the time-consuming and expensive explicit feedback. As image capturing apparatus. It has become more and more diffi- far as we know, there are just a few preliminary studies on im- cult to retrieve a desired picture even from a photo album on a plementing some general eye movement features in image re- home computer because of the exponential increase in the num- trieval. One is from Oyekoya and Setntiford’s work [Oyekoya ber of images. Most traditional and common methods of im- and Stentiford.2004; Oyekoya and Stentiford.2006]. They made age retrieval based on metadata, such as textual annotations or an investigation into the fixation duration and found that they user-specified tags, have become the industry standard for re- are different on images with/without a clear AOI. The other trieval from large image collections. However, manual image work was reported by Klami et al. [Klami et al. 2008]. They annotation is time-consuming, laborious and expensive. Moreo- proposed nine-feature vectors from different forms of fixations *email: tvsunny@gmail.comemail: † and saccades and used a classifier to predict one relevant image email:zhenliang@eie.polyu.edu.hk from four candidates. ‡ email:enhongfu@inet.polyu.edu.hk ‖ email:enzheru@inet.polyu.edu.hk Different from the previous work, the study reported in this pa- § email: feng@it.usyd.edu.au per attempts to simulate a more real and complex image retrieval situation and to quantitatively analyze the correlation between users’ eye behavior and target images (positive images). In our Copyright © 2010 by the Association for Computing Machinery, Inc. experiments, the images come from a wide variety of web Permission to make digital or hard copies of part or all of this work for personal or sources, and in each task, the query image and the numbers of classroom use is granted without fee provided that copies are not made or distributed positive images are varied from time to time. We evaluated the for commercial advantage and that copies bear this notice and the full citation on the significance of fixation durations, fixation counts, and the num- first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on ber of revisits to provide a systematic interoperation of the us- servers, or to redistribute to lists, requires prior specific permission and/or a fee. er’s attention and effort allocation in eye movements, laying a Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org. ETRA 2010, Austin, TX, March 22 – 24, 2010. © 2010 ACM 978-1-60558-994-7/10/0003 $10.00 37
  • 2. concrete and substantial foundation to involve natural eye Ten participants took part in the study, four females and six movement as a robust RF source [Zhou and Huang. 2003]. males in an age range from 20 to 32 all with an academic back- ground. All of them are proficient computer users, and half of The rest of the paper is organized as follows. Section 2 introduc- them have had experience of using an eye tracking system. Their es experimental design and setting for relevance feedback tasks visions are either normal or correct-to-normal. The participants and the corresponding eye movement data collecting. In Section were asked to complete two sets of the above mentioned image 3, we report our thorough investigation on using fixation dura- searching tasks and the gaze data are recorded with a 60 Hz tion, fixation count and the numbers of revisits for the prediction sampling rate. Afterwards the participants were asked to indicate of relevant images. These factors are performed with the ANO- which images they have chosen as positive images to ensure the VA test to reveal their significances and interconnections. Sec- accuracy of a further analysis on their eye movement data. The tion 4 proposes a decision tree model to predict the user’s input eye tracker is non-intrusive and allows a 300x220x300 mm free during the images searching tasks. Finally, we conclude the head movement space. Different candidate images and the loca- results and propose the future work. tions of positive images are ensured in and between each set of the task. In other words, no two images are the same and no two 2 Design of Experiments stimuli have the same positive image locations. This is to reduce the memory effects and to simulate the natural relevance feed- 2.1 Task Setup back situation. We study an image searching task which reflects kinds of activi- ties occurring in a complete CBIR system. In total, 882 images 3 Analysis of Gaze Data in Image Searching are randomly selected from 101 object categories. The image set is obtained by collecting images through the Google image Raw gaze data are preprocessed by finding the fixations with the search enginee [Li 2005]. The design and examples of the built-in filter provided by Tobii Technology. The filter maps a searching task interface is shown in Fig. 1. On the top left is the series of raw coordinates to a single fixation if the coordinates query image. Twenty candidate images are arranged as a 4x5 stay sufficiently long within a sphere of a given radius. We used grid display. All of the images are from 101 categories such as an interval threshold of 150 ms and a radius of 1 º visual angle. landscapes, animals, buildings, human faces, and home ap- 3.1 Fixation Duration and Fixation Count pliances. The red blocks in Fig. 1(a) denotes the locations of positive images in Fig. 1(b) (Class No. 22: Pyramid). The others The main features used in eye tracking related information re- are negative images and their image classes are different from trieval are fixations and saccades [Jacob and Karn.2003]. Two each other. That is to say, apart from the query image’s category, groups of derived metrics stem from the fixation: fixation dura- no two images in the grid are from the same category. The can- tion and fixation count are thoroughly studied to support the didate images in one searching stimulus are randomly arranged. possibility of inferring the relevance of images based on eye movements [Goldberg et al.2002; Gołofit 2008]. Suppose that FDP(m) and FDN(m) are the fixation durations on the positive Query Class No Class No Class No Class No Class No Image 01 22 22 75 64 and the negative images observed by subject m, respectively; Negative Positive Positive Negative Negative FCP(m) and FCN(m) are the fixation counts on the positive and Class No 56 Class No 38 Class No 17 Class No 100 Class No 12 the negative images observed by subject m, respectively; Then Negative Negative Negative Negative Negative in our searching task, FDP(m) and FDN(m) are defined as Class No Class No Class No Class No Class No 45 22 06 77 91 Negative Positive Negative Negative Negative ∑, , FD sgn , FDP = (a) Class No Class No Class No Class No Class No (b) ∑, , sgn , (1) 13 69 22 22 28 Negative Negative Positive Positive Negative ∑, , FD 1 sgn , FDN = ∑, , 1 sgn 1, Figure 1. Image searching stimulus. (a) the layout of the search- ing stimulus with 5 positive images; (b) an example. where 0,1, … ,20 denotes the image candidate in each searching stimulus interface; 1,2, … ,21 denotes the stimulus in each searching task (it also represents the numbers of positive Such a simulated relevance feedback task asks each participant images in the current stimulus); 1,2 denotes the task set, to use his eye to locate the positive image on each stimulus. On 1,2, … ,10 represents the subject and sgn(x) is the signum locating the positive image, the participants select the target by function. Consequently, FD is the fixation duration on the fixating on it for a short period of time with the eye. A set of the i-th image candidate of the j-th stimulus of the k-th task from task are composed of 21 such stimulus whose positive image subject m, and number are varied from 0 to 20. Thus, the set of task contains 21x21 = 441 images and the total number of the negative images 1 if subject regards cadidate image as positive and positive images are equal (210 images each). , 0 if subject regards cadidate image as negative 2.2 Apparatus and Procedure In the similar manner, FCP(m) and FCN(m) are defined as Eye tracking data is collected by the Tobii X120 eye tracker, ∑, , FC sgn , whose accuracy is α 0.5° and drift β 0.3°. Each candidate FCP = ∑, , sgn , image has a resolution of 300 x 300 pixels and thus an image ∑, , FC 1 sgn , (2) stimulus has 1800 x 1200 pixels. Each of stimuli is displayed on FCN = the screen with a viewing distance of 600 mm and the screen’s ∑, , 1 sgn , resolution is 1920x1280 pixels and the pixel pitch is h = 0.264 where FC is the fixation counts on the i-th image candi- mm. Hence the output uncertainty is just R tan α β /h = date of the j-th stimulus of the k-th task from subject m. The two 30 pixels, which has ensured the error of gaze data no larger pairs of fixation-related variables were monitored and recorded than 1% area of each candidate image. 38
  • 3. during the experiment. The average value and standard deviation ing task. We can see that (1) some of the candidate images are of ten participants are summarized in Table 1. never visited, which indicates the use of pre-attentive vision at the very beginning of the visual search [Salojärvi et al. 2004]. Table 1 Statistics on the fixation duration and fixation count on During the pre-attentive process, all the candidate images have positive and negative images been examined to decide the successive fixation locations; and Sub. FDP(m) FDN(m) FCP(m) FCN(m) (2) in our experiments, revisits happen both on positive images 1.410±1.081 0.415±0.481 2.5±1.9 1.3±1.3 and negative images. The majority of them have just been vi- 1 sited once, while some of them are revisited during the image 2 1.332±0.394 0.283±0.247 2.7±1.4 1.2±0.9 searching. 3 2.582±1.277 0.418±0.430 5.6±3.3 1.7±1.5 4 0.805±0.414 0.356±0.328 2.4±1.2 1.5±1.2 The Number of Visits ‐‐ Histogram 2500 2149 5 1.154±0.484 0.388±0.284 2.6±1.4 1.5±1.0 2000 6 1.880±0.926 0.402±0.338 3.0±1.9 1.4±1.0 1500 7 0.987±0.397 0.166±0.283 1.7±0.8 0.6±0.7 878 8 0.704±0.377 0.358±0.254 2.2±1.1 1.3±0.9 1000 403 306 9 1.125±0.674 0.329±0.403 3.0±2.0 1.4±1.5 500 119 65 80 10 1.101±0.444 0.392±0.235 2.7±1.3 1.5±0.8 0 AVG. 1.308±0.891 0.351±0.345 2.8±2.0 1.3±1.1 No Visit 1 times 2 times 3 times 4 times 5 times > 6 times Figure 2 The total revisit histogram. The X-axis denotes the Analysis of variance (ANOVA) tests are performed to find out number of re-fixation and Y-axis is the corresponding count whether there are discriminating visual behaviors between the (unit: millisecond). observation of positive and negative images. Given the individu- al difference in eye movements, we designed two groups of two- Table 3 Overall revisits on positive and negative images way ANOVA among three factors: test subject, fixation duration A1 1 2 3 4 5 6 >7 and fixation count. The results are shown in Table 2. A2 549 196 88 55 34 13 27 Table 2 ANOVA test results among three factors: test subject, A3 329 110 31 10 3 2 1 fixation duration and fixation count. A4 878 306 119 65 37 15 28 GROUP I Factor Levels Test result A5 63% 64% 74% 85% 92% 87% 100% (A) Test 10 levels A1 = the number of revisits on an image candidate; A2 = revisit F(9,9) = 1.26, p < 0.37 Subjects (10 subjects) counts on positive images; A3 = revisit counts on negative im- (B) Fixation 2 levels ages; A4 = the total number of revisits; A5 = the percentage of F(1,9) = 32.84, p < 0.0003 Duration (FDP & FDN) the total revisits occurring to positive images. GROUP II Factor Levels Test result To compare with Oyekoya and Setntiford’s work [2006], we (A) Test 10 levels investigate whether the variance of revisit counts has a different F(9,9) = 2.03, p < 0.15 effect between positive and negative image candidates over all Subjects (10 subjects) (B) Fixation 2 levels the participants (as shown in Table 3). When revisits counts ≥ 3 F(1,9) = 28.28, p < 0.0005 Count (FCP & FCN) times, the result of one-way ANOVA is significant with F(1,8) = 5.73, p < 0.044. That is to say, the probability of revisits on a As illustrated in Table 2, both fixation duration and fixation positive image is increased with revisits counts. For example, count revealed significant effects to positive and negative im- when an image is revisited more than three times, it has a very ages during simulated relevance feedback tasks. Concretely high probability (over 74%) to be a positive image candidate. As speaking, the fixation durations on each positive image from all a result, the number of revisit is also a feasible implicit relev- the subjects (1.30 seconds) are longer than those on negative ance feedback to drive an image retrieval engine. image (0.35seconds). Correspondingly, the analysis of fixation count produces similar results that subjects visit more times on a 4 Feature Extraction and Results positive image (2.8) than on a negative one (1.3). On the other The primary focus of this paper is on evaluating the possibility hand, the variations of different subjects have no significant of inferring the relevance of images based on eye movement effects on both groups. (In GROUP I, 0.37 > α = 0.05; in GROUP II, data. The features such as fixation duration, fixation count and 0.15 > α = 0.05). the number of revisit have shown discriminating power between positive and negative images. Consequently, we composed a 3.2 Number of Revisits simple set of 11 features , ,…, , an eye A revisit is defined as the re-fixation on an AOI previously fix- movement’s vector to predict the positive images from each ated. Much human computer interaction and usability research returned 4x5 image candidates set in the simulated relevance shows that re-fixation or revisit on a target may be an indication feedback task, where 1,2, … ,20 denotes the numbers of of special interest on the target. Therefore, the analysis of revisit positive images in the current stimulus; 1,2, … ,10 during the relevance feedback process may reveal the correlation represents the subject , , … , are listed in Table 4, where between the eye movement pattern and positive image candi- 1, … ,20 and FL FD /FC . dates. Table 4 Features used in relevance feedback to predict positive images Figure 2 shows a general status of the overall visit frequency (no. of revisits = no. of visits - 1) throughout the whole image search- 39
  • 4. NO. Features Description color, texture, shape, and spatial information, to human attention, such as AOIs. As a result, eye tracking data can be a rich and Fixation duration on i-th image inside 4x5 image FD candidate set interface new source for improving image representation [Lei Wu et al. Fixation Count on i-th image inside 4x5 image 2009]. Our future work is to develop an eye tracking based FC CBIR system in which human beings’ natural eye movements candidate set interface FL FD /FC will be effectively exploited and used in the modules of image FL Fixation Length on i-th image inside 4x5 image representation, similarity measurement and relevance feedback. candidate set interface R Revisit numbers happened on i-th image inside Acknowledgments 4x5 image candidate set interface The work reported in this paper is substantially supported by the Different from Klami et al.’s work [Klami et al. 2008], we use a Research Grants Council of the Hong Kong Special Administra- decision tree (DT) as a classifier to automatically learn the pre- tive Region, China (Project code: PolyU 5141/07E) and the diction rules. The data set mentioned in Section 2 is divided into PolyU Grant (Project code: 1-BBZ9). a training and a testing sets to evaluate the prediction accuracy. Two different methods are used to train the DT, which are illu- References strated in Table 5 (prediction precisions are 87.3% and 93.5%, respectively), and an example of predicted positive image from DACHENG TAO, XIAOOU TANG AND XUELONG LI. 2008. Which 4x5 candidates set is shown in Figure 3. Components are Important for Interactive Image Searching Circuits and Systems for Video Technology, IEEE Transactions on 18, 3-11. . Table 5 Training methods and testing results of decision trees FLICKNER, M., SAWHNEY, H., NIBLACK, W., ASHLEY, J., HUANG, Method I Q., DOM, B., GORKANI, M., HAFNER, J., LEE, D., PETKOVIC, D., Training Data Set 1,2, … 5 STEELE, D. AND YANKER, P. 1995. Query by Image and Video Testing Data Set 5,6, … 10 Content: The QBIC System. Computer 28, 23-32. . Prediction Precision 87.3% GOLDBERG, J.H., STIMSON, M.J., LEWENSTEIN, M., SCOTT, N. Method II AND WICHANSKY, A.M. 2002. Eye tracking in web search tasks: design implications. In ETRA '02: Proceedings of the 2002 symposium Training Data Set 1,3,5 … 19 on Eye tracking research & applications, New Orleans, Louisiana, Testing Data Set 2,4,6 … 20 Anonymous ACM, New York, NY, USA, 51-58. Prediction Precision 93.5% GOŁOFIT, K. 2008. Click Passwords Under Investigation. Computer Security - ESORICS 2007 343-358. . JACOB, R. AND KARN, K. 2003. Eye Tracking in Human-Computer Interaction and Usability Research: Ready to Deliver the Promises. In The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Re- search, HYONA, RADACH AND DEUBEL, Eds. Elsevier Science, Oxford, England. KLAMI, A., SAUNDERS, C., DE CAMPOS, T.E. AND KASKI, S. 2008. Can relevance of images be inferred from eye movements? In MIR '08: Proceeding of the 1st ACM international conference on Multimedia information retrieval, Vancouver, British Columbia, Canada, Anonym- ous ACM, New York, NY, USA, 134-140. Figure 3 An example of predicted positive images from 4x5 LEI WU, YANG HU, MINGJING LI, NENGHAI YU AND XIAN- candidates set in the simulated relevance feedback task. The SHENG HUA. 2009. Scale-Invariant Visual Language Modeling for Object Categorization. Multimedia, IEEE Transactions on 11, 286-294. . query image is “hedgehog”, and DT model returned 8 predicted positive images (in red frames) based on the 11 features vector FEIFEI. LI ,Visual recognition: computational models and human psy- with 100% accuracy. chophysics, Phd Thesis, California Institute of Technology, 2005. LIU, D., HUA, K., VU, K. AND YU, N. 2006. Fast Query Point Move- 5 Conclusion and Further Work ment Techniques with Relevance Feedback for Content-Based Image Retrieval. Advances in Database Technology - EDBT 2006 700-717. . An eye tracking system can be possibly integrated into a CBIR system as a more efficient input mechanism for implementing OYEKOYA, O. AND STENTIFORD, F. 2004. Exploring Human Eye the user’s relevance feedback process. In this paper, we mainly Behaviour using a Model of Visual Attention. 17th International Confe- concentrate on a group of fixation- related measurements which rence on (ICPR'04) Volume 4, IEEE Computer Society, Washington, DC, USA, 945-948. shows static eye movement patterns. In fact, the dynamic cha- racteristics can also manifest human organizational behavior and OYEKOYA, O. AND STENTIFORD, F. 2006. Perceptual Image Retriev- decision processes, such as saccades and scan path, which reveal al Using Eye Movements. Advances in Machine Vision, Image the pre-attention and cognition process of a human being while Processing, and Pattern Analysis 281-289. . viewing an image. In our further work, we will continue to de- SALOJÄRVI, J., PUOLAMÄKI, K. AND KASKI, S. 2004. Relevance velop a more comprehensive study which includes both the stat- feedback from eye movements for proactive information retrieval. In ic and dynamic features of eye movements. Originally, it is a Workshop on Processing Sensory Information for Proactive Systems unity of human’s conscious and unconscious visual cognition (PSIPS 2004, Anonymous , 14-15. behavior, which can not only be used in relevance feedback, but also a new source of image representation. Human’s image ZHOU, X.S. AND HUANG, T.S. 2003. Relevance feedback in image viewing automatically bridge the low level features, such as retrieval: A comprehensive review. Multimedia Systems 8, 536-544. . 40