Mobile Visual Search	


       Oge Marques	

 Florida Atlantic University	


 Universitat Politècnica de Catalunya 
              Barcelona 	

             2 Mar 2012
Take-home message	



Mobile Visual Search (MVS) is a fascinating research
field with many open challenges and opportunities
which have the potential to impact the way we
organize, annotate, and retrieve visual data (images
and videos) using mobile devices.	





Outline	

•  This talk is structured in four parts:	


   1.  Opportunities	


   2.  Basic concepts	


   3.  Technical aspects	


   4.  Examples and applications	


Part I	


Opportunities
Mobile visual search: driving factors	

  •  Age of mobile computing	





http://60secondmarketer.com/blog/2011/10/18/more-mobile-phones-than-toothbrushes/
Mobile visual search: driving factors	

  •  Why do I need a camera? I have a smartphone…
     
         	

  
  (22 Dec 2011) 	





http://www.cellular-news.com/story/52382.php
Mobile visual search: driving factors	

  •  Powerful devices	





           –  1 GHz ARM Cortex-A9 processor, PowerVR SGX543MP2, Apple A5 chipset

http://www.apple.com/iphone/specs.html
http://www.gsmarena.com/apple_iphone_4s-4212.php
Mobile visual search: driving factors	

  •  Powerful devices	





http://europe.nokia.com/PRODUCT_METADATA_0/Products/Phones/8000-series/808/Nokia808PureView_Whitepaper.pdf
http://www.nokia.com/fr-fr/produits/mobiles/808/
Mobile visual search: driving factors	

  •  Social networks and mobile devices (May 2011)

http://jess3.com/geosocial-universe-2/
Mobile visual search: driving factors	

  •  Social networks and mobile devices	

           –  Motivated users: image taking and image sharing are
              huge!	





           	



http://www.onlinemarketing-trends.com/2011/03/facebook-photo-statistics-and-insights.html
Mobile visual search: driving factors	

  •  Instagram: 	

           –  15 million registered users (in 13 months)	

           –  7 employees	

           –  A (growing) ecosystem based on it!

                    •  Search 	

                    •  Send postcards	

                    •  Manage your photos	

                    •  Build a poster	

                    •  etc.	

  	

http://thenextweb.com/apps/2011/12/07/instagram-hits-15m-users-and-has-2-people-working-on-an-android-app-right-now/
http://www.nuwomb.com/instagram/
Mobile visual search: driving factors	

  •  Legitimate (or not quite…) needs and use cases	





http://www.slideshare.net/dtunkelang/search-by-sight-google-goggles
https://twitter.com/#!/courtanee/status/14704916575
Mobile visual search: driving factors

  •  A natural use case for content-based image retrieval (CBIR) with query by example (QBE), at last!

           –  The example is right in front of the user!

[FIG1] A snapshot of an outdoor mobile visual search system being used. The system augments the viewfinder with information about the objects it recognizes in the image taken with a camera phone.

Girod et al., IEEE Multimedia, 2011
Part II	


Basic concepts
MVS: technical challenges	

•  How to ensure low latency (and interactive
   queries) under constraints such as:	

  –  Network bandwidth	

  –  Computational power 	

  –  Battery consumption	

•  How to achieve robust visual recognition in spite
   of low-resolution cameras, varying lighting
   conditions, etc.	

•  How to handle broad and narrow domains	
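
The bandwidth constraint can be made concrete with a little arithmetic. The sketch below estimates how long a query takes to upload for two payload sizes. The uplink rate and both payload sizes are hypothetical values chosen for illustration, not measurements from the talk, and round-trip delay and protocol overhead are ignored.

```python
def upload_seconds(payload_kbytes: float, uplink_kbps: float) -> float:
    """Time to push a payload through a link of the given rate.

    Ignores round-trip time and protocol overhead (a simplifying assumption).
    """
    return payload_kbytes * 8.0 / uplink_kbps  # kilobytes -> kilobits, then divide by rate


# Hypothetical numbers, used only to show the trade-off.
UPLINK_KBPS = 400.0
payloads = [("JPEG image", 50.0), ("compact descriptors", 5.0)]

for label, size_kbytes in payloads:
    print(f"{label}: {upload_seconds(size_kbytes, UPLINK_KBPS):.2f} s")
```

Under these assumed numbers, a tenfold smaller query payload cuts the upload time tenfold, which is why sending compact features instead of the image matters on slow links.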


MVS: Pipeline for image retrieval	





Girod et al., IEEE Multimedia, 2011
3 scenarios	





Girod et al., IEEE Multimedia, 2011
Part III	


Technical aspects
Part III - Outline	

•  The MVS pipeline in greater detail	


•  Datasets for MVS research	


•  MPEG Compact Descriptors for Visual Search
   (CDVS)	





MVS: descriptor extraction	

    •  Interest point detection	

    •  Feature descriptor computation	





Girod et al., IEEE Multimedia, 2011
Interest point detection	

   •  Numerous interest-point detectors have been proposed in
      the literature:	

              –  Harris Corners (Harris and Stephens 1988)	

              –  Scale-Invariant Feature Transform (SIFT) Difference-of-Gaussian
                 (DoG) (Lowe 2004)	

              –  Maximally Stable Extremal Regions (MSERs) (Matas et al. 2002)	

              –  Hessian affine (Mikolajczyk et al. 2005)	

              –  Features from Accelerated Segment Test (FAST) (Rosten and
                 Drummond 2006)	

              –  Hessian blobs (Bay, Tuytelaars and Van Gool 2006) 	

   •  Different tradeoffs in repeatability and complexity	

   •  See (Mikolajczyk and Schmid 2005) for a comparative
      performance evaluation of local descriptors in a common
      framework. 	
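
As an illustration of the first detector in the list, here is a minimal NumPy sketch of the Harris corner response. It is a simplified version under stated assumptions: a 3x3 box window instead of the usual Gaussian weighting, a sensitivity constant `k = 0.05`, and no non-maximum suppression or scale handling. The helper names are made up for this example.

```python
import numpy as np


def box_filter(a, r=1):
    """Sum each pixel's (2r+1) x (2r+1) neighbourhood, padding edges by replication."""
    padded = np.pad(a, r, mode="edge")
    h, w = a.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out


def harris_response(img, k=0.05):
    """Harris corner measure det(M) - k * trace(M)^2 at every pixel."""
    img = img.astype(float)
    iy, ix = np.gradient(img)          # derivatives along rows (y) and columns (x)
    sxx = box_filter(ix * ix)          # entries of the structure tensor M
    syy = box_filter(iy * iy)
    sxy = box_filter(ix * iy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2


# Synthetic test image: a bright quadrant whose only corner sits near (10, 10).
img = np.zeros((20, 20))
img[10:, 10:] = 1.0
response = harris_response(img)
corner_y, corner_x = np.unravel_index(np.argmax(response), response.shape)
print("strongest corner at row", corner_y, "column", corner_x)
```

Along a straight edge only one of the two gradient directions is active, so the determinant vanishes and the response is negative; it peaks only where both directions are present.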


Girod et al., IEEE Signal Processing Magazine, 2011
Feature descriptor computation	

   •  After interest-point detection, we compute a
      visual word descriptor on a normalized patch. 	


   •  Ideally, descriptors should be:	

              –  robust to small distortions in scale, orientation, and
                 lighting conditions;	

              –  discriminative, i.e., characteristic of an image or a small
                 set of images;	

              –  compact, due to typical mobile computing constraints.	
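
To make these properties concrete, here is a minimal sketch of a gradient-orientation histogram computed on a patch. It is a deliberately simplified, SIFT-like illustration and not any of the published descriptors: the 2x2 spatial grid, the 8 orientation bins, and the function name are assumptions made for this example.

```python
import numpy as np


def gradient_histogram_descriptor(patch, cells=2, bins=8):
    """Concatenate per-cell histograms of gradient orientation, weighted by magnitude."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)
    bin_index = np.minimum((angle / (2 * np.pi) * bins).astype(int), bins - 1)

    h, w = patch.shape
    parts = []
    for cy in range(cells):
        for cx in range(cells):
            rows = slice(cy * h // cells, (cy + 1) * h // cells)
            cols = slice(cx * w // cells, (cx + 1) * w // cells)
            hist = np.bincount(bin_index[rows, cols].ravel(),
                               weights=magnitude[rows, cols].ravel(),
                               minlength=bins)
            parts.append(hist)

    descriptor = np.concatenate(parts)
    norm = np.linalg.norm(descriptor)
    # Unit-length normalization removes the effect of a global contrast change.
    return descriptor / norm if norm > 0 else descriptor


rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(16, 16))
d_original = gradient_histogram_descriptor(patch)
d_relit = gradient_histogram_descriptor(2 * patch + 10)   # brighter, higher contrast
print(d_original.shape, np.allclose(d_original, d_relit))
```

Gradients discard a constant brightness offset and the final normalization discards a contrast gain, which is one simple way to get robustness to lighting. The 32 values here are also far fewer than the 256 pixels of the patch, a first step toward compactness.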



Girod et al., IEEE Signal Processing Magazine, 2011
Feature descriptor computation	

   •  Examples of feature descriptors in the literature:	

              –  SIFT (Lowe 1999)	

              –  Speeded-Up Robust Features (SURF) (Bay et al. 2008)

              –  Gradient Location and Orientation Histogram (GLOH)
                 (Mikolajczyk and Schmid 2005)	

              –  Compressed Histogram of Gradients (CHoG)
                 (Chandrasekhar et al. 2009, 2010)	

   •  See (Winder and Brown CVPR 2007; Winder, Hua, and Brown CVPR 2009) and
      (Mikolajczyk and Schmid PAMI 2005) for comparative
      performance evaluations of different descriptors.

Girod et al., IEEE Signal Processing Magazine, 2011
Feature descriptor computation	

   •  What about compactness?	

              –  Option 1: Compress off-the-shelf descriptors. 	

                         •  Result: poor rate-constrained image-retrieval
                            performance. 	



              –  Option 2: Design a descriptor with compression in
                 mind. 	

                        –  Example: CHoG (Compressed Histogram of Gradients) 
                           (Chandrasekhar et al. 2009, 2010)	
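
One generic way to design for compression is to quantize each gradient histogram, treated as a probability distribution, onto a small lattice of distributions with a fixed denominator, then send only the index of the lattice point. The sketch below shows that idea. The slide does not say which coding scheme CHoG uses, so this is an illustration of the principle and not the CHoG codec; the function names and the parameters `m = 4` bins and `n = 8` are assumptions.

```python
from math import ceil, comb, log2


def quantize_to_type(p, n):
    """Approximate distribution p by integer counts k with sum(k) == n (p_i ~ k_i / n)."""
    total = sum(p)
    scaled = [n * x / total for x in p]
    k = [int(s) for s in scaled]                 # floor, since all values are non-negative
    remainder = n - sum(k)
    # Give the leftover units to the bins with the largest fractional parts.
    order = sorted(range(len(p)), key=lambda i: scaled[i] - k[i], reverse=True)
    for i in order[:remainder]:
        k[i] += 1
    return k


def bits_for_type(m, n):
    """Bits needed to index every way of splitting n units across m bins."""
    return ceil(log2(comb(n + m - 1, m - 1)))


histogram = [0.5, 0.3, 0.2, 0.0]
counts = quantize_to_type(histogram, n=8)
compact_bits = bits_for_type(m=len(histogram), n=8)
raw_bits = 32 * len(histogram)                   # the same histogram stored as float32 values
print(counts, compact_bits, "bits instead of", raw_bits)
```

The denominator `n` is the knob that trades bit rate against fidelity: a larger `n` gives a finer lattice and a longer index.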




Girod et al., IEEE Signal Processing Magazine, 2011
CHoG: Compressed Histogram of Gradients	

   [Figure: CHoG pipeline. Patch → Gradients (dx, dy) → Spatial binning → Gradient distributions for each bin → Histogram compression → CHoG descriptor (bitstream)]

   Source: Bernd Girod, "Mobile Visual Search"
Chandrasekhar et al., CVPR 2009, 2010
CHoG: Compressed Histogram of Gradients

    •  Performance evaluation

               –  Recall vs. bit rate

[Figure 7. Comparison of different schemes with regard to classification accuracy and query size. CHoG descriptor data is an order of magnitude smaller compared to JPEG images or uncompressed SIFT descriptors. Axes: classification accuracy (%) vs. query size (Kbytes, log scale). Series: Send feature (CHoG), Send image (JPEG), Send feature (SIFT).]

Girod et al., IEEE Multimedia, 2011
MVS: feature indexing and matching	

    •  Goal: produce a data structure that can quickly return a short
       list of the database candidates most likely to match the query
       image. 	

               –  The short list may contain false positives as long as the correct match
                  is included. 	

               –  Slower pairwise comparisons can be subsequently performed on just
                  the short list of candidates rather than the entire database.	

    •  Example of a technique: Vocabulary Tree (VT)-Based Retrieval	
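
A minimal sketch of the short-list idea, assuming each image has already been reduced to a set of visual-word IDs (for instance by quantizing its descriptors with a vocabulary tree, which is not shown here). The inverted index, the simple vote counting, and the toy database are illustrative assumptions; real systems typically weight the votes (for example with TF-IDF).

```python
from collections import Counter, defaultdict


def build_inverted_index(database):
    """Map each visual-word ID to the set of database images that contain it."""
    index = defaultdict(set)
    for image_id, words in database.items():
        for word in words:
            index[word].add(image_id)
    return index


def shortlist(index, query_words, top_k=2):
    """Return the top_k database images sharing the most visual words with the query."""
    votes = Counter()
    for word in set(query_words):
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    return [image_id for image_id, _ in votes.most_common(top_k)]


# Toy database: image name -> visual-word IDs (hypothetical values).
database = {
    "dvd_cover": [1, 2, 3, 4],
    "cd_cover": [3, 4, 5],
    "painting": [7, 8, 9],
}
index = build_inverted_index(database)
candidates = shortlist(index, query_words=[1, 2, 3, 5, 9], top_k=2)
print(candidates)
```

Only images that share at least one word with the query are ever touched, which is what keeps the lookup fast on large databases. The slower pairwise comparison then runs on `candidates` alone.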





Girod et al., IEEE Multimedia, 2011
MVS: geometric verification	

    •  Goal: use location information of features in
       query and database images to confirm that the
       feature matches are consistent with a change in
       view-point between the two images.	
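
A minimal sketch of such a consistency check, assuming random sample consensus (RANSAC) with an affine model. The iteration count, the 2-pixel tolerance, the function names, and the synthetic matches are assumptions made for this illustration; production systems often use a homography and adaptive stopping.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares 2x3 affine map A with dst ~ [src, 1] @ A.T."""
    X = np.hstack([src, np.ones((len(src), 1))])
    solution, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return solution.T


def ransac_affine(src, dst, iters=200, tol=2.0, seed=0):
    """Estimate an affine map from putative matches and flag the consistent ones."""
    rng = np.random.default_rng(seed)
    X = np.hstack([src, np.ones((len(src), 1))])
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        sample = rng.choice(len(src), size=3, replace=False)   # minimal set for an affine map
        A = fit_affine(src[sample], dst[sample])
        error = np.linalg.norm(X @ A.T - dst, axis=1)
        inliers = error < tol
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        return None, best                                      # no consistent geometry found
    return fit_affine(src[best], dst[best]), best              # refit on all inliers


# Synthetic matches: 15 follow one affine map, 5 are deliberately wrong.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(20, 2))
A_true = np.array([[0.9, -0.2, 5.0],
                   [0.2, 0.9, -3.0]])
dst = src @ A_true[:, :2].T + A_true[:, 2]
dst[:5] += rng.uniform(30, 60, size=(5, 2))

A_est, inlier_mask = ransac_affine(src, dst)
print(int(inlier_mask.sum()), "of", len(src), "matches are geometrically consistent")
```

The number of inliers serves as the verification score: a database image whose matches cannot be explained by a single transform is rejected even if many descriptors matched.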





Girod et al., IEEE Multimedia, 2011
MVS: geometric verification

    •  Method: perform pairwise matching of feature descriptors and evaluate
       geometric consistency of correspondences.

    •  Techniques:

               –  The geometric transform between the query and database image is usually
                  estimated using robust regression techniques such as:

                        •  Random sample consensus (RANSAC) (Fischler and Bolles 1981)

                        •  Hough transform (Lowe 2004)

               –  The transformation is often represented by an affine mapping or a homography.

    •  Note: GV is computationally expensive, which is why it's only used for a subset
       of images selected during the feature-matching stage.

[FIG4] In the GV step, we match feature descriptors pairwise and find feature correspondences that are consistent with a geometric ...

Girod et al., IEEE Multimedia, 2011
Datasets for MVS research	

   •  Stanford Mobile Visual Search Data Set 
           (http://web.cs.wpi.edu/~claypool/mmsys-dataset/2011/stanford/)	

             –  Key characteristics:	

                       •  rigid objects	

                       •  widely varying lighting conditions	

                       •  perspective distortion	

                       •  foreground and background clutter	

                       •  realistic ground-truth reference data	

                       •  query data collected from heterogeneous low- and high-end
                          camera phones.




Chandrasekhar et al., ACM MMSys, 2011
SMVS Data Set: categories and examples	


  •  DVD covers	





http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/dvd_covers.html
SMVS Data Set: categories and examples	


  •  CD covers	





http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/cd_covers.html
SMVS Data Set: categories and examples	


  •  Museum paintings	





http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/museum_paintings.html
Other MVS data sets	





ISO/IEC JTC1/SC29/WG11/N12202 - July 2011, Torino, IT
MPEG Compact Descriptors for Visual Search (CDVS)	


   •  Objective	

              –  Define a standard that enables efficient
                 implementation of visual search functionality on mobile
                 devices	

   •  Scope	

                        •  bitstream of descriptors	

                        •  parts of descriptor extraction process (e.g. key-point
                           detection) needed to ensure interoperability	



              –  Additional info: 	

	

                        •  https://mailhost.tnt.uni-hannover.de/mailman/listinfo/cdvs 	

                        •  http://mpeg.chiariglione.org/meetings/geneva11-1/geneva_ahg.htm (Ad hoc groups)	




Bober, Cordara, and Reznik (2010)
MPEG CDVS

  •  Summarized timeline

         Table 1. Timeline for development of MPEG standard for visual search.

         When              Milestone                             Comments
         March, 2011       Call for Proposals is published       Registration deadline: 11 July 2011.
                                                                 Proposals due: 21 November 2011.
         December, 2011    Evaluation of proposals               None
         February, 2012    1st Working Draft                     First specification and test software model that
                                                                 can be used for subsequent improvements.
         July, 2012        Committee Draft                       Essentially complete and stabilized specification.
         January, 2013     Draft International Standard          Complete specification. Only minor editorial
                                                                 changes are allowed after DIS.
         July, 2013        Final Draft International Standard    Finalized specification, submitted for approval
                                                                 and publication as International Standard.

Girod et al., IEEE Multimedia, 2011
Part IV	


Examples and applications
Examples	

•    Google Goggles	

•    SnapTell 	

•    oMoby (and the IQ Engines API)	

•    pixlinQ	

•    Moodstocks	





Examples of commercial MVS apps	

  •  Google Goggles

          –  Android and iPhone

          –  Narrow-domain search and retrieval





http://www.google.com/mobile/goggles
SnapTell	

                                                             	

  •  One of the earliest (ca. 2008) MVS apps for iPhone	

          –  Eventually acquired by Amazon (A9)	

  •  Proprietary technique (“highly accurate and robust
     algorithm for image matching: Accumulated Signed Gradient
     (ASG)”).	





http://www.snaptell.com/technology/index.htm
oMoby (and the IQ Engines API)	

          –  iPhone app	





http://omoby.com/pages/screenshots.php
oMoby (and the IQ Engines API)	


  •  The IQ Engines API: 
     “vision as a service”	





http://www.iqengines.com/applications.php
pixlinQ	

  •  A “mobile visual
     search solution that
     enables you to link
     users to digital
     content whenever
     they take a mobile
     picture of your
     printed materials.”	

          –  Powered by image
             recognition from LTU
             technologies	


http://www.pixlinq.com/home
pixlinQ	

  •  Example app (La Redoute)	





http://www.youtube.com/watch?v=qUZCFtc42Q4
Moodstocks: overview	

  •  Offline image recognition thanks to smart synchronization of
      image signatures

  	





http://www.youtube.com/watch?v=tsxe23b12eU
Moodstocks: technology	

•  Unique features:	

   –  offline image recognition thanks to smart synchronization of image
      signatures,

   –  QR Code decoding,	

   –  EAN 8/13 decoding,	

   –  online image recognition as a fallback for very large image databases,	

   –  simultaneous run of image recognition and barcode decoding,	

   –  seamless scans logging in the background.	
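
The list above mentions EAN 8/13 decoding. One step any EAN decoder needs, once the bars have been read as digits, is check-digit validation. The sketch below shows that step only. It is independent of the Moodstocks SDK, whose API is not described on the slide, and the function names are made up for this example.

```python
def ean_check_digit(data_digits: str) -> int:
    """Check digit for the data part of an EAN code (7 digits for EAN-8, 12 for EAN-13).

    Counting from the rightmost data digit, weights alternate 3, 1, 3, 1, ...
    """
    total = 0
    for position, char in enumerate(reversed(data_digits)):
        weight = 3 if position % 2 == 0 else 1
        total += weight * int(char)
    return (10 - total % 10) % 10


def is_valid_ean(code: str) -> bool:
    """True if code is an 8- or 13-digit string whose last digit matches the checksum."""
    if len(code) not in (8, 13) or not (code.isascii() and code.isdigit()):
        return False
    return ean_check_digit(code[:-1]) == int(code[-1])


for code in ["4006381333931", "73513537", "4006381333932"]:
    print(code, is_valid_ean(code))
```

A failed checksum is a cheap signal that the scan was misread, so the decoder can retry on the next camera frame instead of reporting a wrong product.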


•  Cross-platform (iOS / Android) client-side SDK and HTTP API
   available: https://github.com/Moodstocks 	


•  JPEG encoder used within their SDK also publicly
   available: https://github.com/Moodstocks/jpec 	



Moodstocks	

  •  Many successful apps for different platforms	





http://www.moodstocks.com/gallery/
Concluding thoughts
Concluding thoughts	

•  Mobile Visual Search (MVS) is coming of age.	


•  This is not a fad and it can only grow.	


•  Still a good research topic	

   –  Many relevant technical challenges	

   –  MPEG efforts have just started	



•  Infinite creative commercial possibilities	

Thanks!	

•  Questions?	





•  For additional information: omarques@fau.edu	

                                                 Oge	
  Marques	
  

Mobile Visual Search: Driving Factors and Technical Challenges

  • 1. Mobile Visual Search Oge Marques Florida Atlantic University Universitat Politècnica de Catalunya Barcelona 2 Mar 2012
  • 2. Take-home message Mobile Visual Search (MVS) is a fascinating research field with many open challenges and opportunities which have the potential to impact the way we organize, annotate, and retrieve visual data (images and videos) using mobile devices. Oge  Marques  
  • 3. Outline •  This talk is structured in four parts: 1.  Opportunities 2.  Basic concepts 3.  Technical aspects 4.  Examples and applications Oge  Marques  
  • 5. Mobile visual search: driving factors •  Age of mobile computing h,p://60secondmarketer.com/blog/2011/10/18/more-­‐mobile-­‐phones-­‐than-­‐toothbrushes/     Oge  Marques  
  • 6. Mobile visual search: driving factors •  Why do I need a camera? I have a smartphone… (22 Dec 2011) h,p://www.cellular-­‐news.com/story/52382.php     Oge  Marques  
  • 7. Mobile visual search: driving factors •  Powerful devices 1 GHz ARM Cortex-A9 processor, PowerVR SGX543MP2, Apple A5 chipset h,p://www.apple.com/iphone/specs.html     h,p://www.gsmarena.com/apple_iphone_4s-­‐4212.php     Oge  Marques  
  • 8. Mobile visual search: driving factors •  Powerful devices h,p://europe.nokia.com/PRODUCT_METADATA_0/Products/Phones/8000-­‐series/808/Nokia808PureView_Whitepaper.pdf     h,p://www.nokia.com/fr-­‐fr/produits/mobiles/808/     Oge  Marques  
  • 9. Mobile visual search: driving factors Social networks and mobile devices (May 2011) h,p://jess3.com/geosocial-­‐universe-­‐2/     Oge  Marques  
  • 10. Mobile visual search: driving factors •  Social networks and mobile devices –  Motivated users: image taking and image sharing are huge! :  h,p://www.onlinemarkeUng-­‐trends.com/2011/03/facebook-­‐photo-­‐staUsUcs-­‐and-­‐insights.html     Oge  Marques  
  • 11. Mobile visual search: driving factors •  Instagram: –  15 million registered users (in 13 months) –  7 employees –  A (growing ecosystem) based on it! •  Search •  Send postcards •  Manage your photos •  Build a poster •  etc. h,p://thenextweb.com/apps/2011/12/07/instagram-­‐hits-­‐15m-­‐users-­‐and-­‐has-­‐2-­‐people-­‐working-­‐on-­‐an-­‐android-­‐app-­‐right-­‐now/     h,p://www.nuwomb.com/instagram/       Oge  Marques  
  • 12. Mobile visual search: driving factors •  Legitimate (or not quite…) needs and use cases h,p://www.slideshare.net/dtunkelang/search-­‐by-­‐sight-­‐google-­‐goggles   h,ps://twi,er.com/#!/courtanee/status/14704916575       Oge  Marques  
  • 13. Search system, a low-latency interactive visual search system. base and is the key to very fast retr Several sidebars in this article invite the interested reader to dig features they have in common wit deeper into the underlying algorithms. of potentially similar images is sele Finally, a geometric verificatio Mobile visual search: driving factors ROBUST MOBILE IMAGE RECOGNITION Today, the most successful algorithms for content-based image most similar matches in the datab spatial pattern between features of retrieval use an approach that is referred to as bag of features didate database image to ensure (BoFs) or bag of words (BoWs). The BoW idea is borrowed from Example retrieval systems are pres •  A natural use case for CBIR with QBE (at last!) text retrieval. To find a particular text document, such as a Web page, it is sufficient to use a few well-chosen words. In the For mobile visual search, ther to provide the users with an int –  The example is right in front of the user! database, the document itself can be likewise represented by a deployed systems typically transm the server, which might require t large databases, the inverted file in memory swapping operations slow ing stage. Further, the GV step and thus increases the response t the retrieval pipeline in the follow the challenges of mobile visual se Query Feature Image Extraction [FIG2] A Pipeline for image retrieva from the query image. Feature mat [FIG1] A snapshot of an outdoor mobile visual search system images in the database that have m being used. The system augments the viewfinder with with the query image. The GV step information about the objects it recognizes in the image taken feature locations that cannot be pl with a camera phone. in viewing position. Girod  et  al.  IEEE  MulUmedia  2011   Oge  Marques  
  • 15. MVS: technical challenges •  How to ensure low latency (and interactive queries) under constraints such as: –  Network bandwidth –  Computational power –  Battery consumption •  How to achieve robust visual recognition in spite of low-resolution cameras, varying lighting conditions, etc. •  How to handle broad and narrow domains Oge  Marques  
  • 16. MVS: Pipeline for image retrieval Girod  et  al.  IEEE  MulUmedia  2011   Oge  Marques  
  • 17. 3 scenarios Girod  et  al.  IEEE  MulUmedia  2011   Oge  Marques  
  • 19. Part III - Outline •  The MVS pipeline in greater detail •  Datasets for MVS research •  MPEG Compact Descriptors for Visual Search (CDVS) Oge  Marques  
  • 20. MVS: descriptor extraction •  Interest point detection •  Feature descriptor computation Girod  et  al.  IEEE  MulUmedia  2011   Oge  Marques  
  • 21. Interest point detection •  Numerous interest-point detectors have been proposed in the literature: –  Harris Corners (Harris and Stephens 1988) –  Scale-Invariant Feature Transform (SIFT) Difference-of-Gaussian (DoG) (Lowe 2004) –  Maximally Stable Extremal Regions (MSERs) (Matas et al. 2002) –  Hessian affine (Mikolajczyk et al. 2005) –  Features from Accelerated Segment Test (FAST) (Rosten and Drummond 2006) –  Hessian blobs (Bay, Tuytelaars and Van Gool 2006) •  Different tradeoffs in repeatability and complexity •  See (Mikolajczyk and Schmid 2005) for a comparative performance evaluation of local descriptors in a common framework. Girod  et  al.  IEEE  Signal  Processing  Magazine  2011   Oge  Marques  
  • 22. Feature descriptor computation •  After interest-point detection, we compute a visual word descriptor on a normalized patch. •  Ideally, descriptors should be: –  robust to small distortions in scale, orientation, and lighting conditions; –  discriminative, i.e., characteristic of an image or a small set of images; –  compact, due to typical mobile computing constraints. Girod  et  al.  IEEE  Signal  Processing  Magazine  2011   Oge  Marques  
  • 23. Feature descriptor computation •  Examples of feature descriptors in the literature: –  SIFT (Lowe 1999) –  Speeded Up Robust Feature (SURF) interest-point detector (Bay et al. 2008) –  Gradient Location and Orientation Histogram (GLOH) (Mikolajczyk and Schmid 2005) –  Compressed Histogram of Gradients (CHoG) (Chandrasekhar et al. 2009, 2010) •  See (Winder, (Hua,) and Brown CVPR 2007, 2009) and (Mikolajczyk and Schmid PAMI 2005) for comparative performance evaluation of different descriptors. Girod  et  al.  IEEE  Signal  Processing  Magazine  2011   Oge  Marques  
  • 24. Feature descriptor computation •  What about compactness? –  Option 1: Compress off-the-shelf descriptors. •  Result: poor rate-constrained image-retrieval performance. –  Option 2: Design a descriptor with compression in mind. –  Example: CHoG (Compressed Histogram of Gradients) (Chandrasekhar et al. 2009, 2010) Girod  et  al.  IEEE  Signal  Processing  Magazine  2011   Oge  Marques  
  • 25. CHoG: Compressed Histogram of Gradients Gradients Gradient distributions Patch for each bin dx dy dx dy 011101 Spatial 0100101 binning 01101 101101 Histogram 0100011 111001 compression 0010011 01100 1010100 CHoG
 Descriptor Bernd Girod: Mobile Visual Search Chandrasekhar  et  al.  CVPR  09,10   Oge  Marques  
  • 26. CHoG: Compressed Histogram of Gradients •  Performance evaluation –  Recall vs. bit rate [Figure 7: Comparison of different schemes with regard to classification accuracy and query size; classification accuracy (%) is plotted against query size (Kbytes) for Send feature (CHoG), Send image (JPEG), and Send feature (SIFT). CHoG descriptor data is an order of magnitude smaller compared to JPEG images or uncompressed SIFT descriptors.] Girod et al. IEEE Multimedia 2011
  • 27. MVS: feature indexing and matching •  Goal: produce a data structure that can quickly return a short list of the database candidates most likely to match the query image. –  The short list may contain false positives as long as the correct match is included. –  Slower pairwise comparisons can be subsequently performed on just the short list of candidates rather than the entire database. •  Example of a technique: Vocabulary Tree (VT)-Based Retrieval Girod et al. IEEE Multimedia 2011
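The short-list idea can be sketched with a flat inverted index over visual-word IDs. This sketch assumes the descriptors have already been quantized to integer word IDs (the step a vocabulary tree would perform) and ranks images by a simple count of shared words; the function names and the toy database are hypothetical, and real systems typically use weighted scoring.

```python
from collections import Counter, defaultdict


def build_inverted_index(database):
    """Map each visual-word ID to the set of image IDs that contain it."""
    index = defaultdict(set)
    for image_id, words in database.items():
        for word in words:
            index[word].add(image_id)
    return index


def shortlist(index, query_words, top_k=2):
    """Rank database images by how many distinct query words they share."""
    votes = Counter()
    for word in set(query_words):
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    return [image_id for image_id, _ in votes.most_common(top_k)]


# Toy database: image ID -> visual-word IDs of its features (hypothetical data).
database = {
    "dvd_cover": [3, 7, 7, 12, 40],
    "cd_cover": [3, 8, 15, 22],
    "painting": [1, 2, 5, 9],
}
index = build_inverted_index(database)
candidates = shortlist(index, query_words=[7, 12, 40, 8, 99], top_k=2)
```

Only the images in `candidates` would then go through the slower pairwise comparison, so a false positive in the short list costs some time but not correctness.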
  • 28. MVS: geometric verification •  Goal: use location information of features in query and database images to confirm that the feature matches are consistent with a change in viewpoint between the two images. Girod et al. IEEE Multimedia 2011
  • 29. MVS: geometric verification •  Method: perform pairwise matching of feature descriptors and evaluate geometric consistency of correspondences. •  Techniques: –  The geometric transform between the query and database image is usually estimated using robust regression techniques such as: •  Random sample consensus (RANSAC) (Fischler and Bolles 1981) •  Hough transform (Lowe 2004) –  The transformation is often represented by an affine mapping or a homography. •  Note: GV is computationally expensive, which is why it’s only used for a subset of images selected during the feature-matching stage. [Figure: pairwise feature matching in the GV step] Girod et al. IEEE Multimedia 2011
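A minimal pure-Python sketch of RANSAC-based geometric verification with an affine model follows. The minimal sample of three correspondences, the pixel tolerance, the iteration count, and all function names are assumptions chosen for illustration; production systems usually rely on a tested library routine and often refit the model on the final inlier set.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule; None if singular."""
    d = det3(m)
    if abs(d) < 1e-9:
        return None
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(pairs):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from three correspondences."""
    m = [[x, y, 1.0] for (x, y), _ in pairs]
    row_x = solve3(m, [q[0] for _, q in pairs])
    row_y = solve3(m, [q[1] for _, q in pairs])
    if row_x is None or row_y is None:
        return None  # collinear sample, no unique affine map
    return row_x, row_y


def apply_affine(model, point):
    (a, b, c), (d, e, f) = model
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


def ransac_affine(matches, iterations=200, tolerance=2.0, seed=0):
    """Return the largest set of matches consistent with one affine mapping."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        model = fit_affine(rng.sample(matches, 3))
        if model is None:
            continue
        inliers = []
        for p, q in matches:
            px, py = apply_affine(model, p)
            if (px - q[0]) ** 2 + (py - q[1]) ** 2 <= tolerance ** 2:
                inliers.append((p, q))
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Six matches follow x' = 2x + 10, y' = 2y - 5; the last two are outliers.
good = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 3), (7, 8)]
matches = [(p, (2 * p[0] + 10, 2 * p[1] - 5)) for p in good]
matches += [((3, 3), (100, 100)), ((8, 1), (-50, 40))]
inliers = ransac_affine(matches)
```

A database image would be accepted as a match when the inlier count exceeds some threshold. Each candidate costs many model fits, which is consistent with the note that GV is applied only to the short list.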
  • 30. Datasets for MVS research •  Stanford Mobile Visual Search Data Set (http://web.cs.wpi.edu/~claypool/mmsys-dataset/2011/stanford/) –  Key characteristics: •  rigid objects •  widely varying lighting conditions •  perspective distortion •  foreground and background clutter •  realistic ground-truth reference data •  query data collected from heterogeneous low-end and high-end camera phones. Chandrasekhar et al. ACM MMSys 2011
  • 31. SMVS Data Set: categories and examples •  DVD covers http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/dvd_covers.html
  • 32. SMVS Data Set: categories and examples •  CD covers http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/cd_covers.html
  • 33. SMVS Data Set: categories and examples •  Museum paintings http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/museum_paintings.html
  • 34. Other MVS data sets ISO/IEC JTC1/SC29/WG11/N12202 - July 2011, Torino, IT
  • 35. MPEG Compact Descriptors for Visual Search (CDVS) •  Objective –  Define a standard that enables efficient implementation of visual search functionality on mobile devices •  Scope –  bitstream of descriptors –  parts of descriptor extraction process (e.g., key-point detection) needed to ensure interoperability •  Additional info: –  https://mailhost.tnt.uni-hannover.de/mailman/listinfo/cdvs –  http://mpeg.chiariglione.org/meetings/geneva11-1/geneva_ahg.htm (Ad hoc groups) Bober, Cordara, and Reznik (2010)
  • 36. MPEG CDVS •  Summarized timeline (Table 1. Timeline for development of MPEG standard for visual search.) –  March 2011: Call for Proposals is published (registration deadline: 11 July 2011; proposals due: 21 November 2011). –  December 2011: Evaluation of proposals. –  February 2012: 1st Working Draft (first specification and test software model that can be used for subsequent improvements). –  July 2012: Committee Draft (essentially complete and stabilized specification). –  January 2013: Draft International Standard (complete specification; only minor editorial changes are allowed after DIS). –  July 2013: Final Draft International Standard (finalized specification, submitted for approval and publication as International Standard). Girod et al. IEEE Multimedia 2011
  • 37. Part IV Examples and applications
  • 38. Examples •  Google Goggles •  SnapTell •  oMoby (and the IQ Engines API) •  pixlinQ •  Moodstocks
  • 39. Examples of commercial MVS apps •  Google Goggles –  Android and iPhone –  Narrow-domain search and retrieval http://www.google.com/mobile/goggles
  • 40. SnapTell •  One of the earliest (ca. 2008) MVS apps for iPhone –  Eventually acquired by Amazon (A9) •  Proprietary technique (“highly accurate and robust algorithm for image matching: Accumulated Signed Gradient (ASG)”). http://www.snaptell.com/technology/index.htm
  • 41. oMoby (and the IQ Engines API) –  iPhone app http://omoby.com/pages/screenshots.php
  • 42. oMoby (and the IQ Engines API) •  The IQ Engines API: “vision as a service” http://www.iqengines.com/applications.php
  • 43. pixlinQ •  A “mobile visual search solution that enables you to link users to digital content whenever they take a mobile picture of your printed materials.” –  Powered by image recognition from LTU technologies http://www.pixlinq.com/home
  • 44. pixlinQ •  Example app (La Redoute) http://www.youtube.com/watch?v=qUZCFtc42Q4
  • 45. Moodstocks: overview •  Offline image recognition thanks to smart synchronization of image signatures http://www.youtube.com/watch?v=tsxe23b12eU
  • 46. Moodstocks: technology •  Unique features: –  offline image recognition thanks to smart synchronization of image signatures, –  QR Code decoding, –  EAN 8/13 decoding, –  online image recognition as a fallback for very large image databases, –  simultaneous image recognition and barcode decoding, –  seamless logging of scans in the background. •  Cross-platform (iOS / Android) client-side SDK and HTTP API available: https://github.com/Moodstocks •  JPEG encoder used within their SDK also publicly available: https://github.com/Moodstocks/jpec
  • 47. Moodstocks •  Many successful apps for different platforms http://www.moodstocks.com/gallery/
  • 49. Concluding thoughts •  Mobile Visual Search (MVS) is coming of age. •  This is not a fad and it can only grow. •  Still a good research topic –  Many relevant technical challenges –  MPEG efforts have just started •  Infinite creative commercial possibilities
  • 50. Thanks! •  Questions? •  For additional information: omarques@fau.edu