Oge Marques
Florida Atlantic University
     Boca Raton, FL - USA
    “Image search and retrieval” is not a problem,
     but rather a collection of related problems that
     look like one.

    10 years after “the end of the early years”,
     research in image search and retrieval still has
     many open problems, challenges, and
     opportunities.
    This is a highly interdisciplinary field, but …

     [Diagram: Visual Information Retrieval at the center, surrounded by the fields it draws on]
     ◦  Image and Video Processing
     ◦  (Multimedia) Database Systems
     ◦  Information Retrieval
     ◦  Machine Learning
     ◦  Computer Vision
     ◦  Data Mining
     ◦  Visual data modeling and representation
     ◦  Human Visual Perception
    There are many things that I believe…




    … but cannot prove
The “big mismatch”
    It’s been 10 years since the “end of the early
     years” [Smeulders et al., 2000]




     ◦  Are the challenges from 2000 still relevant?
     ◦  Are the directions and guidelines from 2000 still
        appropriate?
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Driving forces
        “[…] content-based image retrieval (CBIR) will continue
         to grow in every direction: new audiences, new
         purposes, new styles of use, new modes of interaction,
         larger data sets, and new methods to solve the
         problems.”
    Yes, we have seen many new audiences, new
     purposes, new styles of use, and new modes
     of interaction emerge.

    Each of these usually requires new methods
     to solve the problems that they bring.

    However, not too many researchers see them
     as a driving force (as they should).
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Heritage of computer vision
        “An important obstacle to overcome […] is to realize
         that image retrieval does not entail solving the general
         image understanding problem.”
    I’m afraid I have bad news…
     ◦  Computer vision hasn’t made so much progress
        during the past 10 years.

     ◦  Some classical problems (including image
        understanding) remain unresolved.

     ◦  Similarly, CBIR from a pure computer vision
        perspective didn’t work too well either.
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Influence on computer vision
        “[…] CBIR offers a different look at traditional computer
         vision problems: large data sets, no reliance on strong
         segmentation, and revitalized interest in color image
         processing and invariance.”
    The adoption of large data sets became standard
     practice in computer vision (see Torralba’s work).
    No reliance on strong segmentation (still
     unresolved) has led to new areas of research, e.g.,
     automatic region-of-interest (ROI) extraction and
     region-based image retrieval (RBIR).
    Color image processing and color descriptors
     became incredibly popular, useful, and (to some
     degree) effective.
    Invariance still a huge problem
     ◦  But it’s cheaper than ever to have multiple views.
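To make the color-descriptor point concrete, here is a minimal sketch of one classic descriptor: a quantized RGB histogram compared by histogram intersection. The function names, the 4-bins-per-channel setting, and the toy pixel lists are assumptions chosen for illustration; they are not taken from the talk.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize 8-bit RGB pixels into a normalized joint color histogram."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of two normalized histograms."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


# Toy "images" as flat lists of (R, G, B) pixels (hypothetical data).
mostly_red = [(250, 10, 10)] * 3 + [(10, 10, 250)]
also_red = [(240, 20, 5)] * 4
all_blue = [(5, 5, 245)] * 4

h_query = color_histogram(mostly_red)
sim_red = histogram_intersection(h_query, color_histogram(also_red))
sim_blue = histogram_intersection(h_query, color_histogram(all_blue))
print(sim_red, sim_blue)
```

The red image scores higher than the blue one against the mostly red query, which is the behavior a color descriptor is meant to capture. Note that the histogram discards all spatial layout, one reason such descriptors are only effective "to some degree".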
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Similarity and learning
        “We make a pledge for the importance of human-
         based similarity rather than general similarity. Also,
         the connection between image semantics, image data,
         and query context will have to be made clearer in the
         future.”
        “[…] in order to bring semantics to the user, learning is
         inevitable.”
    Similarity is a tough problem to crack and
     model.

    See it for yourself…
    Are these two images similar?
    Are these two images similar?
    Is the second or the third image more similar
     to the first?
    Which image fits better to the first two: the
     third or the fourth?
    Is learning really inevitable?

    Maybe, maybe not, but it sure comes in handy
     in some specific cases…
     ◦  SVM anyone?
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Interaction
        Better visualization options, more control to the user,
         ability to provide feedback […]
    Significant progress on visualization
     interfaces and devices.

    Relevance Feedback: still a very tricky
     tradeoff (effort vs. perceived benefit), but
     more popular than ever (rating, thumbs up/
     down, etc.)
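One simple way to turn thumbs up/down ratings into a better query is a Rocchio-style update, a technique borrowed from text retrieval and used here purely as an illustration (the talk does not name a specific method). In this sketch the feature vectors, the function name, and the weights alpha, beta, and gamma are assumptions.

```python
def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant examples and away from
    non-relevant ones (Rocchio-style update; weights are illustrative)."""
    def centroid(vectors):
        # An empty feedback set contributes nothing to the update.
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    pos = centroid(relevant)
    neg = centroid(non_relevant)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


# Hypothetical 2-D image features rated by the user.
query = [0.5, 0.5]
thumbs_up = [[1.0, 0.0], [0.8, 0.2]]
thumbs_down = [[0.0, 1.0]]

new_query = rocchio_update(query, thumbs_up, thumbs_down)
print(new_query)
```

The updated query shifts toward the first feature dimension, where the liked images cluster. The effort vs. benefit tradeoff shows up directly: every extra rating costs the user a click, while the update only helps if the ratings are consistent.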
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Need for databases
        “The connection between CBIR and database research
         is likely to increase in the future. […] problems like the
         definition of suitable query languages, efficient search
         in high dimensional feature space, search in the
         presence of changing similarity measures are largely
         unsolved […]”
    Very little progress
     ◦  Image search and retrieval has benefited much
        more from document information retrieval than
        from database research.
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  The problem of evaluation
        CBIR could use a reference standard against which new
         algorithms could be evaluated (similar to TREC in the
         field of text retrieval).
        “A comprehensive and publicly available collection of
         images, sorted by class and retrieval purposes,
         together with a protocol to standardize experimental
         practices, will be instrumental in the next phase of
         CBIR.”
    Significant progress on benchmarks,
     standardized datasets, etc.
     ◦  ImageCLEF
     ◦  Pascal VOC Challenge
     ◦  MSRA dataset
     ◦  SIMPLIcity dataset
     ◦  UCID dataset and ground truth (GT)
     ◦  Accio / SIVAL dataset and GT
     ◦  Caltech 101, Caltech 256
     ◦  LabelMe
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Semantic gap and other sources
        “A critical point in the advancement of CBIR is the
         semantic gap, where the meaning of an image is rarely
         self-evident. […] One way to resolve the semantic gap
         comes from sources outside the image by integrating
         other sources of information about the image in the
         query.”
    The semantic gap problem has not been
     solved (and maybe will never be…)

    What are the alternatives?
     1.  Treat visual similarity and semantic relatedness
         differently
        Examples: Alipr, Google similarity search, etc.
     2.  Improve both (text-based and visual) search
         methods independently
     3.  Trust the user
        CFIR, collaborative filtering, crowdsourcing, games.
    I postulate that image search and retrieval is
     not a problem (but, instead, a collection of
     related problems that look like one)

    There are many potential opportunities for
     good solutions to specific problems

    One promising avenue: think about image
     retrieval as added value (e.g., like.com, SPE,
     etc.)
    Google Similarity Search (VisualRank) [Jing &
     Baluja, 2008]
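The core idea behind VisualRank is to apply PageRank-style ranking to a graph whose edges are visual similarities between images. The sketch below shows only that idea on a tiny hand-made similarity matrix; the function name, the damping value, and the matrix are assumptions, and the published system's feature extraction and scaling are not modeled here.

```python
def rank_by_similarity(sim, damping=0.85, iterations=50):
    """PageRank-style power iteration over an image similarity matrix."""
    n = len(sim)
    # Column sums let each image spread its score across its neighbors.
    col_sums = [sum(sim[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [
            (1.0 - damping) / n
            + damping * sum(sim[i][j] / col_sums[j] * rank[j]
                            for j in range(n) if col_sums[j] > 0)
            for i in range(n)
        ]
    return rank


# Hypothetical pairwise similarities: images 0-2 resemble each other,
# image 3 is an outlier only weakly similar to the rest.
sim = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]

rank = rank_by_similarity(sim)
print(rank)
```

Images that many other images resemble accumulate score, so the outlier ends up at the bottom of the ranking even though no text or metadata was consulted.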



    Google Goggles (mobile visual search)
    Google Goggles understands narrow-domain
     search and retrieval




    Several other apps for iPhone, iPad, and
     Android (e.g., kooaba and Fetch!)
    The Web 2.0 has brought about:
     ◦  New data sources
     ◦  New usage patterns
     ◦  New understanding about the users, their needs,
        habits, preferences
     ◦  New opportunities
     ◦  Lots of metadata!

     ◦  A chance to experience a true paradigm shift
        Before: image annotation is tedious, labor-intensive,
         expensive
        After: image annotation is fun!
    Games!
     ◦  Google Image Labeler
     ◦  Games with a purpose (GWAP):
        The ESP Game
        Squigl
        Matchin
    New devices and services…

     ◦  Flickr (b. 2004)
     ◦  YouTube (b. 2005)
     ◦  Flip video cameras (b. 2006)
     ◦  iPhone (b. 2007)
     ◦  iPad (b. 2010)
    New opportunities for narrowing the semantic
     gap
     ◦  From bottom up: (semi-)automatic image
        annotation
     ◦  From top down: using (content / context)
        ontologies
     ◦  Combining top-down and bottom-up

    New fields of research, including:
     ◦  Tag recommendation systems
     ◦  User intentions in image search
    Many opportunities await…
–    I believe (but cannot prove…) that successful
     Image Search & Retrieval solutions will:
     •  combine content-based image retrieval (CBIR) with
        metadata (high-level semantic-based image
        retrieval)
     •  only be truly successful in narrow domains
     •  include the user in the loop
      –  Relevance Feedback (RF)
      –  Collaborative efforts (tagging, rating, annotating)
     •  provide friendly, intuitive interfaces
     •  incorporate results and insights from cognitive
        science, particularly human visual attention,
        perception, and memory
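As one possible reading of "combine CBIR with metadata", the sketch below fuses a visual similarity score with a tag-overlap score by a weighted sum (late fusion). Everything in it is an assumption made for illustration: the function names, the Jaccard overlap, the equal weighting, and the candidate images with their scores and tags.

```python
def tag_overlap(query_tags, image_tags):
    """Jaccard overlap between query terms and an image's tags."""
    q, t = set(query_tags), set(image_tags)
    return len(q & t) / len(q | t) if q | t else 0.0


def fused_score(visual_sim, query_tags, image_tags, weight=0.5):
    """Weighted sum of a content-based score and a metadata score."""
    return weight * visual_sim + (1.0 - weight) * tag_overlap(query_tags, image_tags)


# Hypothetical candidates: (visual similarity to the query image, user tags).
candidates = {
    "beach.jpg": (0.9, ["sea", "sand"]),
    "sunset.jpg": (0.6, ["beach", "sunset", "sea"]),
    "office.jpg": (0.7, ["desk", "laptop"]),
}
query_tags = ["beach", "sea"]

ranking = sorted(
    candidates,
    key=lambda name: fused_score(candidates[name][0], query_tags, candidates[name][1]),
    reverse=True,
)
print(ranking)
```

Here the image with the best tag match overtakes the one with the best visual match, and the visually plausible but semantically unrelated image drops to last. In a narrow domain the weight could be tuned, or driven by user feedback.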
Questions?




             omarques@fau.edu

Oge Marques (FAU), invited talk at WISMA 2010 (Barcelona, May 2010)

 

Último (20)

Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Oge Marques (FAU) - invited talk at WISMA 2010 (Barcelona, May 2010)

  • 1. Oge Marques Florida Atlantic University Boca Raton, FL - USA
  • 2.   “Image search and retrieval” is not a problem, but rather a collection of related problems that look like one.   10 years after “the end of the early years”, research in image search and retrieval still has many open problems, challenges, and opportunities.
  • 3.   This is a highly interdisciplinary field, but … [Diagram: Visual Information Retrieval at the center, surrounded by Image and Video Processing; (Multimedia) Database Systems; Information Retrieval; Machine Learning; Computer Vision; Data Mining; Visual Data Modeling and Representation; Human Visual Perception]
  • 4.   There are many things that I believe…   … but cannot prove
  • 6.   It’s been 10 years since the “end of the early years” [Smeulders et al., 2000] ◦  Are the challenges from 2000 still relevant? ◦  Are the directions and guidelines from 2000 still appropriate?
  • 7.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Driving forces   “[…] content-based image retrieval (CBIR) will continue to grow in every direction: new audiences, new purposes, new styles of use, new modes of interaction, larger data sets, and new methods to solve the problems.”
  • 8.   Yes, we have seen many new audiences, new purposes, new styles of use, and new modes of interaction emerge.   Each of these usually requires new methods to solve the problems that they bring.   However, not too many researchers see them as a driving force (as they should).
  • 9.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Heritage of computer vision   “An important obstacle to overcome […] is to realize that image retrieval does not entail solving the general image understanding problem.”
  • 10.   I’m afraid I have bad news… ◦  Computer vision hasn’t made so much progress during the past 10 years. ◦  Some classical problems (including image understanding) remain unresolved. ◦  Similarly, CBIR from a pure computer vision perspective didn’t work too well either.
  • 11.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Influence on computer vision   “[…] CBIR offers a different look at traditional computer vision problems: large data sets, no reliance on strong segmentation, and revitalized interest in color image processing and invariance.”
  • 12.   The adoption of large data sets became standard practice in computer vision (see Torralba’s work).   No reliance on strong segmentation (still unresolved) → new areas of research, e.g., automatic region-of-interest (ROI) extraction and region-based image retrieval (RBIR).   Color image processing and color descriptors became incredibly popular, useful, and (to some degree) effective.   Invariance is still a huge problem ◦  But it’s cheaper than ever to have multiple views.
  • 13.
  • 14.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Similarity and learning   “We make a pledge for the importance of human-based similarity rather than general similarity. Also, the connection between image semantics, image data, and query context will have to be made clearer in the future.”   “[…] in order to bring semantics to the user, learning is inevitable.”
  • 15.   Similarity is a tough problem to crack and model.   See for yourself…
  • 16.   Are these two images similar?
  • 17.   Are these two images similar?
  • 18.   Is the second or the third image more similar to the first?
  • 19.   Which image fits better to the first two: the third or the fourth?
  • 20.   Is learning really inevitable?   Maybe, maybe not, but it sure comes in handy in some specific cases… ◦  SVM, anyone?
  • 21.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Interaction   Better visualization options, more control to the user, ability to provide feedback […]
  • 22.   Significant progress on visualization interfaces and devices.   Relevance Feedback: still a very tricky tradeoff (effort vs. perceived benefit), but more popular than ever (rating, thumbs up/down, etc.)
  • 23.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Need for databases   “The connection between CBIR and database research is likely to increase in the future. […] problems like the definition of suitable query languages, efficient search in high dimensional feature space, search in the presence of changing similarity measures are largely unsolved […]”
  • 24.   Very little progress ◦  Image search and retrieval has benefited much more from document information retrieval than from database research.
  • 25.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  The problem of evaluation   CBIR could use a reference standard against which new algorithms could be evaluated (similar to TREC in the field of text retrieval).   “A comprehensive and publicly available collection of images, sorted by class and retrieval purposes, together with a protocol to standardize experimental practices, will be instrumental in the next phase of CBIR.”
  • 26.   Significant progress on benchmarks, standardized datasets, etc. ◦  ImageCLEF ◦  Pascal VOC Challenge ◦  MSRA dataset ◦  SIMPLIcity dataset ◦  UCID dataset and ground truth (GT) ◦  Accio / SIVAL dataset and GT ◦  Caltech 101, Caltech 256 ◦  LabelMe
  • 27.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Semantic gap and other sources   “A critical point in the advancement of CBIR is the semantic gap, where the meaning of an image is rarely self-evident. […] One way to resolve the semantic gap comes from sources outside the image by integrating other sources of information about the image in the query.”
  • 28.   The semantic gap problem has not been solved (and maybe will never be…)   What are the alternatives? 1.  Treat visual similarity and semantic relatedness differently   Examples: Alipr, Google similarity search, etc. 2.  Improve both (text-based and visual) search methods independently 3.  Trust the user   CFIR, collaborative filtering, crowdsourcing, games.
  • 29.   I postulate that image search and retrieval is not a problem (but, instead, a collection of related problems that look like one)   There are many potential opportunities for good solutions to specific problems   One promising avenue: think about image retrieval as added value (e.g., like.com, SPE, etc.)
  • 30.   Google Similarity Search (VisualRank) [Jing & Baluja, 2008]   Google Goggles (mobile visual search)
  • 31.   Google Goggles understands narrow-domain search and retrieval   Several other apps for iPhone, iPad, and Android (e.g., kooaba and Fetch!)
  • 32.   The Web 2.0 has brought about: ◦  New data sources ◦  New usage patterns ◦  New understanding about the users, their needs, habits, preferences ◦  New opportunities ◦  Lots of metadata! ◦  A chance to experience a true paradigm shift   Before: image annotation is tedious, labor-intensive, expensive   After: image annotation is fun!
  • 33.   Games! ◦  Google Image Labeler ◦  Games with a purpose (GWAP):   The ESP Game   Squigl   Matchin
  • 34.   New devices and services… ◦  Flickr (b. 2004) ◦  YouTube (b. 2005) ◦  Flip video cameras (b. 2006) ◦  iPhone (b. 2007) ◦  iPad (b. 2010)
  • 35.   New opportunities for narrowing the semantic gap ◦  From bottom up: (semi-)automatic image annotation ◦  From top down: using (content / context) ontologies ◦  Combining top-down and bottom-up   New fields of research, including: ◦  Tag recommendation systems ◦  User intentions in image search
  • 36.   Many opportunities await…
  • 37. –  I believe (but cannot prove…) that successful Image Search & Retrieval solutions will: •  combine content-based image retrieval (CBIR) with metadata (high-level semantic-based image retrieval) •  only be truly successful in narrow domains •  include the user in the loop –  Relevance Feedback (RF) –  Collaborative efforts (tagging, rating, annotating) •  provide friendly, intuitive interfaces •  incorporate results and insights from cognitive science, particularly human visual attention, perception, and memory
  • 38. Questions? omarques@fau.edu
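
The closing belief on slide 37 (combine content-based retrieval with metadata) can be illustrated with a small sketch. This is not the speaker's method: it assumes a coarse RGB color histogram as the visual descriptor, histogram intersection as the visual similarity, Jaccard overlap of user tags as the metadata similarity, and a simple weighted sum to fuse the two. All function names, the dictionary layout ("hist", "tags"), and the weight alpha are hypothetical choices made for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    bins:   number of bins per channel (assumed value, not from the talk).
    """
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def histogram_intersection(h1, h2):
    """Visual similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(v, h2.get(cell, 0.0)) for cell, v in h1.items())


def tag_jaccard(tags1, tags2):
    """Metadata similarity in [0, 1] based on shared tags."""
    a, b = set(tags1), set(tags2)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def fused_score(query, item, alpha=0.5):
    """Weighted sum of visual and tag similarity (alpha weights the visual part)."""
    visual = histogram_intersection(query["hist"], item["hist"])
    textual = tag_jaccard(query["tags"], item["tags"])
    return alpha * visual + (1 - alpha) * textual


def rank(query, collection, alpha=0.5):
    """Return the collection sorted from most to least similar to the query."""
    return sorted(collection, key=lambda it: fused_score(query, it, alpha), reverse=True)


# Toy data: each "image" is four pixels of a single color plus a few tags.
red = color_histogram([(255, 0, 0)] * 4)
blue = color_histogram([(0, 0, 255)] * 4)
query = {"name": "q", "hist": red, "tags": {"sunset", "beach"}}
collection = [
    {"name": "a", "hist": color_histogram([(250, 10, 5)] * 4), "tags": {"car"}},
    {"name": "b", "hist": blue, "tags": {"sunset"}},
    {"name": "c", "hist": red, "tags": {"sunset", "beach"}},
]
ranked_names = [it["name"] for it in rank(query, collection)]
```

Setting alpha to 1.0 reduces this to purely content-based ranking and 0.0 to purely tag-based ranking; values in between let either signal compensate when the other is weak, which is one reading of the "treat visual similarity and semantic relatedness differently" alternative from slide 28.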