Object Recognition with Pictorial Structures

                Pedro F. Felzenszwalb
                University of Chicago
                 pff@cs.uchicago.edu

        Joint work with Daniel P. Huttenlocher
Pictorial structures


Part-based representation:

 • Each part models local visual properties.

 • “Springs” model spatial relationships.

 • Joint estimation of part locations.

    – No hard detection of parts or features.

    – No initialization parameters.


• Model is represented by a graph $G = (V, E)$.

   – $V = \{v_1, \dots, v_n\}$ are the parts.

   – $(v_i, v_j) \in E$ indicates a connection between parts.

• $m_i(l_i)$ is the cost of placing part $i$ at location $l_i$.

• $d_{ij}(l_i, l_j)$ is a deformation cost.

• Optimal location for object is given by $L^* = (l_1^*, \dots, l_n^*)$,

$$L^* = \operatorname*{argmin}_{L} \left( \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j) \right)$$

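As a concrete reading of this objective, here is a minimal Python sketch that evaluates the energy of a configuration and finds the minimizer by exhaustive search on a tiny model. The function names `energy` and `brute_force_match`, the 1D integer locations, the ideal offsets, and all the numbers are assumptions made for illustration; none of them come from the slides.

```python
from itertools import product

def energy(L, match_cost, edges, deform_cost):
    """Sum of match costs m_i(l_i) plus deformation costs d_ij(l_i, l_j) over edges."""
    unary = sum(match_cost[i][l] for i, l in enumerate(L))
    pairwise = sum(deform_cost(i, j, L[i], L[j]) for i, j in edges)
    return unary + pairwise

def brute_force_match(match_cost, edges, deform_cost):
    """Try every configuration; only feasible for very small models."""
    n = len(match_cost)
    h = len(match_cost[0])
    return min(product(range(h), repeat=n),
               key=lambda L: energy(L, match_cost, edges, deform_cost))

# Toy model (hypothetical numbers): 3 parts on a 1D grid of 4 locations.
match_cost = [
    [5, 1, 4, 6],   # m_0
    [3, 3, 0, 2],   # m_1
    [7, 2, 2, 0],   # m_2
]
edges = [(0, 1), (1, 2)]                 # a chain of three parts
ideal_offset = {(0, 1): 1, (1, 2): 1}    # assumed "spring" rest lengths

def deform_cost(i, j, li, lj):
    # Squared deviation from the ideal offset between connected parts.
    return (lj - li - ideal_offset[(i, j)]) ** 2

best = brute_force_match(match_cost, edges, deform_cost)
best_energy = energy(best, match_cost, edges, deform_cost)
```

Exhaustive search is only a reference point here: it visits every combination of part locations, which is what the faster methods avoid.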
Efficient minimization

                                                           
                         n
          L∗ = argmin      mi(li) +            dij (li, lj )
                                                            
                  L     i=1          (vi,vj )∈E

• $n$ parts and $h$ locations give $h^n$ configurations.

• If the graph is a tree, we can use dynamic programming.

   – $O(nh^2)$, much better but still slow.

• If $d_{ij}(l_i, l_j) = \|T_{ij}(l_i) - T_{ji}(l_j)\|^2$, we can use distance transforms (DT).

   – $O(nh)$, as good as matching each part separately!
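
The tree case can be sketched in Python as a min-sum dynamic program. This is an illustrative sketch, not the authors' code: the function name `tree_dp_match`, the parent-array encoding of the tree, and the toy costs are assumptions. The two nested loops over parent and child locations are where the $O(nh^2)$ cost comes from.

```python
def tree_dp_match(match_cost, parent, deform_cost):
    """Min-sum dynamic programming on a tree in O(n h^2) time.

    parent[i] is the parent of part i (parent[0] is None for the root).
    Assumption: parts are numbered so that parent[i] < i.
    """
    n, h = len(match_cost), len(match_cost[0])
    # cost[i][l]: best cost of the subtree rooted at i when part i sits at l.
    cost = [list(row) for row in match_cost]
    # best_child[i][lp]: best location for part i when its parent sits at lp.
    best_child = [None] * n
    for i in range(n - 1, 0, -1):              # children before parents
        p = parent[i]
        best_child[i] = [0] * h
        for lp in range(h):                    # h parent locations ...
            scores = [cost[i][l] + deform_cost(p, i, lp, l)
                      for l in range(h)]       # ... times h child locations
            l_best = min(range(h), key=scores.__getitem__)
            best_child[i][lp] = l_best
            cost[p][lp] += scores[l_best]
    # Choose the root location, then trace the stored choices back down.
    L = [0] * n
    L[0] = min(range(h), key=cost[0].__getitem__)
    for i in range(1, n):
        L[i] = best_child[i][L[parent[i]]]
    return tuple(L), cost[0][L[0]]

# Toy chain model (hypothetical numbers): 3 parts, 4 locations on a 1D grid.
match_cost = [
    [5, 1, 4, 6],
    [3, 3, 0, 2],
    [7, 2, 2, 0],
]
parent = [None, 0, 1]

def deform_cost(p, i, lp, l):
    # Assumed spring: each child ideally sits one cell to the right of its parent.
    return (l - lp - 1) ** 2

result = tree_dp_match(match_cost, parent, deform_cost)
```

When the deformation cost has the squared-distance form above, the inner minimization over child locations is a generalized distance transform, which is what removes one factor of $h$.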

Distance transform
Given a set of points on a grid $P \subseteq G$, the quadratic distance transform of $P$ is

$$D_P(q) = \min_{p \in P} \|q - p\|^2$$

[Figure: a point set $P$ and its distance transform $D_P$]
Generalized distance transform


Given a function $f : G \to \mathbb{R}$,

$$D_f(q) = \min_{p \in G} \left( \|q - p\|^2 + f(p) \right)$$

 – for each location $q$, find nearby location $p$ with $f(p)$ small.

 – equals DT of points $P$ if $f$ is an indicator function:

$$f(p) = \begin{cases} 0 & \text{if } p \in P \\ \infty & \text{otherwise} \end{cases}$$

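The relationship between the two transforms can be checked directly. The following Python sketch computes both by exhaustive search on a small 2D grid and confirms that they agree when $f$ is the indicator function of $P$. The function names, the grid size, and the point set are assumptions chosen for illustration; this is the definition written out, not the fast algorithm.

```python
import math

def distance_transform(points, grid):
    """Quadratic DT of a point set: D_P(q) = min over p in P of ||q - p||^2."""
    return {q: min((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2 for p in points)
            for q in grid}

def generalized_distance_transform(f, grid):
    """D_f(q) = min over p in G of ||q - p||^2 + f(p), by exhaustive search."""
    return {q: min((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2 + f[p] for p in grid)
            for q in grid}

# Hypothetical 4 x 3 grid with two marked points.
grid = [(x, y) for x in range(4) for y in range(3)]
P = {(0, 0), (3, 2)}

# Indicator function: 0 on P, infinity elsewhere.
indicator = {p: (0 if p in P else math.inf) for p in grid}

D_P = distance_transform(P, grid)
D_f = generalized_distance_transform(indicator, grid)
same = (D_P == D_f)
```

Both functions cost $O(|G|^2)$ as written, which is the baseline the linear-time method improves on.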
1D case: $D_f(q) = \min_{p \in G} \left( (q - p)^2 + f(p) \right)$

For each $p$, $D_f(q)$ is below the parabola rooted at $(p, f(p))$.

$D_f(q)$ is defined by the lower envelope of $h$ parabolas.

[Figure: parabolas rooted at $(0, f(0))$, $(1, f(1))$, $(2, f(2))$, ..., $(h-1, f(h-1))$ above grid positions $0, 1, 2, \dots, h-1$]
There is a simple geometric algorithm that computes $D_f$ in
$O(h)$ time for the 1D case.

 – similar to Graham’s scan convex hull algorithm.

 – about 20 lines of C code.
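
A Python sketch of such a lower-envelope computation is given below. It follows the general stack-based idea (keep only the parabolas that appear in the lower envelope, popping those that become hidden) and should be read as an illustration under stated assumptions, not as the authors' C code. Assumptions: the function name `dt1d` is hypothetical, grid positions are `0..h-1`, and every value in `f` is finite (use a large finite constant in place of infinity, since `inf - inf` would produce NaN in the intersection formula).

```python
import math

def dt1d(f):
    """D_f(q) = min_p (q - p)^2 + f(p) on grid 0..h-1 via the lower envelope.

    Assumption: all values in f are finite.
    """
    h = len(f)
    if h == 0:
        return []
    d = [0] * h
    v = [0] * h            # grid positions of parabolas in the lower envelope
    z = [0.0] * (h + 1)    # z[k]..z[k+1] is the range where parabola v[k] is lowest
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, h):
        while True:
            p = v[k]
            # Horizontal position where parabola q intersects parabola p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1     # parabola p is hidden by parabola q: pop it
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    # Read the envelope off from left to right.
    k = 0
    for q in range(h):
        while z[k + 1] < q:
            k += 1
        p = v[k]
        d[q] = (q - p) ** 2 + f[p]
    return d

envelope = dt1d([4, 1, 6, 0, 3])
```

Each parabola is pushed once and popped at most once, which is why the total work is linear in $h$, in the same way as the stack in Graham's scan.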


The 2D case is “separable”: it can be solved by sequential 1D
transformations along the rows and columns of the grid.
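
Separability can be illustrated with a short Python sketch: one 1D pass down every column, then one 1D pass along every row of the intermediate result, compared against direct 2D minimization. All names and numbers here are assumptions for illustration. To keep the block self-contained, the 1D step is done by direct minimization (`dt1d_naive`, quadratic in the line length); a linear-time 1D routine could be substituted without changing the structure.

```python
def dt1d_naive(f):
    """1D transform by direct minimization; stands in for a linear-time routine."""
    h = len(f)
    return [min((q - p) ** 2 + f[p] for p in range(h)) for q in range(h)]

def dt2d(f):
    """2D transform of a grid f[y][x] via 1D passes along columns, then rows."""
    rows, cols = len(f), len(f[0])
    # Pass 1: transform each column independently.
    tmp = [[0] * cols for _ in range(rows)]
    for x in range(cols):
        col = dt1d_naive([f[y][x] for y in range(rows)])
        for y in range(rows):
            tmp[y][x] = col[y]
    # Pass 2: transform each row of the intermediate result.
    return [dt1d_naive(row) for row in tmp]

def dt2d_brute(f):
    """Direct 2D minimization over all grid locations, for comparison."""
    rows, cols = len(f), len(f[0])
    return [[min((y - v) ** 2 + (x - u) ** 2 + f[v][u]
                 for v in range(rows) for u in range(cols))
             for x in range(cols)] for y in range(rows)]

# Hypothetical 3 x 4 cost grid.
f = [
    [9, 2, 7, 4],
    [5, 8, 0, 6],
    [3, 6, 9, 1],
]
separable = dt2d(f)
direct = dt2d_brute(f)
```

The two results agree because the squared Euclidean distance splits into a horizontal term and a vertical term, so the minimization over the vertical coordinate can be done first and reused for every horizontal offset.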

See “Distance Transforms of Sampled Functions”, Felzenszwalb and Huttenlocher.




Simple face model

• Locations are positions in the image grid.

• Match cost $m_i(l_i)$ for placing part $i$ at $l_i$.

• Central part $v_1$: the nose.

• Each part has an ideal position $p_i$ relative to the nose.

  – Let $T_{1i}(l_1) = l_1 + p_i$,

$$E(l_1, \dots, l_n) = \sum_{i=1}^{n} m_i(l_i) + \sum_{i=2}^{n} \|l_i - T_{1i}(l_1)\|^2$$

Efficient minimization

                                                            
                   n                n
L∗ = argmin            mi(li) +             ||li − T1i(l1)||2
        L         i=1              i=2
                                                                 
                              n
L∗ = argmin m1(l1) +              mi(li) + ||li − T1i(l1)||2
        L                    i=2
                                                                     
                              n
 ∗
l1 = argmin m1(l1) +              min(mi(li) + ||li − T1i(l1)||2)
        l1                   i=2        li

                                                       
                              n
 ∗
l1 = argmin m1(l1) +              Dmi (T1i(l1))
        l1                   i=2
                                                                      9
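The last line of the derivation can be sketched in Python for a star-shaped model on a 1D grid. Everything named here is an assumption for illustration: `gdt_at` evaluates $D_{m_i}$ at a single point by direct minimization (a fast transform would replace it in practice), `match_star` scans the root locations, and the costs and offsets are toy numbers. Evaluating $D_{m_i}$ pointwise also sidesteps the case where $l_1 + p_i$ falls outside the grid.

```python
def gdt_at(m, q):
    """Return (D_m(q), argmin), with D_m(q) = min over l of (q - l)^2 + m[l]."""
    return min(((q - l) ** 2 + m[l], l) for l in range(len(m)))

def match_star(match_cost, offsets):
    """Minimize m_1(l_1) + sum over i >= 2 of D_{m_i}(l_1 + p_i) over l_1.

    match_cost[0] belongs to the root part; offsets[i] is the ideal position
    p_i of part i relative to the root (offsets[0] is unused).
    """
    h = len(match_cost[0])
    best = None
    for l1 in range(h):
        total = match_cost[0][l1]
        locations = [l1]
        for m, p in zip(match_cost[1:], offsets[1:]):
            cost, l = gdt_at(m, l1 + p)   # best placement of this part given l1
            total += cost
            locations.append(l)
        if best is None or total < best[0]:
            best = (total, tuple(locations))
    return best

# Toy model (hypothetical numbers): root plus two parts on 5 grid locations.
match_cost = [
    [4, 2, 0, 3, 5],   # root part (the "nose")
    [1, 6, 6, 2, 6],   # part ideally 2 cells to the left of the root
    [6, 6, 3, 6, 0],   # part ideally 2 cells to the right of the root
]
offsets = [0, -2, 2]

result = match_star(match_cost, offsets)
```

Once the root location is fixed, each remaining part is placed independently, which is why the sum of per-part transforms gives the same answer as a joint search over all parts.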
Matching results

[Figures: matching results, shown on two slides]

Summary


• Generic framework for part-based modeling.


• Global minimization for deformable objects can be fast.


• Soft detection avoids unnecessary early decisions.


• Partial occlusion is handled automatically.


