SlideShare uma empresa Scribd logo
1 de 1
Baixar para ler offline
Semantic Structure From Motion
                                                                                       Sid Yingze Bao and Silvio Savarese
     VISION LAB                                                      Electrical and Computer Engineering, University of Michigan at Ann Arbor
                                                                                              Source code and data: http://www.eecs.umich.edu/vision/projects/ssfm/index.html

Introduction                                                                                                        Model Overview                                                                                                        Results
Goal:                                                                                                                                                               Assumption:                                                                                                                                         Car                     Person
                                                                                                                    {O,Q,C}=arg max P(q,u,o|C,O,Q)                                                                                        Comparison Baselines
                                                                                                                                                                                                                                                                                                                                                                     Office

Estimate 3D location and pose of objects, 3D location                                                                                                               Given camera hypothesis, objects
                                                                                                                     =arg max P(q,u|C,Q)P(o|C,O)                                                                                          - Camera Pose Est.: Bundler [1]
of points, and camera parameters from 2 or more images.                                                                                                             and points are independent
                                                                                                                                                                                                                                          - Object Detection: LSVM [2]
                                                                                                                     Point Likelihood P(q,u|C,Q)
                                                                                                                                                                                                                                        1. Car Dataset [3] (available online)
                                                                                                                                                                                                                                           - Images and Dense Lidar Points
                                                                                                                                                                                                                                           - ~500 testing images in 10 scenarios
Motivation:                                                                                                           Object Likelihood P(o|C,O)                                                                                                 Cam. T. est. error v.s. baseline         3D object localization
                                                                                                                                                                                                                                                                               0.3
- Most 3D reconstruction methods do                                                                                                                                                                                                      40
                                                                                                                                                                                                                                                 e (degree)
                                                                                                                                                                                                                                                  T                           0.25
                                                                                                                                                                                                                                                                                    Recall
                                                                                                                       - Estimate 3D object likelihood by 2D                                                                                          Bundler                  0.2                          4 Cam
  not povide semantic information.                                                                                                                                                                                                                                      SSFM 0.15                           3 Cam
                                                                                                                         projection appearance:                                                                                          20
                                                                                                                                                                                                                                                                               0.1                          2 Cam
- Most recognition methds do not         Conventional                                                                                                                                                                                                                         0.05
                                                                                                                                                                                                                                                                                                            1 Cam
                                                                                                                                                                                                                                                                                           False Positive per Image
                                                                                                                                                                                                                                         0
  provide geometry and camera pose. Structure From Motion                                                                                                                                                                                    1            2           3     4
                                                                                                                                                                                                                                                                                0
                                                                                                                                                                                                                                                                                   0    2       4      6      8    10

- We propose to solve these two problem jointly.                                     2D object detection
Advantages:
- Improve camera pose estimation, compared to feature-point-based SFM.
- Improve object detections given multiple images, compared to independently
  detecting objects from each single images.
- Establish object correspondences across views.



SSFM Problem Formulation                                                                                                                                                                                                                2. Kinect Office Dataset (available online)                                           Cam. T. est. error v.s. baseline
                                                                                                                                                                                                                                                                                                                         20 e (degree)
                                                                                                                                                                                                                                                                                                                                                                 3D object localization


                                                                                                            Q
                                                                                                                                                                                                                                           - Images and calibrated Kinect 3D range data                                      T                                   Recall
                                                                                                                                                                                                                                                                                                                                                                          2 Cam
                                                                                                                                                                                                                                                                                                                                                                          1 Cam
                                                                                                                                                                                                                                                                                                                                            Bundler
 Measurements                                                                                                                                                                                                                              - Mouse, Monitor, and Keyboard                                                10

 - q: point features (e.g. DOG+SIFT)                                                                                                                                                                                                       - 500 images in 10 scenarios                                                   0
                                                                                                                                                                                                                                                                                                                                                     SSFM                 False Positive per Image
                                                                                             O                                                                                                                                                                                                                            0.4         0.6      0.8          1
 - u: point matches (e.g. threshold test)
 - o: 2D objects (e.g. [2])
                                                              o
                                                                                                                    Joint Likelihood Maximization
                                                                                                                     Main challenge:                                  image 1 object det.                   image 2 object det.
Model Parameters (unknowns)                                                                                          High dimensionality of unknowns =>
                                                                              q
                                                                                                                                                                                          azimuth=20


- C: camera (K is known)
                                                                                                                                                                                           zenith =10



                                               C
                                                                                                    o           q    Sample P(q,u,o|C,O,Q) with MCMC                                                                       azimuth=90
                                                                                                                                                                                                                            zenith =5

- Q: 3D points (locations)                                                                                                                                            azimuth=-20
                                                                                                                                                                       zenith =9
                                                                                                                                                                                                             azimuth=70


                                                                                                                    Parameter Initialization
                                                                                                                                                                                                              zenith =10


- O: 3D objects (locations, poses, categories)                                                                                                                                                                                           3. Person Dataset
                                                                                                        C           - Use object detection scale and pose to                                one of camera                                   - A pair of stereo cameras
 Intuition:                                                                                                           initialize cameras relative poses                                   pose initializations
                                                                                                                                                                                                                                            - 400 image pairs in 10 scenarios
 In addition to point features, measurements of objects across views provide add-                                   - Theorem: camera parameters can be
 itional geometrical constraints that allow to relate cameras and scene parameters.                                   estimated given:
                                                                                                                      i) 3 objects with scale; ii) 2 objects with
                                                                                                                      pose; iii) 1 object with scale and pose.
Reference                                                                                                           Monte Carlo Markov Chain
[1] N. Snavely, S. M. Seitz, and R. S. Szeliski. Modeling the world from internet photo collections. IJCV. 2008.    - Sampling starts from different initializations
[2] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained
                                                                                                                    - Proposal distribution P(q,u,o|C,O,Q)                                                                              Ackowledgement
     part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence of Pattern Analysis, 2009.                                                                   one of camera poses and objects initializations
[3] Gaurav Pandey, James McBride,,and Ryan Eustice,.Ford campus vision and lidar data set. International Journal    - Combine all samples to identify the maximum                                                                       We acknowledge the support of NSF CAREER #1054127 and the Gigascale Systems Research Center.
    of Robotics Research. 2011                                                                                                                                                                                                          We thank Mohit Bagra for collecting the Kinect dataset and Min Sun for helpful feedback.

Mais conteúdo relacionado

Destaque

Juventude amordaçada
Juventude amordaçadaJuventude amordaçada
Juventude amordaçadaCoragem2009
 
Pre production of music video
Pre production of music videoPre production of music video
Pre production of music videoy9_
 
Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016rosemere12
 
M A R R I A G E & M U T U A L B L O S S O M I N G D R
M A R R I A G E &  M U T U A L  B L O S S O M I N G   D RM A R R I A G E &  M U T U A L  B L O S S O M I N G   D R
M A R R I A G E & M U T U A L B L O S S O M I N G D Rsangh1212
 
Anvisa aprova anticoagulante de uso oral para tratar av cs
Anvisa aprova anticoagulante de uso oral para tratar av csAnvisa aprova anticoagulante de uso oral para tratar av cs
Anvisa aprova anticoagulante de uso oral para tratar av csMinistério da Saúde
 
Main Street Square Bierborse 2011
Main Street Square Bierborse 2011Main Street Square Bierborse 2011
Main Street Square Bierborse 2011hoonio
 
Beware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas KashalikarBeware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas Kashalikarsangh1212
 
Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007Elisio Estanque
 
Donnie darko 9 shot
Donnie darko 9 shotDonnie darko 9 shot
Donnie darko 9 shotJack Murphy
 

Destaque (19)

NANOBOY
NANOBOYNANOBOY
NANOBOY
 
Juventude amordaçada
Juventude amordaçadaJuventude amordaçada
Juventude amordaçada
 
MOOCafé ABP
MOOCafé ABP MOOCafé ABP
MOOCafé ABP
 
Pre production of music video
Pre production of music videoPre production of music video
Pre production of music video
 
Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016Curso de idiomas globo inglês livro 016
Curso de idiomas globo inglês livro 016
 
M A R R I A G E & M U T U A L B L O S S O M I N G D R
M A R R I A G E &  M U T U A L  B L O S S O M I N G   D RM A R R I A G E &  M U T U A L  B L O S S O M I N G   D R
M A R R I A G E & M U T U A L B L O S S O M I N G D R
 
The grand waltz 1º movement
The grand waltz 1º movementThe grand waltz 1º movement
The grand waltz 1º movement
 
Unidad2 actividad2
Unidad2 actividad2Unidad2 actividad2
Unidad2 actividad2
 
Anvisa aprova anticoagulante de uso oral para tratar av cs
Anvisa aprova anticoagulante de uso oral para tratar av csAnvisa aprova anticoagulante de uso oral para tratar av cs
Anvisa aprova anticoagulante de uso oral para tratar av cs
 
Main Street Square Bierborse 2011
Main Street Square Bierborse 2011Main Street Square Bierborse 2011
Main Street Square Bierborse 2011
 
Beware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas KashalikarBeware Of Meditation Dr Shriniwas Kashalikar
Beware Of Meditation Dr Shriniwas Kashalikar
 
Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007Opiniao 2b cultura popular (2), 2007
Opiniao 2b cultura popular (2), 2007
 
2 156 l'anello mancante1 copia
2 156 l'anello mancante1 copia2 156 l'anello mancante1 copia
2 156 l'anello mancante1 copia
 
El cuento 2
El cuento 2El cuento 2
El cuento 2
 
Comunidad del aprendizaje
Comunidad del aprendizajeComunidad del aprendizaje
Comunidad del aprendizaje
 
iPad productivity usage 101 (basics)
iPad productivity usage 101 (basics)iPad productivity usage 101 (basics)
iPad productivity usage 101 (basics)
 
Donnie darko 9 shot
Donnie darko 9 shotDonnie darko 9 shot
Donnie darko 9 shot
 
Debit card loans
Debit card loansDebit card loans
Debit card loans
 
Antonio machado
Antonio machadoAntonio machado
Antonio machado
 

Semelhante a Fcv poster bao

Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...c.choi
 
Vehicle tracking and distance estimation based on multiple image features
Vehicle tracking and distance estimation based on multiple image featuresVehicle tracking and distance estimation based on multiple image features
Vehicle tracking and distance estimation based on multiple image featuresYixin Chen
 
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...Big Data Week São Paulo
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
An integrated framework for 3 d modeling, object detection, and pose estimati...
An integrated framework for 3 d modeling, object detection, and pose estimati...An integrated framework for 3 d modeling, object detection, and pose estimati...
An integrated framework for 3 d modeling, object detection, and pose estimati...I3E Technologies
 
A Novel Approach to Image Denoising and Image in Painting
A Novel Approach to Image Denoising and Image in PaintingA Novel Approach to Image Denoising and Image in Painting
A Novel Approach to Image Denoising and Image in PaintingEswar Publications
 
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an ObjectAnkur Tyagi
 
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...Seiya Ito
 
Pose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningPose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningYu Huang
 
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGEOBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGEIJCSEA Journal
 
Object Detection for Service Robot Using Range and Color Features of an Image
Object Detection for Service Robot Using Range and Color Features of an ImageObject Detection for Service Robot Using Range and Color Features of an Image
Object Detection for Service Robot Using Range and Color Features of an ImageIJCSEA Journal
 
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed ObjectsFCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed ObjectsKusano Hitoshi
 
Object class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunalObject class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunalKunal Kishor Nirala
 

Semelhante a Fcv poster bao (20)

Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...
 
Vehicle tracking and distance estimation based on multiple image features
Vehicle tracking and distance estimation based on multiple image featuresVehicle tracking and distance estimation based on multiple image features
Vehicle tracking and distance estimation based on multiple image features
 
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
Sketch-Finder: uma abordagem para recuperação efetiva e eficiente de imagens ...
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
An integrated framework for 3 d modeling, object detection, and pose estimati...
An integrated framework for 3 d modeling, object detection, and pose estimati...An integrated framework for 3 d modeling, object detection, and pose estimati...
An integrated framework for 3 d modeling, object detection, and pose estimati...
 
final_presentation
final_presentationfinal_presentation
final_presentation
 
p927-chang
p927-changp927-chang
p927-chang
 
sduGroupEvent
sduGroupEventsduGroupEvent
sduGroupEvent
 
A Novel Approach to Image Denoising and Image in Painting
A Novel Approach to Image Denoising and Image in PaintingA Novel Approach to Image Denoising and Image in Painting
A Novel Approach to Image Denoising and Image in Painting
 
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
 
PCA and SVD in brief
PCA and SVD in briefPCA and SVD in brief
PCA and SVD in brief
 
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
 
Real-Time Face Tracking with GPU Acceleration
Real-Time Face Tracking with GPU AccelerationReal-Time Face Tracking with GPU Acceleration
Real-Time Face Tracking with GPU Acceleration
 
CUDA Accelerated Face Recognition
CUDA Accelerated Face RecognitionCUDA Accelerated Face Recognition
CUDA Accelerated Face Recognition
 
Pose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningPose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learning
 
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGEOBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
OBJECT DETECTION FOR SERVICE ROBOT USING RANGE AND COLOR FEATURES OF AN IMAGE
 
Object Detection for Service Robot Using Range and Color Features of an Image
Object Detection for Service Robot Using Range and Color Features of an ImageObject Detection for Service Robot Using Range and Color Features of an Image
Object Detection for Service Robot Using Range and Color Features of an Image
 
ei2106-submit-opt-415
ei2106-submit-opt-415ei2106-submit-opt-415
ei2106-submit-opt-415
 
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed ObjectsFCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
 
Object class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunalObject class recognition by unsupervide scale invariant learning - kunal
Object class recognition by unsupervide scale invariant learning - kunal
 

Mais de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featureszukun
 

Mais de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant features
 

Último

1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseAnaAcapella
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 

Último (20)

1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 

Fcv poster bao

  • 1. Semantic Structure From Motion Sid Yingze Bao and Silvio Savarese VISION LAB Electrical and Computer Engineering, University of Michigan at Ann Arbor Source code and data: http://www.eecs.umich.edu/vision/projects/ssfm/index.html Introduction Model Overview Results Goal: Assumption: Car Person {O,Q,C}=arg max P(q,u,o|C,O,Q) Comparison Baselines Office Estimate 3D location and pose of objects, 3D location Given camera hypothesis, objects =arg max P(q,u|C,Q)P(o|C,O) - Camera Pose Est.: Bundler [1] of points, and camera parameters from 2 or more images. and points are independent - Object Detection: LSVM [2] Point Likelihood P(q,u|C,Q) 1. Car Dataset [3] (available online) - Images and Dense Lidar Points - ~500 testing images in 10 scenarios Motivation: Object Likelihood P(o|C,O) Cam. T. est. error v.s. baseline 3D object localization 0.3 - Most 3D reconstruction methods do 40 e (degree) T 0.25 Recall - Estimate 3D object likelihood by 2D Bundler 0.2 4 Cam not povide semantic information. SSFM 0.15 3 Cam projection appearance: 20 0.1 2 Cam - Most recognition methds do not Conventional 0.05 1 Cam False Positive per Image 0 provide geometry and camera pose. Structure From Motion 1 2 3 4 0 0 2 4 6 8 10 - We propose to solve these two problem jointly. 2D object detection Advantages: - Improve camera pose estimation, compared to feature-point-based SFM. - Improve object detections given multiple images, compared to independently detecting objects from each single images. - Establish object correspondences across views. SSFM Problem Formulation 2. Kinect Office Dataset (available online) Cam. T. est. error v.s. baseline 20 e (degree) 3D object localization Q - Images and calibrated Kinect 3D range data T Recall 2 Cam 1 Cam Bundler Measurements - Mouse, Monitor, and Keyboard 10 - q: point features (e.g. DOG+SIFT) - 500 images in 10 scenarios 0 SSFM False Positive per Image O 0.4 0.6 0.8 1 - u: point matches (e.g. threshold test) - o: 2D objects (e.g. [2]) o Joint Likelihood Maximization Main challenge: image 1 object det. image 2 object det. Model Parameters (unknowns) High dimensionality of unknowns => q azimuth=20 - C: camera (K is known) zenith =10 C o q Sample P(q,u,o|C,O,Q) with MCMC azimuth=90 zenith =5 - Q: 3D points (locations) azimuth=-20 zenith =9 azimuth=70 Parameter Initialization zenith =10 - O: 3D objects (locations, poses, categories) 3. Person Dataset C - Use object detection scale and pose to one of camera - A pair of stereo cameras Intuition: initialize cameras relative poses pose initializations - 400 image pairs in 10 scenarios In addition to point features, measurements of objects across views provide add- - Theorem: camera parameters can be itional geometrical constraints that allow to relate cameras and scene parameters. estimated given: i) 3 objects with scale; ii) 2 objects with pose; iii) 1 object with scale and pose. Reference Monte Carlo Markov Chain [1] N. Snavely, S. M. Seitz, and R. S. Szeliski. Modeling the world from internet photo collections. IJCV. 2008. - Sampling starts from different initializations [2] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained - Proposal distribution P(q,u,o|C,O,Q) Ackowledgement part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence of Pattern Analysis, 2009. one of camera poses and objects initializations [3] Gaurav Pandey, James McBride,,and Ryan Eustice,.Ford campus vision and lidar data set. International Journal - Combine all samples to identify the maximum We acknowledge the support of NSF CAREER #1054127 and the Gigascale Systems Research Center. of Robotics Research. 2011 We thank Mohit Bagra for collecting the Kinect dataset and Min Sun for helpful feedback.