SlideShare uma empresa Scribd logo
1 de 30
Baixar para ler offline
Learning	
  Spa+otemporal	
  Graphs	
  of	
  
           Human	
  Ac+vi+es	
  

   William	
  Brendel	
     Sinisa	
  Todorovic	
  
Our Goal
     Long Jump              Triple Jump




•  Recognize all occurrences of activities
•  Identify the start and end frames
•  Parse the video and find all subactivities
•  Localize actors and objects involved
Weakly Supervised Setting
           Weight Lifting                          Large-Box Lifting




In	
  training:	
  
	
  	
  	
  	
  >	
  ONLY	
  class	
  labels	
  
	
  	
  


Domain	
  knowledge	
  of	
  temporal	
  structure:	
  
	
  	
  	
  	
  >	
  NOT	
  AVAILABLE	
  
Learning What and How

               Weak	
  supervision	
  in	
  training	
  
	
  
	
  
          Need	
  to	
  learn	
  from	
  training	
  videos:	
  
	
  
          What	
  ac+vity	
  parts	
  are	
  relevant	
  
	
  
       How	
  relevant	
  they	
  are	
  for	
  recogni+on	
  
Prior Work vs. Our Approach
Typically, focus
 only on HOW
semantic level


    model
                   gap

   features

  raw video
Prior Work vs. Our Approach

 Typically...
semantic level         semantic level

                          model
   model
                 gap     mid-level
                         features
   features

  raw video              raw video
Prior Work – Video Representation

       •  Space-­‐+me	
  points	
  
         –	
  Laptev	
  &	
  Schmid	
  08,	
  Niebles	
  &	
  Fei-­‐Fei	
  08,	
  …	
  


       •  S+ll	
  human	
  postures	
  
         –	
  SoaLo	
  07,	
  Ning	
  &	
  Huang	
  08,	
  …	
  


       •  Ac+on	
  templates	
  
         –	
  Yao	
  &	
  Zhu	
  09,	
  …	
  


       •  Point	
  tracks	
  
         –	
  Sukthankar	
  &	
  Hebert	
  10,	
  …	
  
	
  
Our Features: 2D+t Tubes


       •  Allow	
  simpler:	
  
           -­‐	
  Modeling	
  
           -­‐	
  Learning	
  (few	
  examples)	
  	
        Sukthankar & Hebert 07,

           -­‐	
  Inference	
                                  Gorelick & Irani 08,
                                                               Pritch & Peleg 08, ...

	
  
	
  
       •  We	
  are	
  the	
  first	
  to	
  use	
  2D+t	
  tubes	
  for	
  
          building	
  a	
  sta+s+cal	
  model	
  of	
  ac+vi+es	
  
Our Features: 2D+t Tubes


       •  Allow	
  simpler:	
  
           -­‐	
  Modeling	
  
           -­‐	
  Learning	
  (few	
  examples)	
  	
     Sukthankar & Hebert 07,

           -­‐	
  Inference	
                               Gorelick & Irani 08,
                                                           Pritch & Peleg 08, ...

	
  
	
  
       •  We	
  use	
  2D+t	
  tubes	
  for	
  building	
  a	
  sta+s+cal	
  
          genera+ve	
  model	
  of	
  ac+vi+es	
  
Prior Work – Activity Representation
       •  Graphical	
  models,	
  Grammars	
  
        -­‐	
  Ivanov	
  &	
  Bobick	
  00	
  
        -­‐	
  Xiang	
  &	
  Gong	
  06	
  
        -­‐	
  Ryoo	
  &	
  Aggawal	
  09	
  
        -­‐	
  Gupta	
  &	
  Davis	
  09	
  
        -­‐	
  Liu	
  &	
  Zhu	
  09	
  
        -­‐	
  Niebles	
  &	
  Fei-­‐Fei	
  10	
  
        -­‐	
  Lan	
  et	
  al.	
  11	
  	
  
	
  

       •  Probabilis+c	
  first-­‐order	
  logic	
  
        -­‐	
  Tran	
  &	
  Davis	
  08	
  
        -­‐	
  Albanese	
  et	
  al.	
  10	
  
        -­‐	
  Morariu	
  &	
  Davis	
  11	
  
        -­‐	
  Brendel	
  et	
  al.	
  11...	
  
Approach




	
  	
  	
  	
  Input	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Spa+otemporal	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Ac+vity	
  	
  	
  	
  	
  	
  	
  	
  Recogni+on	
  
	
  	
  	
  	
  Video	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Graph	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Model	
  	
  	
  	
  	
  	
  	
  	
  	
  Localiza+on	
  
Blocky Video Segmentation
Activity as a Spatiotemporal Graph
            Descriptors of nodes and edges:

            •  Node descriptors:       F
                 - Motion
                 - Object shape

            •  Adjacency Matrices:     {Ai}
                 - Allen temporal relations
                 - Spatial relations
                 - Compositional relations
Activity as Segmentation Graph

          G = (V, E, "descriptors")

            = (F, {A1, ..., An})
           node descriptors

                   adjacency matrices
                   of distinct relations
                   between the tubes
Activity Graph Model
             Probabilistic Graph Mixture



 model node descriptors           mixture weights

           model adjacency matrices
compositional       spatial            temporal

  *         +      *          +         *
Activity Model
   An	
  ac+vity	
  instance: G = (F, {A1,..., An})




Model adjacency matrices
Edge type: i =1, 2,..., n
Activity Model
   An	
  ac+vity	
  instance: G = (F, {A1,..., An})




Model adjacency matrices            Model matrix of
                                   node descriptors
Edge type: i =1, 2,..., n
Inference




	
  	
  	
  	
  Input	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Spa+otemporal	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Ac+vity	
  	
  	
  	
  	
  	
  	
  	
  Recogni+on	
  
	
  	
  	
  	
  Video	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Graph	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Model	
  	
  	
  	
  	
  	
  	
  	
  	
  Localiza+on	
  
Inference = Robust Least Squares
   Goal:	
  	
  
     • For	
  every	
  ac+vity	
  model	
  
     • Es+mate	
  the	
  permuta+on	
  matrix	
  




subject to
Learning the Activity Graph Model




Training	
  videos	
  →	
  Training	
  graphs	
  →	
  Graph	
  model	
  	
  
	
  
Learning

 Given K training graphs,


	
  	
  	
  	
  	
  	
  	
  Adjacency	
  matrix	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Node	
  descriptor	
  



  Edge type: i =1, 2,..., n
Learning

Given K training graphs,                                                                     ESTIMATE

	
  	
  	
  	
  	
  	
  	
  Adjacency	
  matrix	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Node	
  descriptor	
  




                                         Model parameters
Learning

Given K training graphs,                                                                     ESTIMATE

	
  	
  	
  	
  	
  	
  	
  Adjacency	
  matrix	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  Node	
  descriptor	
  




   Permutation matrix
Learning = Robust Least Squares
Given	
  K	
  
Training	
  
graphs:	
  

Es4mate:	
           and	
  
Learning = Structural EM

E-step à expected   M-step à matching of the
 model structure     training graphs and model




  Estimatation of          Estimation of
 model parametrs        permutation matrices
Learning Results




       Correctly	
  learned	
  ac+vity-­‐characteris+c	
  tubes	
  
	
  
Recognition and Segmentation




                    Ac+vity	
  “handshaking”	
  
       Detected	
  and	
  segmented	
  characteris+c	
  tube	
  
	
  
Recognition and Segmentation




                        Ac+vity	
  “kicking”	
  
       Detected	
  and	
  segmented	
  characteris+c	
  tube	
  
	
  
Classification on UTexas Dataset




     Human	
  interac+on	
  ac+vi+es	
  
             [18]	
  Ryoo	
  et	
  al.	
  ’10	
  
Conclusion

•  Fast spatiotemporal segmentation

•  New activity representation = graph model

•  Unified learning and inference = Least squares

•  Learning under weak supervision:

     - WHAT activity parts are relevant and

     - HOW relevant they are for recognition

Mais conteúdo relacionado

Mais procurados

Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Linear circuit and superposition
Linear circuit and superpositionLinear circuit and superposition
Linear circuit and superpositionlipschitzembed
 
Lecture11
Lecture11Lecture11
Lecture11Bo Li
 
M Gumbel - SCABIO: a framework for bioinformatics algorithms in Scala
M Gumbel - SCABIO: a framework for bioinformatics algorithms in ScalaM Gumbel - SCABIO: a framework for bioinformatics algorithms in Scala
M Gumbel - SCABIO: a framework for bioinformatics algorithms in ScalaJan Aerts
 

Mais procurados (6)

Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Gc.d day
Gc.d dayGc.d day
Gc.d day
 
Linear circuit and superposition
Linear circuit and superpositionLinear circuit and superposition
Linear circuit and superposition
 
Lecture11
Lecture11Lecture11
Lecture11
 
Ben Gal
Ben Gal Ben Gal
Ben Gal
 
M Gumbel - SCABIO: a framework for bioinformatics algorithms in Scala
M Gumbel - SCABIO: a framework for bioinformatics algorithms in ScalaM Gumbel - SCABIO: a framework for bioinformatics algorithms in Scala
M Gumbel - SCABIO: a framework for bioinformatics algorithms in Scala
 

Destaque

Review paper human activity analysis
Review paper human activity analysisReview paper human activity analysis
Review paper human activity analysisIftikhar Alam
 
Business, Nature & Purpose By Ms. Bindu Dewan
Business, Nature & Purpose By Ms. Bindu DewanBusiness, Nature & Purpose By Ms. Bindu Dewan
Business, Nature & Purpose By Ms. Bindu Dewankulachihansraj
 
nature and purpose of business.
nature and purpose of business.nature and purpose of business.
nature and purpose of business.Sruthy Ajith
 
Mkt#210 lecture 2 factors affecting entrepreneurship development
Mkt#210 lecture 2 factors affecting entrepreneurship developmentMkt#210 lecture 2 factors affecting entrepreneurship development
Mkt#210 lecture 2 factors affecting entrepreneurship developmentKawser Ahmad Sohan
 
Professional ethics presentation
Professional ethics presentationProfessional ethics presentation
Professional ethics presentationSkillet Tony
 
Business Environment- Features,Meaning,Importance,Objectives & Porter's Model
Business Environment- Features,Meaning,Importance,Objectives & Porter's Model Business Environment- Features,Meaning,Importance,Objectives & Porter's Model
Business Environment- Features,Meaning,Importance,Objectives & Porter's Model Nikhil Soares
 

Destaque (7)

Review paper human activity analysis
Review paper human activity analysisReview paper human activity analysis
Review paper human activity analysis
 
Business, Nature & Purpose By Ms. Bindu Dewan
Business, Nature & Purpose By Ms. Bindu DewanBusiness, Nature & Purpose By Ms. Bindu Dewan
Business, Nature & Purpose By Ms. Bindu Dewan
 
nature and purpose of business.
nature and purpose of business.nature and purpose of business.
nature and purpose of business.
 
Mkt#210 lecture 2 factors affecting entrepreneurship development
Mkt#210 lecture 2 factors affecting entrepreneurship developmentMkt#210 lecture 2 factors affecting entrepreneurship development
Mkt#210 lecture 2 factors affecting entrepreneurship development
 
How do people destroy natural resources
How do people destroy natural resourcesHow do people destroy natural resources
How do people destroy natural resources
 
Professional ethics presentation
Professional ethics presentationProfessional ethics presentation
Professional ethics presentation
 
Business Environment- Features,Meaning,Importance,Objectives & Porter's Model
Business Environment- Features,Meaning,Importance,Objectives & Porter's Model Business Environment- Features,Meaning,Importance,Objectives & Porter's Model
Business Environment- Features,Meaning,Importance,Objectives & Porter's Model
 

Semelhante a Iccv2011 learning spatiotemporal graphs of human activities

Regression: A skin-deep dive
Regression: A skin-deep diveRegression: A skin-deep dive
Regression: A skin-deep diveabulyomon
 
Python image processing_Python image processing.pptx
Python image processing_Python image processing.pptxPython image processing_Python image processing.pptx
Python image processing_Python image processing.pptxshashikant484397
 
Mathematical Modeling for Practical Problems
Mathematical Modeling for Practical ProblemsMathematical Modeling for Practical Problems
Mathematical Modeling for Practical ProblemsLiwei Ren任力偉
 
JS Responsibilities
JS ResponsibilitiesJS Responsibilities
JS ResponsibilitiesBrendan Eich
 
Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)Stefan Urbanek
 
Aj2418721874
Aj2418721874Aj2418721874
Aj2418721874IJMER
 
6 large-scale-learning.pptx
6 large-scale-learning.pptx6 large-scale-learning.pptx
6 large-scale-learning.pptxmustafa sarac
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Finite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy LrFinite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy Lrguesta32562
 
5 character classifiers
5 character classifiers5 character classifiers
5 character classifiersSolin TEM
 
Continuous Evaluation of Deployed Models in Production Many high-tech industr...
Continuous Evaluation of Deployed Models in Production Many high-tech industr...Continuous Evaluation of Deployed Models in Production Many high-tech industr...
Continuous Evaluation of Deployed Models in Production Many high-tech industr...Databricks
 
MathWorks Interview Lecture
MathWorks Interview LectureMathWorks Interview Lecture
MathWorks Interview LectureJohn Yates
 
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basicsnpinto
 
Functional solid
Functional solidFunctional solid
Functional solidMatt Stine
 
Machine Learning for objective QoE assessment: Science, Myths and a look to t...
Machine Learning for objective QoE assessment: Science, Myths and a look to t...Machine Learning for objective QoE assessment: Science, Myths and a look to t...
Machine Learning for objective QoE assessment: Science, Myths and a look to t...Förderverein Technische Fakultät
 

Semelhante a Iccv2011 learning spatiotemporal graphs of human activities (20)

Regression: A skin-deep dive
Regression: A skin-deep diveRegression: A skin-deep dive
Regression: A skin-deep dive
 
MATLAB & Image Processing
MATLAB & Image ProcessingMATLAB & Image Processing
MATLAB & Image Processing
 
Python image processing_Python image processing.pptx
Python image processing_Python image processing.pptxPython image processing_Python image processing.pptx
Python image processing_Python image processing.pptx
 
Mathematical Modeling for Practical Problems
Mathematical Modeling for Practical ProblemsMathematical Modeling for Practical Problems
Mathematical Modeling for Practical Problems
 
JS Responsibilities
JS ResponsibilitiesJS Responsibilities
JS Responsibilities
 
Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)Python business intelligence (PyData 2012 talk)
Python business intelligence (PyData 2012 talk)
 
Aj2418721874
Aj2418721874Aj2418721874
Aj2418721874
 
6 large-scale-learning.pptx
6 large-scale-learning.pptx6 large-scale-learning.pptx
6 large-scale-learning.pptx
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Finite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy LrFinite Element Analysis Made Easy Lr
Finite Element Analysis Made Easy Lr
 
Ds & ada
Ds & adaDs & ada
Ds & ada
 
Workshop Mock-Ups
Workshop Mock-UpsWorkshop Mock-Ups
Workshop Mock-Ups
 
tutorial.ppt
tutorial.ppttutorial.ppt
tutorial.ppt
 
5 character classifiers
5 character classifiers5 character classifiers
5 character classifiers
 
Continuous Evaluation of Deployed Models in Production Many high-tech industr...
Continuous Evaluation of Deployed Models in Production Many high-tech industr...Continuous Evaluation of Deployed Models in Production Many high-tech industr...
Continuous Evaluation of Deployed Models in Production Many high-tech industr...
 
Av Recognition
Av RecognitionAv Recognition
Av Recognition
 
MathWorks Interview Lecture
MathWorks Interview LectureMathWorks Interview Lecture
MathWorks Interview Lecture
 
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
[Harvard CS264] 03 - Introduction to GPU Computing, CUDA Basics
 
Functional solid
Functional solidFunctional solid
Functional solid
 
Machine Learning for objective QoE assessment: Science, Myths and a look to t...
Machine Learning for objective QoE assessment: Science, Myths and a look to t...Machine Learning for objective QoE assessment: Science, Myths and a look to t...
Machine Learning for objective QoE assessment: Science, Myths and a look to t...
 

Mais de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featureszukun
 
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...zukun
 

Mais de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant features
 
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
 

Último

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 

Último (20)

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 

Iccv2011 learning spatiotemporal graphs of human activities

  • 1. Learning  Spa+otemporal  Graphs  of   Human  Ac+vi+es   William  Brendel   Sinisa  Todorovic  
  • 2. Our Goal Long Jump Triple Jump •  Recognize all occurrences of activities •  Identify the start and end frames •  Parse the video and find all subactivities •  Localize actors and objects involved
  • 3. Weakly Supervised Setting Weight Lifting Large-Box Lifting In  training:          >  ONLY  class  labels       Domain  knowledge  of  temporal  structure:          >  NOT  AVAILABLE  
  • 4. Learning What and How Weak  supervision  in  training       Need  to  learn  from  training  videos:     What  ac+vity  parts  are  relevant     How  relevant  they  are  for  recogni+on  
  • 5. Prior Work vs. Our Approach Typically, focus only on HOW semantic level model gap features raw video
  • 6. Prior Work vs. Our Approach Typically... semantic level semantic level model model gap mid-level features features raw video raw video
  • 7. Prior Work – Video Representation •  Space-­‐+me  points   –  Laptev  &  Schmid  08,  Niebles  &  Fei-­‐Fei  08,  …   •  S+ll  human  postures   –  SoaLo  07,  Ning  &  Huang  08,  …   •  Ac+on  templates   –  Yao  &  Zhu  09,  …   •  Point  tracks   –  Sukthankar  &  Hebert  10,  …    
  • 8. Our Features: 2D+t Tubes •  Allow  simpler:   -­‐  Modeling   -­‐  Learning  (few  examples)     Sukthankar & Hebert 07, -­‐  Inference   Gorelick & Irani 08, Pritch & Peleg 08, ...     •  We  are  the  first  to  use  2D+t  tubes  for   building  a  sta+s+cal  model  of  ac+vi+es  
  • 9. Our Features: 2D+t Tubes •  Allow  simpler:   -­‐  Modeling   -­‐  Learning  (few  examples)     Sukthankar & Hebert 07, -­‐  Inference   Gorelick & Irani 08, Pritch & Peleg 08, ...     •  We  use  2D+t  tubes  for  building  a  sta+s+cal   genera+ve  model  of  ac+vi+es  
  • 10. Prior Work – Activity Representation •  Graphical  models,  Grammars   -­‐  Ivanov  &  Bobick  00   -­‐  Xiang  &  Gong  06   -­‐  Ryoo  &  Aggawal  09   -­‐  Gupta  &  Davis  09   -­‐  Liu  &  Zhu  09   -­‐  Niebles  &  Fei-­‐Fei  10   -­‐  Lan  et  al.  11       •  Probabilis+c  first-­‐order  logic   -­‐  Tran  &  Davis  08   -­‐  Albanese  et  al.  10   -­‐  Morariu  &  Davis  11   -­‐  Brendel  et  al.  11...  
  • 11. Approach        Input                                    Spa+otemporal                                              Ac+vity                Recogni+on          Video                                                    Graph                                                                Model                  Localiza+on  
  • 13. Activity as a Spatiotemporal Graph Descriptors of nodes and edges: •  Node descriptors: F - Motion - Object shape •  Adjacency Matrices: {Ai} - Allen temporal relations - Spatial relations - Compositional relations
  • 14. Activity as Segmentation Graph G = (V, E, "descriptors") = (F, {A1, ..., An}) node descriptors adjacency matrices of distinct relations between the tubes
  • 15. Activity Graph Model Probabilistic Graph Mixture model node descriptors mixture weights model adjacency matrices compositional spatial temporal * + * + *
  • 16. Activity Model An  ac+vity  instance: G = (F, {A1,..., An}) Model adjacency matrices Edge type: i =1, 2,..., n
  • 17. Activity Model An  ac+vity  instance: G = (F, {A1,..., An}) Model adjacency matrices Model matrix of node descriptors Edge type: i =1, 2,..., n
  • 18. Inference        Input                                    Spa+otemporal                                              Ac+vity                Recogni+on          Video                                                    Graph                                                                Model                  Localiza+on  
  • 19. Inference = Robust Least Squares Goal:     • For  every  ac+vity  model   • Es+mate  the  permuta+on  matrix   subject to
  • 20. Learning the Activity Graph Model Training  videos  →  Training  graphs  →  Graph  model      
  • 21. Learning Given K training graphs,              Adjacency  matrix                            Node  descriptor   Edge type: i =1, 2,..., n
  • 22. Learning Given K training graphs, ESTIMATE              Adjacency  matrix                                Node  descriptor   Model parameters
  • 23. Learning Given K training graphs, ESTIMATE              Adjacency  matrix                                Node  descriptor   Permutation matrix
  • 24. Learning = Robust Least Squares Given  K   Training   graphs:   Es4mate:   and  
  • 25. Learning = Structural EM E-step à expected M-step à matching of the model structure training graphs and model Estimatation of Estimation of model parametrs permutation matrices
  • 26. Learning Results Correctly  learned  ac+vity-­‐characteris+c  tubes    
  • 27. Recognition and Segmentation Ac+vity  “handshaking”   Detected  and  segmented  characteris+c  tube    
  • 28. Recognition and Segmentation Ac+vity  “kicking”   Detected  and  segmented  characteris+c  tube    
  • 29. Classification on UTexas Dataset Human  interac+on  ac+vi+es   [18]  Ryoo  et  al.  ’10  
  • 30. Conclusion •  Fast spatiotemporal segmentation •  New activity representation = graph model •  Unified learning and inference = Least squares •  Learning under weak supervision: - WHAT activity parts are relevant and - HOW relevant they are for recognition