Frontiers of Human Activity Analysis

J. K. Aggarwal
Michael S. Ryoo
Kris M. Kitani
Applications and challenges
Object recognition applications
• Applications in practice
  - Google image search
    Example: a query image (koala.jpg) may come with a descriptive caption ("This is a koala in the Australian zoo, …") or with no label at all.
Object recognition applications
• Object recognition application in practice
  - Pedestrian (human) detection
    Vehicle safety – Volvo automatic brake
Video analysis applications
• We are living in an age where computer vision applications work in practice.
  - Surveillance applications
    Example: Siemens pedestrian tracking
Detection of Object Abandonment
• Activity recognition via reasoning over temporal logic
  - Description-based
    [Figure: temporal intervals I, II, III of the abandonment event and the likelihood computed per frame]
    (a toy sketch of this kind of interval reasoning follows below)

Bhargava, Chen, Ryoo, Aggarwal, AVSS 2007
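A minimal, hedged sketch of such interval reasoning (not the AVSS 2007 system itself): abandonment is flagged when an object track begins while a person is present and the object persists after the person's track ends. The `Interval` class, the rule, and the `min_gap` threshold are illustrative assumptions.

```python
# Toy interval-logic rule for object abandonment; names and threshold are
# illustrative, not the authors' formulation.
from dataclasses import dataclass

@dataclass
class Interval:
    start: int  # first frame of the track
    end: int    # last frame of the track

def abandonment(obj: Interval, person: Interval, min_gap: int = 150) -> bool:
    """Flag abandonment when the object appears while the person is present
    (its start falls inside the person interval, in Allen's sense) and the
    object track outlives the person track by at least `min_gap` frames."""
    appeared_with_person = person.start <= obj.start <= person.end
    left_behind = obj.end - person.end >= min_gap
    return appeared_with_person and left_behind

# Toy usage: a person is tracked over frames 0-300; a bag appears at frame 200
# and is still there at frame 800 -> abandonment.
print(abandonment(obj=Interval(200, 800), person=Interval(0, 300)))  # True
```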
Illegally parked car detection
• Foregrounds
• Moving vehicles
• Staying vehicles
• Illegally parked vehicles
  (a minimal background-subtraction sketch of this pipeline follows below)

Lee, Ryoo, Riley, Aggarwal, AVSS 2007
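A minimal sketch of such a foreground / moving / staying pipeline, assuming OpenCV's MOG2 background subtractor and a per-pixel "staying" counter; the video filename, thresholds, and stationarity rule are illustrative, and this is not the AVSS 2007 method. In practice a no-parking-zone mask would then be intersected with the staying regions to decide which staying vehicles are illegally parked.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("street.mp4")                 # hypothetical input clip
backsub = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)
stay = None                                          # per-pixel count of consecutive foreground frames
FPS, STAY_SECONDS = 25, 60                           # flag regions staying ~60 s

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Foreground mask; a small learning rate keeps stopped vehicles from being
    # absorbed into the background model too quickly.
    fg = backsub.apply(frame, learningRate=0.001)
    fg = (fg == 255).astype(np.uint8)                # 255 = foreground, 127 = shadow
    if stay is None:
        stay = np.zeros(fg.shape, dtype=np.int32)
    stay = (stay + fg) * fg                          # reset wherever background reappears
    staying = ((stay > FPS * STAY_SECONDS) * 255).astype(np.uint8)
    contours, _ = cv2.findContours(staying, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h > 400:                              # ignore small blobs (noise, pedestrians)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.imshow("staying vehicles", frame)
    if cv2.waitKey(1) == 27:                         # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```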
Human-vehicle interactions
• Retrieval of videos involving humans and vehicles
  - Event-based analysis
  - Surveillance
  - Military systems
• Scene state analysis
  - Detailed description of the scene sequence
    Example: four persons going into a car and coming out (person locations, door opening, seating, …)

Ryoo+, Lee+, and Aggarwal, ACM CIVR 2010
Aerial vehicles
• Unmanned aerial vehicles (UAVs)
  - Automated understanding of aerial images
    Recognition of military activities
    Example: carrying, digging, …
  - Challenge: visual cues are vague because of the low resolution

Chen and Aggarwal, WMVC 2009
Detecting Persons Climbing
Pipeline: human contour segmentation → extracting extreme points → detecting stable contacts → constructing primitive intervals
[Figure: signal plot over roughly 300 frames (y range 0–4)]

A climbing sequence is decomposed into segments based on primitive intervals formed from stable contacts; recognition is achieved by searching with an HMM (a toy Viterbi sketch follows below).

Yu and Aggarwal, 2007
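A toy sketch of the HMM search step, assuming the primitive intervals have already been converted to discrete observation symbols; the states, symbols, and probability tables are illustrative placeholders, not the model of Yu and Aggarwal.

```python
import numpy as np

# Illustrative states, observation symbols, and probabilities for climbing.
states = ["on_ground", "reaching", "pulling_up"]
symbols = {"low_contact": 0, "high_contact": 1, "no_contact": 2}

start = np.array([0.80, 0.15, 0.05])
trans = np.array([[0.70, 0.25, 0.05],
                  [0.10, 0.60, 0.30],
                  [0.05, 0.25, 0.70]])
emit = np.array([[0.70, 0.20, 0.10],
                 [0.20, 0.60, 0.20],
                 [0.10, 0.50, 0.40]])

def viterbi(obs):
    """Most likely state sequence for a list of observation symbol indices."""
    T, N = len(obs), len(states)
    delta = np.zeros((T, N))            # best log-score ending in each state
    back = np.zeros((T, N), dtype=int)  # backpointers
    delta[0] = np.log(start) + np.log(emit[:, obs[0]])
    for t in range(1, T):
        scores = delta[t - 1][:, None] + np.log(trans)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + np.log(emit[:, obs[t]])
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [states[i] for i in reversed(path)]

obs = [symbols[s] for s in ["low_contact", "high_contact", "no_contact", "high_contact"]]
print(viterbi(obs))
```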
Human Computer Interactions
• Intelligent HCI system to help users’ physical tasks
  - Our HCI system observes the task and guides the user to accomplish it by providing feedback.
  - Example: assembly tasks
    "Insert the hard disk into the red location" → thus, grab the hard disk
    "Insert the optical disk into the green location" → thus, move the optical disk

Ryoo, Grauman, and Aggarwal, CVIU 2010
Intelligent driving
• Human activity recognition
  - Personal driving diary
    Passive recording
  Ryoo et al., WMVC 2011
• Human intention recognition
  - Real-time driving supporter (real-time interaction with the driver)
Future directions
• Future directions of activity recognition research are driven by applications
  - Surveillance
    Real-time
    Multiple cameras, continuous streams
  - Video search
    YouTube – 20 hours of new videos per minute
    Large-scale database
    Very noisy – camera viewpoints, lighting, …
Future directions (cont’d)
• Longer sequences (story, plot, characters)
• 3D representation and reasoning
• Complex temporal and logical constraints
  - Allen’s temporal logic, context
• Incorporating prior knowledge (ontologies)
  - Data-driven high-level learning is difficult
Challenges – real-time
• Real-time implementation is required
  - Surveillance systems
  - Robots and autonomous systems
  - Content-based video retrieval
• Multiple activity detection
  - Continuous video streams imply heavy computation
Challenges – real-time
• Goal
  - Process 20 hours of video every minute
• Problem
  - Sliding-window search is slow (see the back-of-the-envelope sketch below).
  - Voting?
  - Feature extraction itself?
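A back-of-the-envelope sketch of why exhaustive temporal sliding-window search struggles with continuous streams; the frame rate, window lengths, and stride are illustrative assumptions, not figures from the slides.

```python
# How many window evaluations does one hour of continuous video require?
fps = 30
stream_seconds = 60 * 60                  # one hour of continuous video
frames = fps * stream_seconds

window_lengths = [30, 60, 120, 240, 480]  # candidate activity durations (frames)
stride = 5                                # evaluate a window every 5 frames

windows = sum((frames - w) // stride + 1 for w in window_lengths if frames > w)
print(f"{frames} frames -> {windows} window evaluations per classifier")
# Each evaluation repeats feature extraction and matching, so per-frame feature
# reuse, voting, or early rejection is needed to keep up with the stream.
```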
Real-time using GP-GPUs
• General-purpose graphics processing units (GP-GPUs)
  - Multiple cores running thousands of threads
  - Parallel processing
  - 20× speed-up of the video comparison method of Shechtman and Irani 2005 (a vectorization sketch of the underlying idea follows below)

[Rofouei, M., Moazeni, M., and Sarrafzadeh, M., Fast GPU-based space-time correlation for activity recognition in video sequences, 2008]
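A hedged illustration of why space-time correlation maps well to GP-GPUs: every space-time offset can be scored independently. The sketch below expresses the same data parallelism with vectorized NumPy on synthetic data; it is not the implementation of Rofouei et al. or of Shechtman and Irani, and the video and template shapes are arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

video = np.random.rand(100, 64, 64).astype(np.float32)   # (frames, H, W)
tmpl = np.random.rand(10, 16, 16).astype(np.float32)     # space-time template

# Every (t, y, x) offset is an independent correlation -- the work a GPU would
# hand to one thread per offset.  Here it is vectorized over all offsets at once.
windows = sliding_window_view(video, tmpl.shape)          # view of shape (91, 49, 49, 10, 16, 16)
scores = np.einsum('tyzijk,ijk->tyz', windows, tmpl)      # correlation at every offset
t, y, x = np.unravel_index(scores.argmax(), scores.shape)
print("best match at frame", t, "position", (y, x))
```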
Challenges – activity context
• An activity involves interactions among humans, objects, and scenes
  [Images: the same body motion is ambiguous in isolation – Running? Kicking?]
• Activity recognition must consider motion-object context!

[Gupta, A., Kembhavi, A., and Davis, L., Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition, IEEE T PAMI 2009]
Activity context – pose
• Object-pose-motion

[Yao, B. and Fei-Fei, L., Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities, CVPR 2010]
Activity context (cont’d)
• Probabilistic modeling of dependencies
  - Graphical models (a toy factorization example follows below)
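A toy example of the kind of dependency modeling meant here, assuming a simple factorization P(activity, object, pose) = P(activity) P(object | activity) P(pose | activity); the activities, observations, and all probability tables are made up for illustration.

```python
# Made-up placeholder tables: two activities, one object cue, one pose cue.
P_a = {"tennis_serve": 0.5, "volleyball_smash": 0.5}
P_o_given_a = {"tennis_serve":     {"racket": 0.90, "ball": 0.10},
               "volleyball_smash": {"racket": 0.05, "ball": 0.95}}
P_p_given_a = {"tennis_serve":     {"arm_up": 0.8, "arm_down": 0.2},
               "volleyball_smash": {"arm_up": 0.7, "arm_down": 0.3}}

def activity_posterior(obj, pose):
    """P(activity | object, pose) under the factorization P(a) P(o|a) P(p|a)."""
    joint = {a: P_a[a] * P_o_given_a[a][obj] * P_p_given_a[a][pose] for a in P_a}
    z = sum(joint.values())
    return {a: v / z for a, v in joint.items()}

# The pose alone is ambiguous (arm up in both activities); the object resolves it.
print(activity_posterior("racket", "arm_up"))
```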
Challenges – interactive learning
• Interactive learning approach
  - Learning by generating questions
    [Diagram: the human teacher teaches the activity recognition system; the system responds with questions ("more videos?", "corrections?")]
• Interactive learning
  - Human-in-the-loop
  - Explore decision boundaries actively (an uncertainty-sampling sketch follows below)
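A minimal uncertainty-sampling sketch of the human-in-the-loop idea, using scikit-learn's LogisticRegression on synthetic 2-D data; the synthetic labels stand in for the human teacher's answers, and nothing here is specific to activity structures.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
teacher = (X[:, 0] + X[:, 1] > 0).astype(int)            # stands in for the human teacher

labeled = [int(teacher.argmax()), int(teacher.argmin())]  # one example of each class
pool = [i for i in range(len(X)) if i not in labeled]
clf = LogisticRegression()

for _ in range(10):                                       # ask the teacher ten questions
    clf.fit(X[labeled], teacher[labeled])
    proba = clf.predict_proba(X[pool])[:, 1]
    query = pool[int(np.argmin(np.abs(proba - 0.5)))]     # least certain = near the boundary
    labeled.append(query)                                 # the teacher provides the label
    pool.remove(query)

clf.fit(X[labeled], teacher[labeled])
print("accuracy after 10 queries:", clf.score(X, teacher))
```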
Active video composition
• A new learning paradigm
  - Composed videos from a real video
  - Automatically create the necessary training videos
  - Structural variations
    Who stretches his hand first? (see the composition sketch below)
  [Figure: one original video and several composed videos]

Ryoo and Yu, WMVC 2011
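A toy sketch of the structural-variation idea: starting from the sub-event annotations of one real interaction video, alternative orderings (e.g. which person stretches a hand first) are enumerated as candidate composed structures. The sub-event names and durations are illustrative, and this is not the composition procedure of Ryoo and Yu.

```python
from itertools import permutations

# Illustrative sub-events of a hand-shake interaction: (name, duration in frames).
sub_events = [("person_A_stretch_hand", 30), ("person_B_stretch_hand", 30), ("shake", 60)]

def compose(order):
    """Lay the reordered sub-events back onto a timeline of (name, start, end)."""
    t, timeline = 0, []
    for name, dur in order:
        timeline.append((name, t, t + dur))
        t += dur
    return timeline

variants = [compose(p) for p in permutations(sub_events)]  # candidate composed structures
for v in variants[:3]:
    print(v)
```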
Active structure learning (cont’d)
[Diagram, iterations #1–#3: the activity structure space with positive and negative structures; at each iteration the learner updates its decision boundary, queries the structure nearest to it ("Positive? Negative?"), and receives a label]
Recent work on ego-centric activity analysis
• Understanding first-person activities with wearable cameras
  (not a new idea in the wearable computing community)
  Starner ISWC’98, Torralba ICCV’03, Ren CVPR’10, Sundaram WEV’09, Sun WEV’09, Fathi ICCV’11
• Datasets
  - Georgia Tech PBJ (Fathi CVPR’11)
  - CMU Kitchen (Spriggs WEV’09)
  - UEC Tokyo Sports (Kitani CVPR’11)
Future of ego-centric vision
• Ego-motion analysis
• Inside-out cameras
• 2nd-person activity analysis
• Extreme sports
Conclusion
• Single-layered approaches for actions
  - Sequential approaches
  - Space-time approaches
• Hierarchical approaches for activities
  - Syntactic approaches
  - Description-based approaches
• Applications and challenges
  - Real-time applications
  - Context and interactive learning
Thank you
• We thank you for your attendance.
• Are there any questions?
