Dense-Captioning Events in Videos
Highlight
• Task: dense-captioning events
• Dataset: ActivityNet Captions
• Events range across multiple time scales and can even overlap.
• Proposal module: extends action-proposal generation to multi-scale event detection and processes each video in one forward pass, detecting events as they occur.
• Events in a given video are usually related to one another.
• Captioning module: uses the context from all events found by the proposal module to generate each sentence.
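The multi-scale idea in the bullets above can be sketched as sampling one clip-feature sequence at several strides. This is a minimal illustration, not the authors' code: the stride set {1, 2, 4, 8} and the 16-frame clip length come from the editor's notes for this deck, while the function names `strided_views` and `clip_span` and the assumption of non-overlapping clips are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Stride values quoted in the deck's editor notes.
STRIDES = (1, 2, 4, 8)


def strided_views(features: Sequence, strides: Sequence[int] = STRIDES) -> Dict[int, List]:
    """Return one subsampled copy of the clip-feature sequence per stride.

    A larger stride skips more clips, so the same number of recurrent steps
    covers a longer stretch of video (useful for long events).
    """
    return {s: list(features[::s]) for s in strides}


def clip_span(index: int, stride: int, clip_len: int = 16) -> Tuple[int, int]:
    """Frame range [start, end) covered by item `index` of the stride-`stride` view.

    Assumes clips are non-overlapping and `clip_len` frames long
    (an assumption made for this sketch).
    """
    start = index * stride * clip_len
    return (start, start + clip_len)


# Stand-in for 16 clip features; real features would be vectors.
views = strided_views(list(range(16)))
```

Each view would then be fed to the proposal LSTM separately, so short events are picked up at stride 1 and long events at stride 8.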
DenseCap:
Fully Convolutional Localization Networks for Dense Captioning
Method
• V. Escorcia, F. C. Heilbron, J. C. Niebles, and B. Ghanem. DAPs: Deep action proposals for action understanding. ECCV, 2016.
• J. Johnson, A. Karpathy, and L. Fei-Fei. DenseCap: Fully convolutional localization networks for dense captioning.
• A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese. Social LSTM: Human trajectory prediction in crowded spaces.
• object-centric in images
• action-centric in videos
Performance
Discussion Jointly Localizing and Describing Events for Dense Video Captioning
Discussion Joint Event Detection and Description in Continuous Video Streams

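Editor's Notes

  1. Feature extraction. Given a video, generate a feature sequence. In the experiments, the video is fed to C3D in units of 16 frames to extract features.
  2. Proposal module. The proposal module is a slight modification of DAPs: it outputs K proposals at every time step. It uses an LSTM that takes the C3D feature sequence above as input, sampled at different strides, strides = {1, 2, 4, 8}. The generated proposals may overlap in time. Whenever an event is detected, the current hidden state is taken as the representation of the video.
  3. Captioning module. Uses the context of neighboring events to generate each event caption. It also uses an LSTM. All events are split into two buckets relative to the current event: past events and future events. Concurrent events are assigned to past events or future events according to their end time.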
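The past/future bucketing described in the captioning-module note can be sketched as below. This is a hedged illustration only: the `Event` type and `split_context` function are hypothetical names, and treating an event that ends at exactly the same time as the current one as "past" is an assumption the notes do not settle.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    """A temporal proposal, in seconds (hypothetical type for this sketch)."""
    start: float
    end: float


def split_context(events: List[Event], current: int) -> Tuple[List[Event], List[Event]]:
    """Split all other events into (past, future) relative to events[current].

    Comparing end times handles every case: an event that finishes before the
    current one starts is past, one that starts after it ends is future, and a
    concurrent (overlapping) event is assigned by its end time.
    """
    cur = events[current]
    past: List[Event] = []
    future: List[Event] = []
    for i, ev in enumerate(events):
        if i == current:
            continue
        if ev.end <= cur.end:  # tie goes to "past": an assumption
            past.append(ev)
        else:
            future.append(ev)
    return past, future


events = [Event(0, 5), Event(3, 12), Event(4, 8), Event(10, 20)]
past, future = split_context(events, current=2)
```

In this example the current event is (4, 8): the event (0, 5) overlaps it but ends earlier, so it lands in the past bucket, while (3, 12) overlaps it and ends later, so it lands in the future bucket alongside (10, 20).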