SlideShare uma empresa Scribd logo
1 de 13
Technische Universität München




     Violent Scenes Detection with
Large, Brute-Forced Acoustic and Visual
              Feature Sets

   Florian Eyben, Felix Weninger, Nicolas Lehment,
            Gerhard Rigoll, Björn Schuller
          Institute for Human-Machine Communication,
                 Technische Universität München




           Session “Affect Task: Violent Scenes Detection”
                           October 4, 2012
Technische Universität München




“Large”
• Start with frame-wise features (audio / video)
• Summarize over „meaningful unit“
      – Shot?

      – Sliding window?

      – Overlap?

• Application of functionals:
      – Percentiles, moments, …
• Results in 3.8k audio and 9.7k video features


October 4, 2012    TUM / Felix Weninger                                     2
Technische Universität München




Frame-Wise Features (LLDs)
• Acoustic energy LLDs
      – Loudness, energy, ZCR
• Acoustic spectral LLDs
      – MFCCs, band energy, centroid, roll-off
        point, flux, entropy, moments, sharpness, harmonicity
• Visual LLDs
      – HSV histogram
      – Optical Flow: histogram + mean + std.dev.
      – Laplacian edge image histogram + strongest edge




October 4, 2012     TUM / Felix Weninger                                             3
Technische Universität München




“Brute-Forced”
•   Fully data-based approach (no pre-classification)
•   Little hand-crafting / engineering of features
•   Systematic feature (over-)generation
•   Emphasize on machine learning
•   Successful in affect recognition and speaker
    characterization tasks
      – INTERSPEECH 2009 Emotion Challenge
      – INTERSPEECH 2010 Paralinguistic Challenge
      – INTERSPEECH 2011 Intoxication / Sleepiness
• Generalization?

October 4, 2012   TUM / Felix Weninger                                          4
Technische Universität München




A Data-Based Approach
• System development based on 3-fold CV of
  development data
      – „Movie-independent“
      – Stratified by violence proportion and age
• Use all features from development data for evaluation
  on test data




October 4, 2012     TUM / Felix Weninger                                             5
Technische Universität München




„Acoustic and Visual“
• Expect complementarity of modalities
• Late fusion by confidences of single-modal classifiers




October 4, 2012   TUM / Felix Weninger                                     6
Technische Universität München




Segmentation and Classification
• Two segmentations evaluated on development set:
      – Functionals over shots
      – Functionals over X sec. sliding window
• Sliding window segmentation:
      – Classify per window
      – Fuse window classification per shot
      – Alternative: Generate segmentation
• Weka, SVM (SMO), C = 0.01
• Logistic regression to obtain confidences



October 4, 2012     TUM / Felix Weninger                                          7
Technische Universität München




TUM Test Runs


   Run            Modality      Overlap         Overlap   MAP100    MAP100    MAP20
                                 Train           Eval      Test     Dev (CV) Dev (CV)
 TUM-1              A+V              X                     .484          .397                 .525
 TUM-2               A               X                     .376          .445                 .515
 TUM-3               A               X              X      .360          .428                 .518
 TUM-4               A                                     .392          .442                 .503
 TUM-5               V                                     .320          .224                 .213




October 4, 2012              TUM / Felix Weninger                                                    8
Technische Universität München




TUM Test Runs


            Run   Modality       Overlap   Overlap   UA Rec          WA Rec
                                  Train     Eval      Dev             Dev
          TUM-1     A+V              X                .584               .848
          TUM-2      A               X                .648               .830
          TUM-3      A               X       X        .648               .826
          TUM-4      A                                .634               .829
          TUM-5      V                                .537               .832




October 4, 2012     TUM / Felix Weninger                                                9
Technische Universität München




Test Data: MAP 100 by Movie



Movie                               TUM-1 (A+V)          TUM-2 (A)
Dead Poets Society                          .523               .158
Fight Club                                  .321               .315
Independence Day                            .609               .656




October 4, 2012      TUM / Felix Weninger                                           10
Technische Universität München




Discussion
• MAP very sensitive to segmentation
      – Ex.: MAP100 = .73, MAP20 = .88 on Dev iff segment
        boundaries are aligned to violent / non-violent scenes
                                     NV             V

                  Aligned:

          Not Aligned:                          ?

      – Train on aligned data / test on not aligned data: MAP100 = .49
• Accuracies: similar ranking, but less „sensitive“
      – Correlated with target function in learning

October 4, 2012          TUM / Felix Weninger                                            11
Technische Universität München




Conclusions and Outlook
•   Demonstrated feasibility of „brute-force“ approach
•   Acoustic features alone are often competitive
•   Visual features are complementary
•   Future: Deeper analysis of
      – Individual features„ worth
      – Influence of segmentation on model training and evaluation




October 4, 2012     TUM / Felix Weninger                                            12
Technische Universität München




                                   Thank you.

                            weninger@tum.de

            openSMILE: http://opensmile.sourceforge.net




October 4, 2012     TUM / Felix Weninger                                         13

Mais conteúdo relacionado

Destaque

Week 2 discussion 2
Week 2 discussion 2Week 2 discussion 2
Week 2 discussion 2LILBIT2012
 
Event Detection via LDA for the MediaEval2012 SED Task
Event Detection via LDA for the MediaEval2012 SED TaskEvent Detection via LDA for the MediaEval2012 SED Task
Event Detection via LDA for the MediaEval2012 SED TaskMediaEval2012
 
CERTH @ MediaEval 2012 Social Event Detection Task
CERTH @ MediaEval 2012 Social Event Detection TaskCERTH @ MediaEval 2012 Social Event Detection Task
CERTH @ MediaEval 2012 Social Event Detection TaskMediaEval2012
 
QMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo Collections
QMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo CollectionsQMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo Collections
QMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo CollectionsMediaEval2012
 
The Watershed-based Social Events Detection Method with Support from External...
The Watershed-based Social Events Detection Method with Support from External...The Watershed-based Social Events Detection Method with Support from External...
The Watershed-based Social Events Detection Method with Support from External...MediaEval2012
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskMediaEval2012
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodMediaEval2012
 
Mentor Strategy Session: Business Plan and Video
Mentor Strategy Session: Business Plan and VideoMentor Strategy Session: Business Plan and Video
Mentor Strategy Session: Business Plan and VideoGrow America
 
The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012Philip Polstra
 
Idea or opportunity?
Idea or opportunity?Idea or opportunity?
Idea or opportunity?Grow America
 
John Richards: My Life Lessons As An Entrepreneur
John Richards: My Life Lessons As An EntrepreneurJohn Richards: My Life Lessons As An Entrepreneur
John Richards: My Life Lessons As An EntrepreneurGrow America
 
Simha_23_REFFIT_Biochar_ICT_Published Version
Simha_23_REFFIT_Biochar_ICT_Published VersionSimha_23_REFFIT_Biochar_ICT_Published Version
Simha_23_REFFIT_Biochar_ICT_Published VersionPrithvi Simha
 
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012MediaEval2012
 

Destaque (16)

Week 2 discussion 2
Week 2 discussion 2Week 2 discussion 2
Week 2 discussion 2
 
Event Detection via LDA for the MediaEval2012 SED Task
Event Detection via LDA for the MediaEval2012 SED TaskEvent Detection via LDA for the MediaEval2012 SED Task
Event Detection via LDA for the MediaEval2012 SED Task
 
Closing
ClosingClosing
Closing
 
CERTH @ MediaEval 2012 Social Event Detection Task
CERTH @ MediaEval 2012 Social Event Detection TaskCERTH @ MediaEval 2012 Social Event Detection Task
CERTH @ MediaEval 2012 Social Event Detection Task
 
QMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo Collections
QMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo CollectionsQMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo Collections
QMUL @ MediaEval 2012: Social Event Detection in Collaborative Photo Collections
 
The Watershed-based Social Events Detection Method with Support from External...
The Watershed-based Social Events Detection Method with Support from External...The Watershed-based Social Events Detection Method with Support from External...
The Watershed-based Social Events Detection Method with Support from External...
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy Task
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic method
 
Mentor Strategy Session: Business Plan and Video
Mentor Strategy Session: Business Plan and VideoMentor Strategy Session: Business Plan and Video
Mentor Strategy Session: Business Plan and Video
 
The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012
 
Live pitch event
Live pitch eventLive pitch event
Live pitch event
 
Idea or opportunity?
Idea or opportunity?Idea or opportunity?
Idea or opportunity?
 
Thotcon2013
Thotcon2013Thotcon2013
Thotcon2013
 
John Richards: My Life Lessons As An Entrepreneur
John Richards: My Life Lessons As An EntrepreneurJohn Richards: My Life Lessons As An Entrepreneur
John Richards: My Life Lessons As An Entrepreneur
 
Simha_23_REFFIT_Biochar_ICT_Published Version
Simha_23_REFFIT_Biochar_ICT_Published VersionSimha_23_REFFIT_Biochar_ICT_Published Version
Simha_23_REFFIT_Biochar_ICT_Published Version
 
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
The JHU-HLTCOE Spoken Web Search System for MediaEval 2012
 

Mais de MediaEval2012

MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval2012
 
A Multimodal Approach for Video Geocoding
A Multimodal Approach for   Video Geocoding A Multimodal Approach for   Video Geocoding
A Multimodal Approach for Video Geocoding MediaEval2012
 
CUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskCUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskMediaEval2012
 
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...MediaEval2012
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account MatchingMediaEval2012
 
The CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsThe CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsMediaEval2012
 
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval2012
 
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...MediaEval2012
 
The MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioMediaEval2012
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskMediaEval2012
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...MediaEval2012
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...MediaEval2012
 
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskUNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskMediaEval2012
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...MediaEval2012
 
ARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationMediaEval2012
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...MediaEval2012
 
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesMediaEval2012
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskMediaEval2012
 
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012MediaEval2012
 

Mais de MediaEval2012 (20)

MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 Opening
 
A Multimodal Approach for Video Geocoding
A Multimodal Approach for   Video Geocoding A Multimodal Approach for   Video Geocoding
A Multimodal Approach for Video Geocoding
 
CUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskCUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking Task
 
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account Matching
 
The CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsThe CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and Onwards
 
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
 
mevd2012 esra_
 mevd2012 esra_ mevd2012 esra_
mevd2012 esra_
 
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
 
The MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes Detectio
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
 
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskUNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
 
ARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video Classification
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
 
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging Task
 
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
 

Último

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Último (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature Sets

  • 1. Technische Universität München Violent Scenes Detection with Large, Brute-Forced Acoustic and Visual Feature Sets Florian Eyben, Felix Weninger, Nicolas Lehment, Gerhard Rigoll, Björn Schuller Institute for Human-Machine Communication, Technische Universität München Session “Affect Task: Violent Scenes Detection” October 4, 2012
  • 2. Technische Universität München “Large” • Start with frame-wise features (audio / video) • Summarize over „meaningful unit“ – Shot? – Sliding window? – Overlap? • Application of functionals: – Percentiles, moments, … • Results in 3.8k audio and 9.7k video features October 4, 2012 TUM / Felix Weninger 2
  • 3. Technische Universität München Frame-Wise Features (LLDs) • Acoustic energy LLDs – Loudness, energy, ZCR • Acoustic spectral LLDs – MFCCs, band energy, centroid, roll-off point, flux, entropy, moments, sharpness, harmonicity • Visual LLDs – HSV histogram – Optical Flow: histogram + mean + std.dev. – Laplacian edge image histogram + strongest edge October 4, 2012 TUM / Felix Weninger 3
  • 4. Technische Universität München “Brute-Forced” • Fully data-based approach (no pre-classification) • Little hand-crafting / engineering of features • Systematic feature (over-)generation • Emphasize on machine learning • Successful in affect recognition and speaker characterization tasks – INTERSPEECH 2009 Emotion Challenge – INTERSPEECH 2010 Paralinguistic Challenge – INTERSPEECH 2011 Intoxication / Sleepiness • Generalization? October 4, 2012 TUM / Felix Weninger 4
  • 5. Technische Universität München A Data-Based Approach • System development based on 3-fold CV of development data – „Movie-independent“ – Stratified by violence proportion and age • Use all features from development data for evaluation on test data October 4, 2012 TUM / Felix Weninger 5
  • 6. Technische Universität München „Acoustic and Visual“ • Expect complementarity of modalities • Late fusion by confidences of single-modal classifiers October 4, 2012 TUM / Felix Weninger 6
  • 7. Technische Universität München Segmentation and Classification • Two segmentations evaluated on development set: – Functionals over shots – Functionals over X sec. sliding window • Sliding window segmentation: – Classify per window – Fuse window classification per shot – Alternative: Generate segmentation • Weka, SVM (SMO), C = 0.01 • Logistic regression to obtain confidences October 4, 2012 TUM / Felix Weninger 7
  • 8. Technische Universität München TUM Test Runs Run Modality Overlap Overlap MAP100 MAP100 MAP20 Train Eval Test Dev (CV) Dev (CV) TUM-1 A+V X .484 .397 .525 TUM-2 A X .376 .445 .515 TUM-3 A X X .360 .428 .518 TUM-4 A .392 .442 .503 TUM-5 V .320 .224 .213 October 4, 2012 TUM / Felix Weninger 8
  • 9. Technische Universität München TUM Test Runs Run Modality Overlap Overlap UA Rec WA Rec Train Eval Dev Dev TUM-1 A+V X .584 .848 TUM-2 A X .648 .830 TUM-3 A X X .648 .826 TUM-4 A .634 .829 TUM-5 V .537 .832 October 4, 2012 TUM / Felix Weninger 9
  • 10. Technische Universität München Test Data: MAP 100 by Movie Movie TUM-1 (A+V) TUM-2 (A) Dead Poets Society .523 .158 Fight Club .321 .315 Independence Day .609 .656 October 4, 2012 TUM / Felix Weninger 10
  • 11. Technische Universität München Discussion • MAP very sensitive to segmentation – Ex.: MAP100 = .73, MAP20 = .88 on Dev iff segment boundaries are aligned to violent / non-violent scenes NV V Aligned: Not Aligned: ? – Train on aligned data / test on not aligned data: MAP100 = .49 • Accuracies: similar ranking, but less „sensitive“ – Correlated with target function in learning October 4, 2012 TUM / Felix Weninger 11
  • 12. Technische Universität München Conclusions and Outlook • Demonstrated feasibility of „brute-force“ approach • Acoustic features alone are often competitive • Visual features are complementary • Future: Deeper analysis of – Individual features„ worth – Influence of segmentation on model training and evaluation October 4, 2012 TUM / Felix Weninger 12
  • 13. Technische Universität München Thank you. weninger@tum.de openSMILE: http://opensmile.sourceforge.net October 4, 2012 TUM / Felix Weninger 13