This document summarizes the approaches taken by researchers from the National Institute of Informatics (NII) in Japan for the violent scenes detection task at MediaEval 2012. They used the NII-KAORI-SECODE framework to extract shot-based features from 5 keyframes per shot, including color moments, color histograms, edge orientation histograms and local binary patterns. They evaluated late fusion with visual attributes and used LibSVM for classifier learning. Their discussion noted challenges with the broad definition of violence and diversity of shot lengths.
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
1. NII, Japan at MediaEval 2012
Violent Scenes Detection Task
Vu Lam(1), Duy-Dinh Le(2), Sang-Phan Le(2)
Shin’ichi Satoh(2), Duc Anh Duong(3)
(1) University of Science
lqvu@fit.hcmus.edu.vn
(2) National Institute of Informatics
{ledduy,plsang,satoh}@nii.ac.jp
(3) University of Information Technology
ducda@uit.edu.vn
2. Our approaches
Using NII-KAORI-SECODE, a general framework for
semantic concept detection.
Using 5 keyframes per shot.
Applying shot-based global features (color moments,
color histogram, edge orientation histogram, and
local binary patterns) to violent scenes detection.
Evaluating the performance of late fusion with visual
attributes (blood, fights, gore, car chase, cold
arms, firearm).
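The approach above can be sketched roughly as follows. This is only an illustration, not the NII-KAORI-SECODE implementation: the function names (`color_moments`, `shot_feature`, `late_fusion`) are hypothetical, keyframes are assumed to be plain lists of (r, g, b) pixels, the keyframe features are assumed to be averaged into one shot-level vector, and late fusion is assumed to be a weighted average of classifier scores. The slides do not state how either step was actually done.

```python
from statistics import fmean


def color_moments(pixels):
    """Mean, standard deviation and skewness per channel.

    `pixels` is a list of (r, g, b) tuples (an assumed input format).
    """
    feats = []
    for channel in zip(*pixels):
        mean = fmean(channel)
        var = fmean([(v - mean) ** 2 for v in channel])
        third = fmean([(v - mean) ** 3 for v in channel])
        # Signed cube root keeps the skewness term real when the
        # third central moment is negative.
        sign = 1.0 if third >= 0 else -1.0
        skew = sign * abs(third) ** (1.0 / 3.0)
        feats.extend([mean, var ** 0.5, skew])
    return feats


def shot_feature(keyframes):
    """Combine per-keyframe features into one shot-level vector.

    Assumption: element-wise averaging over the keyframes of a shot.
    """
    per_frame = [color_moments(kf) for kf in keyframes]
    return [fmean(column) for column in zip(*per_frame)]


def late_fusion(scores, weights=None):
    """Fuse classifier scores for one shot by weighted averaging.

    Assumption: equal weights unless `weights` is given.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


if __name__ == "__main__":
    # Two tiny made-up "keyframes" of two pixels each.
    keyframes = [[(0, 0, 0), (2, 2, 2)], [(4, 4, 4), (4, 4, 4)]]
    print(shot_feature(keyframes))
    # Fuse a hypothetical violence score with a hypothetical "blood" score.
    print(late_fusion([0.8, 0.4]))
```

In a real run, each shot would contribute 5 keyframes, and the other global features (color histogram, edge orientation histogram, local binary patterns) would be computed and pooled in the same way.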
10/5/2012 NII, Japan at MediaEval 2012 2
Violent Scenes Detection Task
4. Classifier learning & Experiment
Classifier learning: LibSVM
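As a small illustration of preparing data for LibSVM, the sketch below writes shot-level feature vectors in LibSVM's sparse text format (`label index:value ...`, with 1-based indices and zero entries omitted). The helper name `to_libsvm_line`, the +1/-1 labelling of violent and non-violent shots, and the sample values are assumptions made for this example; the slides do not give the kernel, the parameters, or the data layout that was used.

```python
def to_libsvm_line(label, features):
    """Format one sample as a line of LibSVM's sparse text format."""
    parts = [str(label)]
    for index, value in enumerate(features, start=1):
        if value != 0:  # zero-valued features may be left out
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


def write_libsvm_file(path, samples):
    """Write (label, features) pairs, one sample per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for label, features in samples:
            handle.write(to_libsvm_line(label, features) + "\n")


if __name__ == "__main__":
    # Hypothetical shots: +1 = violent, -1 = non-violent.
    samples = [(1, [0.5, 0.0, 2.0]), (-1, [0.0, 0.25, 0.0])]
    for label, features in samples:
        print(to_libsvm_line(label, features))
```

A file written this way can be passed to LibSVM's `svm-train` and `svm-predict` command-line tools.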
5. NII Runs
6. Discussion & Future work
The definition of violence is very general.
Shot lengths are very diverse: many shots are very
short and might easily be classified as non-violent
shots based on the definition.
Fusing the violence detection results with the results
for other visual attributes did not improve
performance.
Future work: study how to use visual attributes to
represent violent scenes.