Content               System architecture                 Experimental Results   Conclusion




          TUKE MediaEval 2012: Spoken Web Search using
                 DTW and Unsupervised SVM
           MediaEval Benchmarking Initiative for Multimedia Evaluation


                      Jozef Vavrek, Matúš Pleva, Jozef Juhár

                  Department of Electronics and Multimedia Communications
                       Technical University of Košice, Slovak Republic


                    e-mail:{jozef.vavrek; matus.pleva; jozef.juhar}@tuke.sk
                                            04 October, 2012




      1   System architecture
            Segmentation
            Feature Extraction
            Support Vector Machine Method
            Searching Algorithm


      2   Experimental Results


      3   Conclusion



Proposed query-by-example searching architecture




          [Figure: block diagram of the system. Blocks: Audio documents (utterances),
          Audio documents (queries), Segmentation, Feature extraction, DTW (MCA),
          Support Vector Machine.]



Segmentation and pre-processing
          segmentation: each utterance is cut into segments of variable length,
          l_segment = l_query, using a rectangular window
          use: input to the subsequent pre-processing and feature extraction phases
          pre-processing: pre-emphasis filtering and a Hamming window of length
          l_window = l_query / 100 with 50% overlap
          use: to emphasize higher-frequency components, to reduce abrupt changes
          within the spectrum of the signal, and to improve the classification
          performance of the SVM classifier



          [Figure: an utterance divided into segments (1st to 4th segment shown), each of
          length l_segment = l_query; the query and the segments are framed with windows
          of length l_window = l_query / 100.]
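
The segmentation and pre-processing steps on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, the pre-emphasis coefficient 0.97 is a commonly used value that the slides do not give, and the segments are assumed to be consecutive and non-overlapping.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """First-order pre-emphasis y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def hamming(length):
    """Hamming window coefficients for a window of the given length."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def segment_utterance(utterance, query_length):
    """Cut the utterance into consecutive segments of length l_query
    (rectangular window; non-overlapping segments are an assumption)."""
    return [utterance[i:i + query_length]
            for i in range(0, len(utterance) - query_length + 1, query_length)]

def frame_segment(segment, query_length, overlap=0.5):
    """Split a segment into Hamming-weighted frames of length l_query / 100
    with the given fractional overlap (50% on the slide)."""
    window_length = max(1, query_length // 100)
    hop = max(1, int(window_length * (1 - overlap)))
    window = hamming(window_length)
    frames = []
    for start in range(0, len(segment) - window_length + 1, hop):
        chunk = segment[start:start + window_length]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames
```

With a query of 800 samples, each frame is 8 samples long and consecutive frames start 4 samples apart.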



Feature Extraction



          [Figure: feature extraction chain: transformation (DFT, FFT) -> log of amplitude
          spectrum -> Mel filtering (Mel filter bank) -> IDFT (DCT) -> feature vector
          matrix of frames (instances) by coefficients 0 to 12 (features).]
          [Figure: plot of avgMCA against feature dimension for MFCCs, MFCCs+ZCR, and
          MFCCs+ZCR+MPEG-7 (ASS, ASC, ASF, ASE).]
          [Figure: 13x13 similarity matrix (cost matrix) between the query and an
          utterance segment, labelled MCA.]
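
The minimum cost alignment (MCA) named in the figure is the accumulated cost of the best DTW path through the cost matrix. The sketch below is a textbook DTW in plain Python, not the authors' implementation: the Euclidean local distance, the three-way step pattern, and the function names are assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_mca(query, segment):
    """Minimum cumulative alignment cost between two sequences of feature vectors.

    acc[i][j] holds the cheapest cost of aligning the first i query vectors
    with the first j segment vectors.
    """
    n, m = len(query), len(segment)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = euclidean(query[i - 1], segment[j - 1])
            # Allowed predecessors: insertion, deletion, match (assumed step pattern).
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

A low MCA means the two sequences can be warped onto each other cheaply, so the segment is a plausible match for the query.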



Support Vector Machine classifier

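          linear SVM with soft and hard margin, defined by the decision hyperplane

              d(\mathbf{w}, \mathbf{x}, b) = \mathbf{w} \cdot \mathbf{x} + b = \sum_{i=1}^{l} w_i x_i + b        (1)

          [Figure: two plots in the (x1, x2) plane showing Class 1 (y = +1) and
          Class 2 (y = -1) separated by the decision hyperplane, one with a soft margin
          and one with a hard margin.]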
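
As a small worked illustration of Eq. (1), the decision value is a dot product plus a bias, and the predicted class is its sign. The weights and points used in the usage note are hypothetical, and assigning a value of exactly zero to class +1 is an arbitrary choice.

```python
def decision(w, x, b):
    """Decision value d(w, x, b) = w . x + b from Eq. (1)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, x, b):
    """Class label from the sign of the decision value (+1 or -1)."""
    return 1 if decision(w, x, b) >= 0 else -1
```

For example, with w = [2, -1] and b = -1, the point [2, 1] gives d = 2 and falls in class +1, while [0, 1] gives d = -2 and falls in class -1.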



Nonlinear SVM classifier
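          mapping into a high-dimensional feature space by kernel functions

              d(\mathbf{x}) = \sum_{i=1}^{l} \alpha_i y_i \, \mathbf{z}(\mathbf{x}) \cdot \mathbf{z}(\mathbf{x}_i) + b        (2)

              K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{z}_i \cdot \mathbf{z}_j = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)        (3)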

          [Figure: data points in the (x1, x2) input space and their images under the
          mapping \Phi in the feature space.]




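          kernel functions used
          Mathematical expression                                        Type
          K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j                          Linear
          K(\mathbf{x}_i, \mathbf{x}_j) = (\gamma \, \mathbf{x}_i \cdot \mathbf{x}_j + 1)^d        Polynomial of degree d
          K(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2)          Gaussian Radial Basis Function (RBF)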
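
The three kernels in the table translate directly into code. The following is a minimal sketch in plain Python; the function names and the default values of gamma and the degree are illustrative assumptions, not settings reported in the slides.

```python
import math

def dot(xi, xj):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(xi, xj))

def linear_kernel(xi, xj):
    """K(xi, xj) = xi . xj"""
    return dot(xi, xj)

def polynomial_kernel(xi, xj, gamma=1.0, degree=2):
    """K(xi, xj) = (gamma * xi . xj + 1) ** degree"""
    return (gamma * dot(xi, xj) + 1.0) ** degree

def rbf_kernel(xi, xj, gamma=1.0):
    """K(xi, xj) = exp(-gamma * ||xi - xj||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)
```

Each function returns the value of the inner product in the feature space without computing the mapping \Phi explicitly, which is the point of Eq. (3).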



SVM based searching (classification) algorithm
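          [Figure: flowchart of the search. The utterance is split into Segment 1,
          Segment 2, Segment 3, ..., Segment N, each of length l_query. The query
          (e.g. query001) is paired with each segment in turn; query frames are labelled
          +1 and segment frames -1. Each frame has length l_window = l_query / 100 and
          is described by 13 MFCCs (coefficients 0 to 12). Steps shown: compute MCA of
          DTW; "< threshold"; train SVM with linear kernel and C = 1; SVM model; compute
          miss(+1), miss(-1), and the number of iterations; "> threshold"; query detected.]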



Experimental results

          Score parameter: \frac{\text{number of iterations}}{100} = 2.82
          Error rate: 1 - \frac{\text{correctly predicted frames}}{\text{all tested frames}} = 0.18
          Misclassification rate: \frac{\text{miss}(+) + \text{miss}(-)}{\text{all predicted data}} = 0.12

          Evaluation results of the tested algorithm
          Database set    P(FA)      P(Miss)    ATWV
          evalQ-devC      1.54617    0.960      -0.052
          devQ-evalC      1.62595    0.948      -0.233
          evalQ-evalC     1.68694    0.974      -0.164
          devQ-devC       1.78786    0.943      -0.194
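
The three quantities defined at the top of this slide are simple ratios, sketched below. The function and argument names are hypothetical. The slides do not say how "all tested frames" and "all predicted data" are counted, so each denominator is passed in explicitly rather than derived.

```python
def score_parameter(n_iterations):
    """Score parameter: number of SVM training iterations divided by 100."""
    return n_iterations / 100

def error_rate(n_correct_frames, n_tested_frames):
    """Error rate: 1 - correctly predicted frames / all tested frames."""
    return 1.0 - n_correct_frames / n_tested_frames

def misclassification_rate(miss_pos, miss_neg, n_predicted):
    """Misclassification rate: (miss(+) + miss(-)) / all predicted data."""
    return (miss_pos + miss_neg) / n_predicted
```

For instance, 282 iterations would give a score of 2.82; the counts behind the values reported on the slide are not given in the source.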



Conclusions and Future Work

          We proposed a query-by-example search system based on the
          minimum cost alignment (MCA) of the DTW algorithm and the
          misclassification error rate of an unsupervised SVM.
          No other resources were used during development.
          Detection performance was poor, with a high number of false alarms
          and missed detections, caused by the variable length of the queries and
          by detected terms with similar spectral characteristics within
          each utterance.
          The proposed algorithm has a relatively high computational
          (search) time.

          Future work: design an effective query-by-example search
          system with lower computational time and fewer missed detections.




          Thank You For Your Attention

Mais conteúdo relacionado

Mais procurados

Programacion multiobjetivo
Programacion multiobjetivoProgramacion multiobjetivo
Programacion multiobjetivo
Diego Bass
 
Cvpr2010 open source vision software, intro and training part i vl feat libra...
Cvpr2010 open source vision software, intro and training part i vl feat libra...Cvpr2010 open source vision software, intro and training part i vl feat libra...
Cvpr2010 open source vision software, intro and training part i vl feat libra...
zukun
 
Abaqus users (commercial finite element code
Abaqus users (commercial finite element codeAbaqus users (commercial finite element code
Abaqus users (commercial finite element code
basma2006
 
Multiple Kernel Learning based Approach to Representation and Feature Selecti...
Multiple Kernel Learning based Approach to Representation and Feature Selecti...Multiple Kernel Learning based Approach to Representation and Feature Selecti...
Multiple Kernel Learning based Approach to Representation and Feature Selecti...
ICAC09
 
Quoc le tera-scale deep learning
Quoc le   tera-scale deep learningQuoc le   tera-scale deep learning
Quoc le tera-scale deep learning
zukun
 
SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...
SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...
SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...
Edson Oliveira Junior
 
(Artificial) Neural Network
(Artificial) Neural Network(Artificial) Neural Network
(Artificial) Neural Network
Putri Wikie
 

Mais procurados (20)

The DEVS-Driven Modeling Language: Syntax and Semantics Definition by Meta-Mo...
The DEVS-Driven Modeling Language: Syntax and Semantics Definition by Meta-Mo...The DEVS-Driven Modeling Language: Syntax and Semantics Definition by Meta-Mo...
The DEVS-Driven Modeling Language: Syntax and Semantics Definition by Meta-Mo...
 
Structural Dynamics Toolbox and OpenFEM, a technical overview
Structural Dynamics Toolbox and OpenFEM, a technical overviewStructural Dynamics Toolbox and OpenFEM, a technical overview
Structural Dynamics Toolbox and OpenFEM, a technical overview
 
Dynamic Event-Driven Actors (DERA)
Dynamic Event-Driven Actors (DERA)Dynamic Event-Driven Actors (DERA)
Dynamic Event-Driven Actors (DERA)
 
D25014017
D25014017D25014017
D25014017
 
Programacion multiobjetivo
Programacion multiobjetivoProgramacion multiobjetivo
Programacion multiobjetivo
 
Cvpr2010 open source vision software, intro and training part i vl feat libra...
Cvpr2010 open source vision software, intro and training part i vl feat libra...Cvpr2010 open source vision software, intro and training part i vl feat libra...
Cvpr2010 open source vision software, intro and training part i vl feat libra...
 
Gh2411361141
Gh2411361141Gh2411361141
Gh2411361141
 
Abaqus users (commercial finite element code
Abaqus users (commercial finite element codeAbaqus users (commercial finite element code
Abaqus users (commercial finite element code
 
Leveraging collaborativetaggingforwebitemdesign ajithajjarani
Leveraging collaborativetaggingforwebitemdesign ajithajjaraniLeveraging collaborativetaggingforwebitemdesign ajithajjarani
Leveraging collaborativetaggingforwebitemdesign ajithajjarani
 
48
4848
48
 
Multiple Kernel Learning based Approach to Representation and Feature Selecti...
Multiple Kernel Learning based Approach to Representation and Feature Selecti...Multiple Kernel Learning based Approach to Representation and Feature Selecti...
Multiple Kernel Learning based Approach to Representation and Feature Selecti...
 
Quoc le tera-scale deep learning
Quoc le   tera-scale deep learningQuoc le   tera-scale deep learning
Quoc le tera-scale deep learning
 
iPaper@GlobIS - Interactive Paper Research
iPaper@GlobIS - Interactive Paper ResearchiPaper@GlobIS - Interactive Paper Research
iPaper@GlobIS - Interactive Paper Research
 
Hq3114621465
Hq3114621465Hq3114621465
Hq3114621465
 
Ocr
OcrOcr
Ocr
 
SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...
SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...
SMartyParser: an XMI Parser for UML-based Software Product Line Variability M...
 
Uncertainty propagation in structural dynamics
Uncertainty propagation in structural dynamics Uncertainty propagation in structural dynamics
Uncertainty propagation in structural dynamics
 
Mesh Generation and Topological Data Analysis
Mesh Generation and Topological Data AnalysisMesh Generation and Topological Data Analysis
Mesh Generation and Topological Data Analysis
 
Neural network and mlp
Neural network and mlpNeural network and mlp
Neural network and mlp
 
(Artificial) Neural Network
(Artificial) Neural Network(Artificial) Neural Network
(Artificial) Neural Network
 

Destaque

Intro totransportphenomenanew
Intro totransportphenomenanewIntro totransportphenomenanew
Intro totransportphenomenanew
ilovepurin
 
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
MediaEval2012
 
How Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingHow Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-Tagging
MediaEval2012
 
The L2F Spoken Web Search system for Mediaeval 2012
The L2F Spoken Web Search system for Mediaeval 2012The L2F Spoken Web Search system for Mediaeval 2012
The L2F Spoken Web Search system for Mediaeval 2012
MediaEval2012
 
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskThe TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
MediaEval2012
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
MediaEval2012
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing Task
MediaEval2012
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
MediaEval2012
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharon
Sharon Jimenez
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account Matching
MediaEval2012
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
MediaEval2012
 
GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012
MediaEval2012
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
MediaEval2012
 
Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skills
JNavarro0321
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
MediaEval2012
 

Destaque (20)

Intro totransportphenomenanew
Intro totransportphenomenanewIntro totransportphenomenanew
Intro totransportphenomenanew
 
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual CuesKIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues
 
How Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-TaggingHow Spatial Segmentation improves the Multimodal Geo-Tagging
How Spatial Segmentation improves the Multimodal Geo-Tagging
 
The L2F Spoken Web Search system for Mediaeval 2012
The L2F Spoken Web Search system for Mediaeval 2012The L2F Spoken Web Search system for Mediaeval 2012
The L2F Spoken Web Search system for Mediaeval 2012
 
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskThe TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
 
6dicas– veda 4
6dicas– veda 46dicas– veda 4
6dicas– veda 4
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing Task
 
Designinteração– veda 3
Designinteração– veda 3Designinteração– veda 3
Designinteração– veda 3
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
 
10 ρ. δρακουλησ
10 ρ. δρακουλησ10 ρ. δρακουλησ
10 ρ. δρακουλησ
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharon
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account Matching
 
2010 Marketing Plan
2010 Marketing Plan2010 Marketing Plan
2010 Marketing Plan
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
 
GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012
 
14 10 21_презентация сту
14 10 21_презентация сту14 10 21_презентация сту
14 10 21_презентация сту
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
 
Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skills
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
 

Semelhante a TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM

MEMS Extraction & Verification
MEMS Extraction & VerificationMEMS Extraction & Verification
MEMS Extraction & Verification
intellisense
 
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
Benton "Ben" Bovée
 
UML profiles for Embedded Systems
UML profiles for Embedded SystemsUML profiles for Embedded Systems
UML profiles for Embedded Systems
pboulet
 
Supercharging Cassandra - GOTO Amsterdam
Supercharging Cassandra - GOTO AmsterdamSupercharging Cassandra - GOTO Amsterdam
Supercharging Cassandra - GOTO Amsterdam
Acunu
 
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
MediaEval2012
 
CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012
MediaEval2012
 
Use of NS-2 to Simulate MANET Routing Algorithms
Use of NS-2 to Simulate MANET Routing AlgorithmsUse of NS-2 to Simulate MANET Routing Algorithms
Use of NS-2 to Simulate MANET Routing Algorithms
Giancarlo Romeo
 

Semelhante a TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM (20)

Sparse feature analysis for detection of clustered microcalcifications in mam...
Sparse feature analysis for detection of clustered microcalcifications in mam...Sparse feature analysis for detection of clustered microcalcifications in mam...
Sparse feature analysis for detection of clustered microcalcifications in mam...
 
MEMS Extraction & Verification
MEMS Extraction & VerificationMEMS Extraction & Verification
MEMS Extraction & Verification
 
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
SSTC-2012 BenKBovée 2933a_Backup Slides 26-Apr 1130-1300 Track1
 
UML profiles for Embedded Systems
UML profiles for Embedded SystemsUML profiles for Embedded Systems
UML profiles for Embedded Systems
 
Open Safety-Critical Java
Open Safety-Critical JavaOpen Safety-Critical Java
Open Safety-Critical Java
 
Supercharging Cassandra - GOTO Amsterdam
Supercharging Cassandra - GOTO AmsterdamSupercharging Cassandra - GOTO Amsterdam
Supercharging Cassandra - GOTO Amsterdam
 
FPGA Based Design of High Performance Decimator using DALUT Algorithm
FPGA Based Design of High Performance Decimator using DALUT AlgorithmFPGA Based Design of High Performance Decimator using DALUT Algorithm
FPGA Based Design of High Performance Decimator using DALUT Algorithm
 
VALID Rules - A language for cloud verification (EU CSP\’12)
VALID Rules - A language for cloud verification (EU CSP\’12)VALID Rules - A language for cloud verification (EU CSP\’12)
VALID Rules - A language for cloud verification (EU CSP\’12)
 
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
 
Dynamic Analysis - SCOTCH: Improving Test-to-Code Traceability using Slicing ...
Dynamic Analysis - SCOTCH: Improving Test-to-Code Traceability using Slicing ...Dynamic Analysis - SCOTCH: Improving Test-to-Code Traceability using Slicing ...
Dynamic Analysis - SCOTCH: Improving Test-to-Code Traceability using Slicing ...
 
Xo Rskillsmatrix Oct09
Xo Rskillsmatrix Oct09Xo Rskillsmatrix Oct09
Xo Rskillsmatrix Oct09
 
Event-driven Model Transformations in Domain-specific Modeling Languages
Event-driven Model Transformations in Domain-specific Modeling LanguagesEvent-driven Model Transformations in Domain-specific Modeling Languages
Event-driven Model Transformations in Domain-specific Modeling Languages
 
CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012CUHK System for the Spoken Web Search task at Mediaeval 2012
CUHK System for the Spoken Web Search task at Mediaeval 2012
 
CG OpenGL line & area-course 3
CG OpenGL line & area-course 3CG OpenGL line & area-course 3
CG OpenGL line & area-course 3
 
Microsoft HPC User Group
Microsoft HPC User Group Microsoft HPC User Group
Microsoft HPC User Group
 
Use of NS-2 to Simulate MANET Routing Algorithms
Use of NS-2 to Simulate MANET Routing AlgorithmsUse of NS-2 to Simulate MANET Routing Algorithms
Use of NS-2 to Simulate MANET Routing Algorithms
 
Workshop NGS data analysis - 3
Workshop NGS data analysis - 3Workshop NGS data analysis - 3
Workshop NGS data analysis - 3
 
Configuring Mahout Clustering Jobs - Frank Scholten
Configuring Mahout Clustering Jobs - Frank ScholtenConfiguring Mahout Clustering Jobs - Frank Scholten
Configuring Mahout Clustering Jobs - Frank Scholten
 
What's Wrong With Deep Learning?
What's Wrong With Deep Learning?What's Wrong With Deep Learning?
What's Wrong With Deep Learning?
 
Guide
GuideGuide
Guide
 

TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM

  • 1. Content | System architecture | Experimental Results | Conclusion
      TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
      MediaEval Benchmarking Initiative for Multimedia Evaluation
      Jozef Vavrek, Matúš Pleva, Jozef Juhár
      Department of Electronics and Multimedia Communications, Technical University of Košice, Slovak Republic
      e-mail: {jozef.vavrek; matus.pleva; jozef.juhar}@tuke.sk
      04 October, 2012
  • 2. Content
      1 System architecture
          Segmentation
          Feature Extraction
          Support Vector Machine Method
          Searching Algorithm
      2 Experimental Results
      3 Conclusion
  • 3. Proposed query-by-example searching architecture
      [Block diagram with the blocks: Audio documents (utterances), Audio documents (queries), Segmentation, Feature extraction, DTW (MCA), Support Vector Machine]
  • 4. Segmentation and pre-processing
      segmentation: the utterance is split into segments of variable length, l_segment = l_query (rectangular window)
          use: input to the following pre-processing and feature extraction phases
      pre-processing: pre-emphasis filtering and a Hamming window of length l_window = l_query/100 with 50% overlap
          use: to emphasize higher-frequency components, to reduce abrupt changes within the spectrum of the signal, and to increase the classification performance of the SVM classifier
      [Figure: an utterance divided into segments 1 to 4, each with l_segment = l_query; the query and the segments are framed with l_window = l_query/100]
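The segmentation and framing rules on this slide can be illustrated with a minimal sketch. It assumes non-overlapping segments and a pre-emphasis coefficient of 0.97; neither value is stated on the slide, and the function names are hypothetical.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1] (alpha is an assumption)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def split_into_segments(utterance, query_len):
    """Cut the utterance into consecutive segments with l_segment = l_query.
    Non-overlapping segments are an assumption; a trailing remainder is dropped."""
    return [utterance[s:s + query_len]
            for s in range(0, len(utterance) - query_len + 1, query_len)]

def hamming(n):
    """Hamming window of n points."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def frame_segment(segment, query_len):
    """Frame a segment with l_window = l_query / 100 and 50% overlap, applying a Hamming window."""
    win_len = max(1, query_len // 100)
    hop = max(1, win_len // 2)
    window = hamming(win_len)
    return [[segment[start + k] * window[k] for k in range(win_len)]
            for start in range(0, len(segment) - win_len + 1, hop)]
```

Because the window length is tied to the query length, every query and every segment of the same length yields the same number of frames, whatever the query duration.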
  • 5. Feature Extraction
      [Figure: MFCC extraction chain with the blocks transformation (DFT, FFT), Mel filtering (Mel filter bank), log of amplitude spectrum, and IDFT (DCT); the output is a feature vector matrix with frames (instances) along one axis and coefficients (features) 0 to 12 along the other]
      [Figure: avgMCA for utterance segment and query, shown for the feature sets MFCCs, MFCCs+ZCR, and MFCCs+ZCR+MPEG-7 (ASS, ASC, ASF, ASE); dimension of the similarity matrix (cost matrix): 13x13]
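The slide names the minimum cost alignment (MCA) of DTW but does not give the recurrence. The following is a minimal sketch of textbook DTW, assuming a Euclidean local cost and the standard three-way step pattern; the authors' exact local cost, step pattern, and normalization are not given in the slides, and `dtw_min_cost` is a hypothetical name.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_min_cost(seq_a, seq_b):
    """Accumulated cost of the cheapest warping path between two vector sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] = minimum cost of aligning the first i items of seq_a with the first j of seq_b
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = euclidean(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j],      # step in seq_a only
                                   acc[i][j - 1],      # step in seq_b only
                                   acc[i - 1][j - 1])  # step in both
    return acc[n][m]
```

A low cost indicates that the two feature sequences can be warped onto each other cheaply, which is what makes it usable as a similarity measure between a query and a segment.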
  • 6. Support Vector Machine classifier
      linear SVM with soft and hard margin, defined by the decision hyperplane
          $$d(\mathbf{w}, \mathbf{x}, b) = \mathbf{w} \cdot \mathbf{x} + b = \sum_{i=1}^{l} w_i x_i + b \tag{1}$$
      [Figure: two panels in the (x1, x2) plane, hard margin and soft margin, each showing the decision hyperplane between Class 1 (y = +1) and Class 2 (y = −1)]
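Equation (1) can be evaluated directly. The sketch below only computes the decision value and its sign for given parameters; it does not train anything, and the rule that a value of exactly zero maps to +1 is an assumption made for the example.

```python
def decision_value(w, x, b):
    """d(w, x, b) = w . x + b, as in equation (1)."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, x, b):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if decision_value(w, x, b) >= 0 else -1
```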
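  • 7. Nonlinear SVM classifier
      mapping into the high-dimensional feature space by kernel functions
          $$d(\mathbf{x}) = \sum_{i=1}^{l} \alpha_i y_i \, \mathbf{z}(\mathbf{x}) \cdot \mathbf{z}(\mathbf{x}_i) + b \tag{2}$$
          $$K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{z}_i \cdot \mathbf{z}_j = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j) \tag{3}$$
      [Figure: points in the (x1, x2) input space mapped by Φ( ) into the feature space]
      used kernel functions
          Mathematical expression                                                          Type
          $K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j$                        Linear
          $K(\mathbf{x}_i, \mathbf{x}_j) = (\gamma \, \mathbf{x}_i \cdot \mathbf{x}_j + 1)^d$      Polynomial of degree d
          $K(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^2)$   Gaussian Radial Basis Function (RBF)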
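The three kernels in the table and the kernelized decision function of equations (2) and (3) can be written out as a short sketch. The default values for γ and d are placeholders chosen for the example, not values taken from the slides.

```python
import math

def dot(a, b):
    """Inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def linear_kernel(a, b):
    return dot(a, b)

def polynomial_kernel(a, b, gamma=1.0, degree=2):
    return (gamma * dot(a, b) + 1.0) ** degree

def rbf_kernel(a, b, gamma=1.0):
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

def kernel_decision(x, support_vectors, alphas, labels, bias, kernel):
    """d(x) = sum_i alpha_i * y_i * K(x, x_i) + b, with K replacing z(x) . z(x_i)."""
    return sum(alpha * y * kernel(x, sv)
               for alpha, y, sv in zip(alphas, labels, support_vectors)) + bias
```

With `linear_kernel`, `kernel_decision` reduces to the linear decision function of equation (1), where w is the weighted sum of the support vectors.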
  • 8. SVM based searching (classification) algorithm
      [Flow diagram, recovered from its labels:]
      the utterance is divided into Segment 1, Segment 2, Segment 3, ..., Segment N, each of length l_query
      the query (query001) and each segment are framed with l_window = l_query/100 and described by MFCCs (columns 0, 1, ..., 11, 12, 13)
      for each pair (query001, segment 1), (query001, segment 2), ..., (query001, segment N), frames are labeled +1 and −1
      Compute MCA of DTW; branch label: < threshold
      Train SVM with linear kernel and C = 1, giving the SVM model
      Compute miss(+1), miss(−1), and the number of iterations; branch label: > threshold
      Query detected
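One possible reading of this flow diagram is a two-stage search: a DTW pre-filter followed by an SVM-based score. The sketch below shows only that control flow. The direction of both threshold comparisons is read from the "< threshold" and "> threshold" labels of the flattened diagram and should be treated as an assumption, as should the mapping of +1 to query frames and −1 to segment frames. `mca_fn` and `svm_score_fn` are hypothetical callables supplied by the caller, not functions from the authors' system.

```python
from typing import Callable, List, Sequence

Frames = Sequence[Sequence[float]]  # one feature vector per frame

def search_query(
    query: Frames,
    segments: List[Frames],
    mca_fn: Callable[[Frames, Frames], float],
    svm_score_fn: Callable[[Frames, Frames], float],
    mca_threshold: float,
    score_threshold: float,
) -> List[int]:
    """Return the indices of segments in which the query is reported as detected."""
    detections = []
    for index, segment in enumerate(segments):
        # Stage 1 (assumed): keep only segments whose DTW minimum cost
        # alignment against the query is below the threshold.
        if mca_fn(query, segment) >= mca_threshold:
            continue
        # Stage 2 (assumed): svm_score_fn trains a linear SVM (C = 1) on query
        # frames (+1) versus segment frames (-1) and returns a score built from
        # miss(+1), miss(-1), and the number of iterations.
        if svm_score_fn(query, segment) > score_threshold:
            detections.append(index)
    return detections
```

Passing the two stages in as callables keeps the sketch independent of any particular DTW or SVM implementation.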
  • 9. Experimental results
      Score parameter: $\frac{\text{number of iterations}}{100} = 2.82$
      Error rate: $1 - \frac{\text{correctly predicted frames}}{\text{all tested frames}} = 0.18$
      Misclassification rate: $\frac{\text{miss}(+) + \text{miss}(-)}{\text{all predicted data}} = 0.12$
      Evaluation results of the tested algorithm
          database set    P(FA)      P(Miss)    ATWV
          evalQ-devC      1.54617    0.960      -0.052
          devQ-evalC      1.62595    0.948      -0.233
          evalQ-evalC     1.68694    0.974      -0.164
          devQ-devC       1.78786    0.943      -0.194
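The three quantities on this slide are simple ratios, sketched below. The function names are hypothetical, and the counts in the usage note are invented purely to reproduce the values printed on the slide; the slides do not report the underlying counts.

```python
def score_parameter(num_iterations):
    """Number of SVM training iterations divided by 100."""
    return num_iterations / 100

def error_rate(correct_frames, tested_frames):
    """1 - (correctly predicted frames / all tested frames)."""
    if tested_frames <= 0:
        raise ValueError("tested_frames must be positive")
    return 1 - correct_frames / tested_frames

def misclassification_rate(miss_pos, miss_neg, predicted):
    """(miss(+) + miss(-)) / all predicted data."""
    if predicted <= 0:
        raise ValueError("predicted must be positive")
    return (miss_pos + miss_neg) / predicted
```

For example, 282 iterations would give a score parameter of 2.82, 82 correct frames out of 100 an error rate of 0.18, and 12 misses out of 100 predictions a misclassification rate of 0.12.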
  • 10. Conclusions and Future Work
      Proposed a query-by-example searching system based on the minimum cost alignment of the DTW algorithm and the misclassification error rate of an unsupervised SVM.
      No other resources were used during development.
      Poor detection performance, with a high number of false alarms and missed detections, caused by the variable length of the queries and by detected terms with similar spectral characteristics within each utterance.
      Relatively high computational time (searching time) of the proposed algorithm.
      Future work: design an effective query-by-example searching system with lower computational time and fewer missed detections.
  • 11. Thank You For Your Attention