Application of Fisher Linear Discriminant Analysis
to Speech/Music Classification
Enrique Alexandre, Manuel Rosa, Lucas Cuadra, and Roberto Gil-Pita
Departamento de Teoría de la Señal y Comunicaciones
Universidad de Alcalá. 28805 - Alcalá de Henares,
Madrid, Spain
Presented By:
S. Lushanthan
Agenda
• Objective
• Time-Frequency Decomposition
• Feature Extraction
• Classification Algorithms
• Data Collection
• Results and Discussion
Objective
• The well-known K-Nearest-Neighbor (K-N-N) algorithm has been widely used in many sound classification applications. The objective here is to:
“Demonstrate the superior behavior of Fisher's Linear Discriminant algorithm compared to the K-Nearest-Neighbor algorithm”
Why Speech/Music Classification?
• The Fisher LDA classifier has rarely been applied to speech/audio classification
• If it succeeds, it could serve as a first step in many music-genre classification systems
Signal Processing – Time-Frequency Decomposition
Feature Extraction
• The literature groups features into three classes:
1. Timbre-related
2. Rhythm-related
3. Pitch-related
• For simplicity, only timbre-related features are used
• A 512-sample window is used, with no overlap between adjacent frames
• The time-frequency decomposition is performed using either a Modified Discrete Cosine Transform (MDCT) or a Discrete Fourier Transform (DFT)
• All the features are calculated, and their mean and standard deviation are computed every 43 frames (1.85 seconds at our sampling rate). Each feature thus yields a 2-dimensional vector, containing its mean and standard deviation, every 43 frames
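The framing and block statistics described above can be sketched as follows. This is a minimal illustration, assuming NumPy; the helper names `frame_signal` and `block_stats` are hypothetical, and short-time energy on a random signal stands in for any per-frame feature.

```python
import numpy as np

FRAME_LEN = 512   # samples per analysis frame (from the slide)
BLOCK_LEN = 43    # frames per statistics block (from the slide)

def frame_signal(x, frame_len=FRAME_LEN):
    """Split a 1-D signal into non-overlapping frames, dropping the remainder."""
    n_frames = len(x) // frame_len
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)

def block_stats(per_frame_values, block_len=BLOCK_LEN):
    """Return one (mean, std) row per block of `block_len` frames."""
    n_blocks = len(per_frame_values) // block_len
    blocks = per_frame_values[: n_blocks * block_len].reshape(n_blocks, block_len)
    return np.column_stack([blocks.mean(axis=1), blocks.std(axis=1)])

# Hypothetical input: a random test signal slightly longer than three blocks.
rng = np.random.default_rng(0)
signal = rng.standard_normal(FRAME_LEN * BLOCK_LEN * 3 + 100)
frames = frame_signal(signal)
ste = (frames ** 2).mean(axis=1)   # one short-time energy value per frame
features = block_stats(ste)        # one 2-dimensional vector per 43-frame block
print(features.shape)
```

Each row of `features` is the 2-dimensional (mean, standard deviation) vector that a classifier would receive for one block.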
Feature                                      Description
Spectral Centroid                            Measure of the brightness of a sound
Spectral Roll-off                            Shape of the spectrum
Zero Crossing Rate (ZCR)                     How noisy a signal is
High Zero Crossing Rate Ratio (HZCRR)        Ratio of frames whose ZCR is above 1.5 times the mean ZCR
Short-Time Energy (STE)                      Mean energy of the signal within each analysis frame
Low Short-Time Energy Ratio (LSTER)          Ratio of frames whose STE is below 0.5 times the mean STE
Mel-frequency Cepstral Coefficients (MFCC)   Compact representation of the spectral envelope
Voice2White                                  Energy inside the typical speech band (300-4000 Hz) relative to the total energy of the signal
Activity Level                               Calculated using a method for the objective measurement of active speech
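Several of the simpler features in the table can be sketched in a few lines. The formulas below are common textbook definitions, not necessarily the exact equations from the slides (which did not survive in this transcript); the sampling rate of 22050 Hz and all function names are assumptions for illustration. Roll-off, MFCC, and Activity Level are omitted.

```python
import numpy as np

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return (signs[:, 1:] != signs[:, :-1]).mean(axis=1)

def short_time_energy(frames):
    """Mean squared amplitude per frame."""
    return (frames ** 2).mean(axis=1)

def high_zcr_ratio(zcr):
    """Fraction of frames whose ZCR exceeds 1.5 times the mean ZCR."""
    return float((zcr > 1.5 * zcr.mean()).mean())

def low_ste_ratio(ste):
    """Fraction of frames whose STE is below 0.5 times the mean STE."""
    return float((ste < 0.5 * ste.mean()).mean())

def spectral_centroid(frames, fs):
    """Magnitude-weighted mean frequency (Hz) of each frame's DFT."""
    mag = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / fs)
    return (mag * freqs).sum(axis=1) / np.maximum(mag.sum(axis=1), 1e-12)

def voice2white(frames, fs, band=(300.0, 4000.0)):
    """Energy in the speech band divided by total energy, per frame."""
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return power[:, in_band].sum(axis=1) / np.maximum(power.sum(axis=1), 1e-12)

# Usage on a synthetic tone placed exactly on DFT bin 23 (about 990 Hz).
fs = 22050                      # assumed sampling rate
f0 = 23 * fs / 512
t = np.arange(512 * 43) / fs
frames = np.sin(2 * np.pi * f0 * t).reshape(43, 512)
zcr = zero_crossing_rate(frames)
ste = short_time_energy(frames)
centroid = spectral_centroid(frames, fs)
v2w = voice2white(frames, fs)
```

For this stationary tone every frame looks alike, so both ratio features are zero, the centroid sits at the tone frequency, and nearly all of the energy falls inside the speech band.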
Classification Algorithms
K-Nearest-Neighbor
• Classification Rule
Assume a training set of L vectors grouped into C different classes. To classify a new observed vector X, the algorithm simply finds the K nearest neighbors of X in the training set and assigns X a class based on the classes of those neighbors, usually by majority rule.
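The classification rule can be sketched as below. This is a minimal illustration, assuming NumPy and a Euclidean distance (the slide does not name the metric); the function name and the toy data are hypothetical.

```python
import numpy as np

def knn_classify(train_x, train_y, x, k=3):
    """Label `x` by majority vote among its k nearest training vectors."""
    dists = np.linalg.norm(train_x - x, axis=1)      # Euclidean distance (assumed metric)
    nearest = np.argsort(dists, kind="stable")[:k]   # indices of the k closest vectors
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]                 # ties go to the smallest label

# Toy training set: class 0 near the origin, class 1 near (1, 1).
train_x = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
                    [1.0, 1.1], [0.9, 1.2], [1.2, 0.8]])
train_y = np.array([0, 0, 0, 1, 1, 1])

label_a = knn_classify(train_x, train_y, np.array([0.15, 0.1]), k=3)
label_b = knn_classify(train_x, train_y, np.array([1.0, 1.0]), k=1)
print(label_a, label_b)
```

With K = 1 the rule reduces to copying the label of the single closest training vector, which is the 1-N-N classifier used in the results.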
Fisher LDA
• Data are projected onto a line, and the classification is performed in this one-dimensional space
• The class separability function in a direction w ∈ R^n is defined as:
J(w) = (w^T S_B w) / (w^T S_W w)
where S_B and S_W are the between-class and within-class scatter matrices, respectively
• Find an analytic expression for the w that maximizes J(w); for two classes with means m_1 and m_2 this is, up to scale:
w = S_W^{-1} (m_1 - m_2)
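A two-class Fisher discriminant can be sketched as follows, using the standard closed form in which w is proportional to S_W^{-1} times the difference of the class means. This is an illustration under stated assumptions, not the authors' implementation: the midpoint decision threshold, the function names, and the synthetic Gaussian data are all hypothetical choices.

```python
import numpy as np

def fit_fisher(x0, x1):
    """Return (w, threshold) for a two-class Fisher discriminant."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    s_w = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    w = np.linalg.solve(s_w, m1 - m0)        # w proportional to S_W^{-1} (m1 - m0)
    threshold = 0.5 * (w @ m0 + w @ m1)      # midpoint of projected means (assumption)
    return w, threshold

def predict_fisher(w, threshold, x):
    """Project onto w and return 1 where the projection exceeds the threshold."""
    return (x @ w > threshold).astype(int)

# Synthetic 2-dimensional feature vectors for two well-separated classes.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
x1 = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(200, 2))

w, thr = fit_fisher(x0, x1)
wrong = np.r_[predict_fisher(w, thr, x0), 1 - predict_fisher(w, thr, x1)]
error = wrong.mean()
print(f"training error: {100 * error:.2f}%")
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse of S_W explicitly, which is usually the more stable choice.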
Data Collection
• Corpus for speech/music classification provided by Dan Ellis, originally recorded by Eric Scheirer during his internship at Interval Research Corporation
“Music-Speech” Corpus
Training data (45 minutes in total, 15 seconds per file):
  music                                60 files
  speech                               60 files
  m + s                                60 files
Test data (15.25 minutes in total, 2.5 seconds per file):
  speech (without background music)    120 files
  music with no vocals                 126 files
  music with vocals                    120 files
Results and Discussion
• Probability of error for Fisher LDA, 1-N-N, and 3-N-N, using each feature individually

Feature            Fisher    1-N-N     3-N-N
Centroid (MDCT)    8.74%     17.48%    21.85%
Centroid (DFT)     16.66%    29.23%    30.60%
Roll-off (MDCT)    14.48%    25.40%    21.85%
Roll-off (DFT)     8.19%     13.11%    13.11%
ZCR                9.83%     19.67%    18.03%
HZCRR              25.13%    39.89%    36.33%
STE                48.63%    22.40%    22.67%
LSTER              11.74%    33.87%    23.77%
MFCC               4.09%     22.13%    26.50%
Voice2White        4.91%     6.28%     6.01%
Activity level     12.84%    18.03%    18.85%
Combining two or more of these features does not seem to improve the results. For example, using the MFCC and Voice2White features together with a Fisher linear discriminant classifier leads to a probability of error of 4.09%, the same as with MFCC alone.
Confusion matrices using the Voice2White feature
Rows give the input class; columns give the class assigned by the classifier.

Classifier    Input     Speech    Music
Fisher        Speech    104       16
              Music     2         244
1-N-N         Speech    114       6
              Music     17        229
3-N-N         Speech    116       4
              Music     18        228
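As a quick consistency check, the overall probability of error can be recomputed from these confusion matrices. The sketch below assumes NumPy and assumes that rows are the true class and columns the classifier output, which is inferred from the counts rather than stated on the slide.

```python
import numpy as np

# Rows: true class (speech, music); columns: classifier output (speech, music).
confusion = {
    "Fisher": np.array([[104, 16], [2, 244]]),
    "1-N-N": np.array([[114, 6], [17, 229]]),
    "3-N-N": np.array([[116, 4], [18, 228]]),
}

def error_rate(cm):
    """Fraction of off-diagonal (misclassified) test files."""
    return 1.0 - np.trace(cm) / cm.sum()

rates = {name: error_rate(cm) for name, cm in confusion.items()}
for name, rate in rates.items():
    print(f"{name}: {100 * rate:.2f}%")
```

The matrices contain 18, 23, and 22 errors out of 366 test files, which agrees with the Voice2White row of the results table to within rounding.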
Fisher LDA has a higher probability of error when the input is speech, whereas K-N-N has a higher probability of error when the input is music.
So why not combine the classifiers using a majority rule for better results?
The probability of error drops to 4.5%.
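A majority-rule combination of binary decisions can be sketched as below. The slide does not say which classifiers were combined, so using three of them (Fisher, 1-N-N, 3-N-N) is an assumption, and the per-file predictions are made up for illustration.

```python
import numpy as np

def majority_vote(*predictions):
    """Combine binary (0/1) label arrays; label 1 wins with more than half the votes."""
    votes = np.stack(predictions)            # shape: (n_classifiers, n_samples)
    return (2 * votes.sum(axis=0) > len(predictions)).astype(int)

# Hypothetical outputs of three classifiers on five test files (0 = speech, 1 = music).
fisher = np.array([0, 1, 1, 0, 1])
knn_1 = np.array([0, 1, 0, 0, 0])
knn_3 = np.array([1, 1, 0, 0, 1])

combined = majority_vote(fisher, knn_1, knn_3)
print(combined)  # [0 1 0 0 1]
```

Voting helps most when the classifiers err on different inputs, which is the situation the confusion matrices suggest. An odd number of voters avoids ties.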
Conclusion
• Fisher linear discriminant analysis can provide very promising results using only one feature for classification
• Better results may be obtained by combining the outputs of two or more classifiers
Thank you!

More Related Content

What's hot

NICE Implementations of Variational Inference
NICE Implementations of Variational Inference NICE Implementations of Variational Inference
NICE Implementations of Variational Inference Natan Katz
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based ClusteringSSA KPI
 
CVPR2010: Advanced ITinCVPR in a Nutshell: part 6: Mixtures
CVPR2010: Advanced ITinCVPR in a Nutshell: part 6: MixturesCVPR2010: Advanced ITinCVPR in a Nutshell: part 6: Mixtures
CVPR2010: Advanced ITinCVPR in a Nutshell: part 6: Mixtureszukun
 
Instance Based Learning in Machine Learning
Instance Based Learning in Machine LearningInstance Based Learning in Machine Learning
Instance Based Learning in Machine LearningPavithra Thippanaik
 
Optics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structureOptics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structureRajesh Piryani
 
Interactive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationInteractive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationQuentin Pleplé
 
Error Estimates for Multi-Penalty Regularization under General Source Condition
Error Estimates for Multi-Penalty Regularization under General Source ConditionError Estimates for Multi-Penalty Regularization under General Source Condition
Error Estimates for Multi-Penalty Regularization under General Source Conditioncsandit
 
Spectral Clustering Report
Spectral Clustering ReportSpectral Clustering Report
Spectral Clustering ReportMiaolan Xie
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Admission in india 2015
Admission in india 2015Admission in india 2015
Admission in india 2015Edhole.com
 
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsRefining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsJames Bell
 
Smooth entropies a tutorial
Smooth entropies a tutorialSmooth entropies a tutorial
Smooth entropies a tutorialwtyru1989
 
icml2004 tutorial on spectral clustering part II
icml2004 tutorial on spectral clustering part IIicml2004 tutorial on spectral clustering part II
icml2004 tutorial on spectral clustering part IIzukun
 
icml2004 tutorial on spectral clustering part I
icml2004 tutorial on spectral clustering part Iicml2004 tutorial on spectral clustering part I
icml2004 tutorial on spectral clustering part Izukun
 
Probabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsProbabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsSelman Bozkır
 
presentation
presentationpresentation
presentationjie ren
 

What's hot (20)

NICE Implementations of Variational Inference
NICE Implementations of Variational Inference NICE Implementations of Variational Inference
NICE Implementations of Variational Inference
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based Clustering
 
CVPR2010: Advanced ITinCVPR in a Nutshell: part 6: Mixtures
CVPR2010: Advanced ITinCVPR in a Nutshell: part 6: MixturesCVPR2010: Advanced ITinCVPR in a Nutshell: part 6: Mixtures
CVPR2010: Advanced ITinCVPR in a Nutshell: part 6: Mixtures
 
Instance Based Learning in Machine Learning
Instance Based Learning in Machine LearningInstance Based Learning in Machine Learning
Instance Based Learning in Machine Learning
 
Optics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structureOptics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structure
 
Interactive Latent Dirichlet Allocation
Interactive Latent Dirichlet AllocationInteractive Latent Dirichlet Allocation
Interactive Latent Dirichlet Allocation
 
Error Estimates for Multi-Penalty Regularization under General Source Condition
Error Estimates for Multi-Penalty Regularization under General Source ConditionError Estimates for Multi-Penalty Regularization under General Source Condition
Error Estimates for Multi-Penalty Regularization under General Source Condition
 
Db Scan
Db ScanDb Scan
Db Scan
 
Spectral Clustering Report
Spectral Clustering ReportSpectral Clustering Report
Spectral Clustering Report
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
Admission in india 2015
Admission in india 2015Admission in india 2015
Admission in india 2015
 
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsRefining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
 
Smooth entropies a tutorial
Smooth entropies a tutorialSmooth entropies a tutorial
Smooth entropies a tutorial
 
Combinatorial Optimization
Combinatorial OptimizationCombinatorial Optimization
Combinatorial Optimization
 
icml2004 tutorial on spectral clustering part II
icml2004 tutorial on spectral clustering part IIicml2004 tutorial on spectral clustering part II
icml2004 tutorial on spectral clustering part II
 
icml2004 tutorial on spectral clustering part I
icml2004 tutorial on spectral clustering part Iicml2004 tutorial on spectral clustering part I
icml2004 tutorial on spectral clustering part I
 
Dft
DftDft
Dft
 
Probabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsProbabilistic information retrieval models & systems
Probabilistic information retrieval models & systems
 
presentation
presentationpresentation
presentation
 
main
mainmain
main
 

Viewers also liked

Introduction to Functional Data Analysis
Introduction to Functional Data AnalysisIntroduction to Functional Data Analysis
Introduction to Functional Data AnalysisRené Franck Essomba
 
face recognition system
face recognition systemface recognition system
face recognition systemAnil Kumar
 
Morphological Analyzer and Generator for Tamil Language
Morphological Analyzer and Generator for Tamil LanguageMorphological Analyzer and Generator for Tamil Language
Morphological Analyzer and Generator for Tamil LanguageLushanthan Sivaneasharajah
 
LDA presentation
LDA presentationLDA presentation
LDA presentationMohit Gupta
 
4 new-patch-agggregation.pptx
4 new-patch-agggregation.pptx4 new-patch-agggregation.pptx
4 new-patch-agggregation.pptxmustafa sarac
 
Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15
Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15
Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15MLconf
 
Kernel fisher discriminant
Kernel fisher discriminantKernel fisher discriminant
Kernel fisher discriminantĐỗ Hợp
 
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super VectorLec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super VectorUnited States Air Force Academy
 
Face recognition using laplacian faces
Face recognition using laplacian facesFace recognition using laplacian faces
Face recognition using laplacian facesPulkiŧ Sharma
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysisBhasker Rajan
 
T18 discriminant analysis
T18 discriminant analysisT18 discriminant analysis
T18 discriminant analysiskompellark
 
discriminant analysis
discriminant analysisdiscriminant analysis
discriminant analysiskrishnadk
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysisMurali Raj
 

Viewers also liked (20)

Introduction to Functional Data Analysis
Introduction to Functional Data AnalysisIntroduction to Functional Data Analysis
Introduction to Functional Data Analysis
 
face recognition system
face recognition systemface recognition system
face recognition system
 
Morphological Analyzer and Generator for Tamil Language
Morphological Analyzer and Generator for Tamil LanguageMorphological Analyzer and Generator for Tamil Language
Morphological Analyzer and Generator for Tamil Language
 
LDA presentation
LDA presentationLDA presentation
LDA presentation
 
4 new-patch-agggregation.pptx
4 new-patch-agggregation.pptx4 new-patch-agggregation.pptx
4 new-patch-agggregation.pptx
 
Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15
Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15
Isabelle Guyon, President, ChaLearn at MLconf SF - 11/13/15
 
Kernel fisher discriminant
Kernel fisher discriminantKernel fisher discriminant
Kernel fisher discriminant
 
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super VectorLec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
 
Microsoft Web Technology Stack
Microsoft Web Technology StackMicrosoft Web Technology Stack
Microsoft Web Technology Stack
 
Face recognition using LDA
Face recognition using LDAFace recognition using LDA
Face recognition using LDA
 
Face recognition using laplacian faces
Face recognition using laplacian facesFace recognition using laplacian faces
Face recognition using laplacian faces
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 
LDA
LDALDA
LDA
 
PCA vs LDA
PCA vs LDAPCA vs LDA
PCA vs LDA
 
T18 discriminant analysis
T18 discriminant analysisT18 discriminant analysis
T18 discriminant analysis
 
Understandig PCA and LDA
Understandig PCA and LDAUnderstandig PCA and LDA
Understandig PCA and LDA
 
Lda
LdaLda
Lda
 
discriminant analysis
discriminant analysisdiscriminant analysis
discriminant analysis
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 

Similar to Application of Fisher Linear Discriminant Analysis to Speech/Music Classification

sound level meter octave band ananlyser.pptx
sound level meter octave band ananlyser.pptxsound level meter octave band ananlyser.pptx
sound level meter octave band ananlyser.pptxpriyankatabhane
 
A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...
A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...
A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...April Smith
 
129966863723746268[1]
129966863723746268[1]129966863723746268[1]
129966863723746268[1]威華 王
 
Acoustic fMRI noise reduction: a perceived loudness approach
Acoustic fMRI noise reduction: a perceived loudness approachAcoustic fMRI noise reduction: a perceived loudness approach
Acoustic fMRI noise reduction: a perceived loudness approachDimitri Vrehen
 
A Novel Method for Silence Removal in Sounds Produced by Percussive Instruments
A Novel Method for Silence Removal in Sounds Produced by Percussive InstrumentsA Novel Method for Silence Removal in Sounds Produced by Percussive Instruments
A Novel Method for Silence Removal in Sounds Produced by Percussive InstrumentsIJMTST Journal
 
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...IRJET Journal
 
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATIONHUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATIONIRJET Journal
 
Analysis of PEAQ Model using Wavelet Decomposition Techniques
Analysis of PEAQ Model using Wavelet Decomposition TechniquesAnalysis of PEAQ Model using Wavelet Decomposition Techniques
Analysis of PEAQ Model using Wavelet Decomposition Techniquesidescitation
 
An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...csandit
 
AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...
AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...
AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...cscpconf
 
An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...csandit
 
A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...
A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...
A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...CSCJournals
 
Graphical visualization of musical emotions
Graphical visualization of musical emotionsGraphical visualization of musical emotions
Graphical visualization of musical emotionsPranay Prasoon
 
129966864160453838[1]
129966864160453838[1]129966864160453838[1]
129966864160453838[1]威華 王
 

Similar to Application of Fisher Linear Discriminant Analysis to Speech/Music Classification (20)

sound level meter octave band ananlyser.pptx
sound level meter octave band ananlyser.pptxsound level meter octave band ananlyser.pptx
sound level meter octave band ananlyser.pptx
 
A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...
A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...
A Combined Sub-Band And Reconstructed Phase Space Approach To Phoneme Classif...
 
F010334548
F010334548F010334548
F010334548
 
129966863723746268[1]
129966863723746268[1]129966863723746268[1]
129966863723746268[1]
 
Acoustic fMRI noise reduction: a perceived loudness approach
Acoustic fMRI noise reduction: a perceived loudness approachAcoustic fMRI noise reduction: a perceived loudness approach
Acoustic fMRI noise reduction: a perceived loudness approach
 
A Novel Method for Silence Removal in Sounds Produced by Percussive Instruments
A Novel Method for Silence Removal in Sounds Produced by Percussive InstrumentsA Novel Method for Silence Removal in Sounds Produced by Percussive Instruments
A Novel Method for Silence Removal in Sounds Produced by Percussive Instruments
 
Speech Signal Processing
Speech Signal ProcessingSpeech Signal Processing
Speech Signal Processing
 
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...
 
example based audio editing
example based audio editingexample based audio editing
example based audio editing
 
T26123129
T26123129T26123129
T26123129
 
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATIONHUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
HUFFMAN CODING ALGORITHM BASED ADAPTIVE NOISE CANCELLATION
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Analysis of PEAQ Model using Wavelet Decomposition Techniques
Analysis of PEAQ Model using Wavelet Decomposition TechniquesAnalysis of PEAQ Model using Wavelet Decomposition Techniques
Analysis of PEAQ Model using Wavelet Decomposition Techniques
 
An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...
 
AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...
AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...
AN EFFICIENT PEAK VALLEY DETECTION BASED VAD ALGORITHM FOR ROBUST DETECTION O...
 
An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...An efficient peak valley detection based vad algorithm for robust detection o...
An efficient peak valley detection based vad algorithm for robust detection o...
 
A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...
A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...
A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic F...
 
Graphical visualization of musical emotions
Graphical visualization of musical emotionsGraphical visualization of musical emotions
Graphical visualization of musical emotions
 
S@P Noise.pptx
S@P Noise.pptxS@P Noise.pptx
S@P Noise.pptx
 
129966864160453838[1]
129966864160453838[1]129966864160453838[1]
129966864160453838[1]
 

Recently uploaded

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...

Application of Fisher Linear Discriminant Analysis to Speech/Music Classification

  • 1. Application of Fisher Linear Discriminant Analysis to Speech/Music Classification
      Enrique Alexandre, Manuel Rosa, Lucas Cuadra, and Roberto Gil-Pita
      Departamento de Teoría de la Señal y Comunicaciones, Universidad de Alcalá, 28805 Alcalá de Henares, Madrid, Spain
      Presented by: S. Lushanthan
  • 2. Agenda
      - Objective
      - Time-Frequency Decomposition
      - Feature Extraction
      - Classification Algorithms
      - Data Collection
      - Results and Discussion
  • 3. Objective
      - The well-known K-Nearest-Neighbor (K-NN) algorithm has been widely used in many sound classification applications. The objective here is to “demonstrate the superior behavior of the Fisher linear discriminant algorithm compared to the K-Nearest-Neighbor algorithm”.
      Why Speech/Music Classification?
      - The Fisher LDA classifier has not been tried much in the domain of speech/audio classification.
      - If this succeeds, it would be a first step in many music-genre classification systems.
  • 4. Signal Processing: Time-Frequency Decomposition
  • 5. Feature Extraction
      - The literature groups features into 3 different classes:
          1. Timbre-related
          2. Rhythm-related
          3. Pitch-related
      - For simplification purposes, only timbre-related features are used.
      - A 512-sample window is used, with no overlap between adjacent frames.
      - The time-frequency decomposition is performed using either a Modified Discrete Cosine Transform (MDCT) or a Discrete Fourier Transform (DFT).
      - All the features are calculated for each frame, and their mean and standard deviation are computed every 43 frames (1.85 seconds at our sampling rate). Thus each feature yields a 2-dimensional vector, containing the mean and the standard deviation, every 43 frames.
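The framing and summarization steps of the Feature Extraction slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the synthetic noise input, and the use of mean frame energy as the per-frame feature are all assumptions made for the example. Only the 512-sample frame length, the absence of overlap, and the 43-frame summary block come from the slide.

```python
import numpy as np

FRAME_LEN = 512        # samples per analysis frame (from the slide)
FRAMES_PER_BLOCK = 43  # frames summarized by one mean/std pair (from the slide)

def frame_signal(x, frame_len=FRAME_LEN):
    """Split a 1-D signal into non-overlapping frames, dropping the tail."""
    n_frames = len(x) // frame_len
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)

def block_statistics(per_frame_values, block=FRAMES_PER_BLOCK):
    """Return an (n_blocks, 2) array holding [mean, std] for each block."""
    n_blocks = len(per_frame_values) // block
    blocks = per_frame_values[: n_blocks * block].reshape(n_blocks, block)
    return np.column_stack([blocks.mean(axis=1), blocks.std(axis=1)])

# Hypothetical input: synthetic noise standing in for an audio excerpt.
rng = np.random.default_rng(0)
signal = rng.standard_normal(FRAME_LEN * FRAMES_PER_BLOCK * 3 + 100)

frames = frame_signal(signal)
energy = (frames ** 2).mean(axis=1)   # placeholder per-frame feature
vectors = block_statistics(energy)    # one 2-D vector per 43 frames
```

Each row of `vectors` is the 2-dimensional (mean, standard deviation) vector that a classifier would receive for one feature.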
  • 6. Features and their descriptions
      Feature                                      Description
      Spectral Centroid                            Measure of the brightness of a sound
      Spectral Roll-off                            Shape of the spectrum
      Zero Crossing Rate (ZCR)                     How noisy a signal is
      High Zero Crossing Rate Ratio                Ratio of frames whose ZCR is above 1.5 times the mean ZCR
      Short-Time Energy (STE)                      Mean energy of the signal within each analysis frame
      Low Short-Time Energy Ratio                  Ratio of frames whose STE is below 0.5 times the mean STE
      Mel-frequency Cepstral Coefficients (MFCC)   Compact representation of the spectral envelope
      Voice2White                                  Measure of the energy inside the typical speech band (300-4000 Hz) with respect to the whole energy of the signal
      Activity Level                               Calculated using a method for the objective measurement of active speech
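A few of the features in the table above can be sketched as follows. The slide's equations are not available here, so these are common textbook definitions assumed for illustration (for example, the centroid is expressed as a magnitude-weighted mean DFT bin index); the paper's exact normalizations may differ. The 1.5 and 0.5 thresholds are taken from the slide; all function names are hypothetical.

```python
import numpy as np

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return (signs[:, 1:] != signs[:, :-1]).mean(axis=1)

def short_time_energy(frames):
    """Mean energy of the samples inside each frame."""
    return (frames ** 2).mean(axis=1)

def high_zcr_ratio(zcr):
    """Ratio of frames whose ZCR exceeds 1.5 times the mean ZCR."""
    return float((zcr > 1.5 * zcr.mean()).mean())

def low_ste_ratio(ste):
    """Ratio of frames whose STE falls below 0.5 times the mean STE."""
    return float((ste < 0.5 * ste.mean()).mean())

def spectral_centroid(frames):
    """Magnitude-weighted mean DFT bin index, per frame (assumed definition)."""
    mag = np.abs(np.fft.rfft(frames, axis=1))
    bins = np.arange(mag.shape[1])
    return (mag * bins).sum(axis=1) / np.maximum(mag.sum(axis=1), 1e-12)

# Toy frames of 4 samples each, purely for illustration.
toy = np.array([
    [1.0, -1.0, 1.0, -1.0],   # alternating signs: highest-frequency content
    [1.0, 1.0, 1.0, 1.0],     # constant: no crossings, energy at bin 0
    [0.1, 0.1, -0.1, -0.1],   # quiet frame with one sign change
    [2.0, 2.0, 2.0, 2.0],     # loud constant frame
])
zcr = zero_crossing_rate(toy)
ste = short_time_energy(toy)
hzcrr = high_zcr_ratio(zcr)
lster = low_ste_ratio(ste)
centroid = spectral_centroid(toy)
```

In this toy input only the alternating frame counts as "high ZCR" and only the quiet frame counts as "low STE".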
  • 7. Classification Algorithms: K-Nearest-Neighbor
      - Classification rule: Assume that we have a training set with L vectors grouped into C different classes. To obtain the class corresponding to a new observed vector X, the algorithm simply looks for the K nearest neighbors of the test vector X and weighs the classes they belong to, usually using a majority rule.
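The K-NN rule described on this slide can be sketched as below. The slide does not name a distance metric, so Euclidean distance is an assumption here, and the toy training vectors and labels are invented for the example.

```python
from collections import Counter

import numpy as np

def knn_classify(train_x, train_y, x, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    dists = np.linalg.norm(train_x - x, axis=1)   # assumed: Euclidean distance
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training set (e.g. mean and std of one feature).
train_x = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],   # class "speech"
    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9],   # class "music"
])
train_y = ["speech"] * 3 + ["music"] * 3

label_a = knn_classify(train_x, train_y, np.array([0.15, 0.10]), k=3)
label_b = knn_classify(train_x, train_y, np.array([0.95, 1.00]), k=1)
```

With two classes, an odd K avoids tied votes, which fits the 1-NN and 3-NN configurations reported in the results.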
  • 8. Fisher LDA
      - Data are projected onto a line, and the classification is performed in this one-dimensional space.
      - The class separability function in a direction w ∈ R^n is defined as the Rayleigh quotient J(w) = (w^T S_B w) / (w^T S_W w), where S_B and S_W are the between-class and within-class scatter matrices respectively.
      - The task is to find an analytic expression for the w that maximizes J(w).
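For two classes, the standard textbook maximizer of the Fisher criterion is w proportional to S_W^{-1}(m1 - m0), where m0 and m1 are the class means. The sketch below assumes that closed form and a midpoint decision threshold, neither of which is stated on the slide; the Gaussian toy data and all names are hypothetical.

```python
import numpy as np

def scatter_matrices(x0, x1):
    """Class means plus between-class and within-class scatter matrices."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    s_w = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    diff = (m1 - m0)[:, None]
    s_b = diff @ diff.T
    return m0, m1, s_b, s_w

def separability(w, s_b, s_w):
    """Rayleigh quotient J(w) = (w^T S_B w) / (w^T S_W w)."""
    return (w @ s_b @ w) / (w @ s_w @ w)

def fit_fisher(x0, x1):
    """Return the projection direction and a midpoint threshold (assumed rule)."""
    m0, m1, _, s_w = scatter_matrices(x0, x1)
    w = np.linalg.solve(s_w, m1 - m0)       # w proportional to S_W^{-1}(m1 - m0)
    threshold = 0.5 * (w @ m0 + w @ m1)
    return w, threshold

def predict_fisher(w, threshold, x):
    """Project onto w and compare with the threshold: 0 or 1 per row."""
    return (x @ w > threshold).astype(int)

# Hypothetical, well-separated Gaussian classes.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
x1 = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(200, 2))

w, threshold = fit_fisher(x0, x1)
_, _, s_b, s_w = scatter_matrices(x0, x1)
pred = predict_fisher(w, threshold, np.vstack([x0, x1]))
truth = np.array([0] * 200 + [1] * 200)
accuracy = (pred == truth).mean()
```

Because J(w) is invariant to the scale of w, only the direction matters; the threshold choice is a separate design decision and could instead be tuned on training data.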
  • 9. Data Collection
      - Corpus for speech/music classification provided by Dan Ellis, originally recorded by Eric Scheirer during his internship at Interval Research Corporation.
  • 10. “Music-Speech” Corpus
      Training data (45 minutes in total, 15 seconds per file):
        - music (60 files)
        - speech (60 files)
        - music + speech (60 files)
      Test data (15.25 minutes in total, 2.5 seconds per file):
        - speech without background music (120 files)
        - music with no vocals (126 files)
        - music with vocals (120 files)
  • 11. Results and Discussion
      - Fisher LDA, 1-NN, and 3-NN were evaluated for each feature individually.
      - Probability of error:
          Feature            Fisher    1-NN      3-NN
          Centroid (MDCT)     8.74%    17.48%    21.85%
          Centroid (DFT)     16.66%    29.23%    30.60%
          Roll-off (MDCT)    14.48%    25.40%    21.85%
          Roll-off (DFT)      8.19%    13.11%    13.11%
          ZCR                 9.83%    19.67%    18.03%
          HZCRR              25.13%    39.89%    36.33%
          STE                48.63%    22.40%    22.67%
          LSTER              11.74%    33.87%    23.77%
          MFCC                4.09%    22.13%    26.50%
          Voice2White         4.91%     6.28%     6.01%
          Activity level     12.84%    18.03%    18.85%
      - Combining two or more of these features does not seem to improve the results. For example, using the MFCC and Voice2White features together with a Fisher linear discriminant classifier leads to a probability of error of 4.09%, the same as with MFCC alone.
  • 12. Confusion matrices using the Voice2White feature (rows: input class; columns: assigned class)
          Classifier   Input     Speech   Music
          Fisher       Speech    104       16
                       Music       2      244
          1-NN         Speech    114        6
                       Music      17      229
          3-NN         Speech    116        4
                       Music      18      228
      - Fisher LDA has a high probability of error when the input is speech.
      - K-NN has a high probability of error when the input is music.
      - So why not combine the classifiers using a majority rule for better results? The probability of error drops to 4.5%.
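The overall and per-class error rates implied by the confusion matrices on this slide can be recomputed as sketched below, treating rows as the input class and columns as the assigned class. The `majority_vote` helper only illustrates the combination rule on made-up labels: the slides do not give per-sample predictions, so the reported 4.5% cannot be reproduced from them.

```python
import numpy as np

# Counts copied from the slide: rows = input (speech, music),
# columns = assigned class (speech, music).
confusions = {
    "Fisher": np.array([[104, 16], [2, 244]]),
    "1-NN": np.array([[114, 6], [17, 229]]),
    "3-NN": np.array([[116, 4], [18, 228]]),
}

def error_rate(cm):
    """Overall probability of error: off-diagonal counts over all counts."""
    return 1.0 - np.trace(cm) / cm.sum()

def per_class_error(cm):
    """Probability of error for each input class (one value per row)."""
    return 1.0 - np.diag(cm) / cm.sum(axis=1)

def majority_vote(predictions):
    """Combine 0/1 labels of shape (n_classifiers, n_samples) by majority."""
    predictions = np.asarray(predictions)
    return (predictions.sum(axis=0) * 2 > predictions.shape[0]).astype(int)

overall = {name: error_rate(cm) for name, cm in confusions.items()}
by_class = {name: per_class_error(cm) for name, cm in confusions.items()}

# Hypothetical labels from three classifiers on three samples.
combined = majority_vote([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
```

The per-class values make the slide's remark concrete: Fisher misclassifies 16 of 120 speech inputs but only 2 of 246 music inputs, while the K-NN classifiers make most of their errors on music.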
  • 13. Conclusion
      - Fisher linear discriminant analysis can provide very promising results using only one feature for the classification.
      - Better results may be obtained by combining the outputs of two or more classifiers.

Editor's Notes

  1. Discrete Fourier Transform
  2. J(w) equation: Rayleigh quotient