Robust ASR system : Malayalam
1. Introduction
Implementation Methodology
HTK Implementation
Analysis and Result
Future Work
Conclusion
Robust ASR system : Malayalam
Carrol Xavier,
Mohammed Musfir,
Rahmathulla,
Supriya,
Yasif
Guided by:
Mr. Edet Bijoy K
Assistant Professor
Department of ECE
MES College of Engineering
May 3, 2012
Objective
To implement a prototype recognizer for the Malayalam digits 0-9 using a hidden Markov model (HMM) of speech
Contents
1 Introduction
Speech
Automatic Speech Recognition
Approaches of ASR
2 Implementation Methodology
Implementation Challenges
Database Preparation
Feature Extraction
3 HTK Implementation
What is HTK?
HTK Familiarisation
4 Analysis and Result
5 Future Work
6 Conclusion
What is Speech?
Produced when air from the lungs passes through the glottis, throat and mouth
Excitation in three ways:
Voiced excitation
Unvoiced excitation
Transient excitation
Some sounds - combinations of the three excitations
Spectral changes - shaped by the vocal tract
Pictorial Representation of “SHOP”
Characteristics of Speech
Bandwidth - 4 kHz
Fundamental frequency - depends on the type of articulation
Peaks in the Spectrum -
Voiced excitation - P(f) - triangular pulse
Unvoiced excitation - a white noise generator
Pitch Extraction:
Gold-Rabiner pitch tracker
Autocorrelation pitch tracker
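The autocorrelation pitch tracker named above can be sketched as follows. This is a minimal illustration, not the tracker used in the project: the function name, the 50-400 Hz search range and the 8 kHz sampling rate in the demo are assumptions chosen to match a 4 kHz speech bandwidth.

```python
import numpy as np

def estimate_pitch_autocorr(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    Hypothetical helper: the search range f_min..f_max is an assumption.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()  # remove DC so it does not dominate
    # One-sided autocorrelation: ac[lag] = sum_n frame[n] * frame[n + lag]
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Only lags that correspond to plausible pitch periods are searched.
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(ac) - 1)
    best_lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / best_lag

if __name__ == "__main__":
    sr = 8000                              # assumed sampling rate
    t = np.arange(400) / sr                # one 50 ms frame
    tone = np.sin(2 * np.pi * 200.0 * t)   # synthetic "voiced" frame
    print(estimate_pitch_autocorr(tone, sr))
```

The strongest autocorrelation peak inside the allowed lag range is taken as the pitch period; real speech usually needs voicing detection and smoothing across frames on top of this.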
Pitch Extraction - Autocorrelation
Formant Frequency
Concentration of acoustic energy around a particular frequency
Occur at roughly 1000 Hz intervals
Caused by resonances in the vocal tract
Spectrogram - darkness indicates the strength of a formant
Spectrogram
Speech Production Model
$$S(f) = \bigl(v\,P(f) + u\,N(f)\bigr)\,H(f)\,R(f) = X(f)\,H(f)\,R(f)$$
The mixture between voiced and unvoiced excitation is determined by v and u
The fundamental frequency is determined by P(f)
The spectral shaping is determined by H(f)
The signal amplitude depends on v and u
About Automatic Speech Recognition
Automatic Speech Recognition - advancing and challenging
Most research work - English, Arabic, Mandarin
Native Indian languages - minimal work
Industry - AT&T, Nuance, IBM
Open source - VoxForge
Classifying ASR system
System contains two subsystems:
ASR - transcribes natural speech
SU - understands the meaning of the transcribed speech
ASR systems are classified as:
DVI - Direct Voice Input
LVCSR - Large Vocabulary Continuous Speech Recognition
Block Diagram of ASR
Acoustic Properties - Linguistic representation
Initial acquisition - Signal transduction or Recording
Feature extraction - Spectral Analysis
Segmentation - Phoneme Boundary Recognition
Approaches of ASR
Template Based Approach
Knowledge Based Approach
Statistical Approach
Conversational Recognition
Recognition using Learning Approach
Artificial Intelligence in Recognition
Implementation Challenges
Successive Recognition - Artificial Pauses
Continuous speech recognition - Coarticulation
Physiological parameters
Prosody and Temporal features
Database Preparation
Most important phase for training and recognition accuracy
50 people - 25 males and 25 females
10 words repeated 20 times each
10,000 recorded utterances in total (50 x 10 x 20)
35 speakers used for training and 15 reserved for recognition
Utterances converted to the cepstral domain
Optimization for HMM parameter determination
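The database figures above can be checked with a few lines of arithmetic. This sketch assumes that every speaker recorded every digit the stated number of times; the function name is hypothetical.

```python
def utterance_counts(speakers, words=10, repetitions=20):
    """Number of recorded utterances for a group of speakers.

    Assumes each speaker says each word the same number of times.
    """
    return speakers * words * repetitions

total = utterance_counts(50)      # whole database
training = utterance_counts(35)   # speakers used for training
testing = utterance_counts(15)    # speakers reserved for recognition
print(total, training, testing)
```

Under that assumption the 35/15 speaker split corresponds to a 70/30 split of the recorded utterances.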
Feature Extraction
Temporal features - SPEAKER recognition
Spectral features - SPEECH recognition
Critical band filter
Cepstral Analysis
$$S(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j 2\pi n k / N} \tag{1}$$
$$\hat{S}(k) = \log S(k) \tag{2}$$
$$\hat{s}(n) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{S}(k)\, e^{j 2\pi n k / N} \tag{3}$$
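The three cepstral analysis steps (DFT, logarithm, inverse DFT) can be sketched with NumPy. This is an illustrative sketch only: it takes the logarithm of the spectral magnitude (the real cepstrum), which is an assumption, and the small constant added before the logarithm is a guard against log(0) that is not part of the slide.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum of one frame: DFT, log magnitude, inverse DFT."""
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.fft(frame)                 # step (1): S(k)
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # step (2): log |S(k)|
    return np.fft.ifft(log_mag).real             # step (3): back to the lag domain

if __name__ == "__main__":
    frame = np.arange(1.0, 9.0)   # hypothetical 8-sample frame
    print(real_cepstrum(frame))
```

Because a gain change multiplies the whole spectrum by a constant, it only shifts the zeroth cepstral coefficient, which is one reason cepstral features are convenient for recognition.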
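MFCC
Take the Fourier transform of a windowed signal
Map the power spectrum onto the mel scale
Take the log of the power at each mel frequency
Take the DCT of the mel log powers
The MFCCs are the amplitudes of the resulting spectrum
Variations: normalising the values, raising the log mel amplitudes to higher powers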
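The MFCC steps listed above can be sketched for a single frame using NumPy only. Everything concrete here is an assumption made for illustration: the Hamming window, 20 triangular filters, 12 coefficients, the common 2595 mel constant and the function names. HTK's own front end (HCopy) differs in detail.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):            # rising edge
            fbank[i - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):           # falling edge
            fbank[i - 1, k] = (right - k) / (right - centre)
    return fbank

def mfcc_frame(frame, sample_rate, n_filters=20, n_ceps=12):
    """MFCCs of one frame: window, FFT power, mel filters, log, DCT."""
    frame = np.asarray(frame, dtype=float)
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    mel_energy = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_mel = np.log(mel_energy + 1e-10)         # guard against log(0)
    # DCT-II written out explicitly to avoid a SciPy dependency.
    n = np.arange(n_filters)
    k = np.arange(n_ceps + 1)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return (dct_basis @ log_mel)[1:]             # drop c0, keep c1..c12

if __name__ == "__main__":
    sr = 8000
    frame = np.sin(2 * np.pi * 440.0 * np.arange(256) / sr)
    print(mfcc_frame(frame, sr))
```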
MFCC for HTK
Basic MFCCs are static
Performance improves when time derivatives are appended
Delta (D)
Acceleration (A)
Third differential
Absolute energy can optionally be suppressed
Vocal Tract Length Normalisation (VTLN)
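Delta and acceleration coefficients are usually computed by a regression over neighbouring frames. The sketch below follows the regression formula described in the HTK Book as far as I understand it, d_t = sum_th th (c_{t+th} - c_{t-th}) / (2 sum_th th^2); the window of 2 frames and the replication of edge frames are assumptions, and the function name is hypothetical.

```python
import numpy as np

def delta(features, window=2):
    """Regression-based time derivative of a (frames x coefficients) array."""
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    # Replicate the first and last frames so every frame has neighbours.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(th * th for th in range(1, window + 1))
    out = np.zeros_like(features)
    for th in range(1, window + 1):
        ahead = padded[window + th: window + th + n_frames]
        behind = padded[window - th: window - th + n_frames]
        out += th * (ahead - behind)
    return out / denom

if __name__ == "__main__":
    static = np.arange(10.0)[:, None]   # hypothetical 1-D feature track
    d = delta(static)                   # delta (D)
    a = delta(d)                        # acceleration (A)
    print(np.hstack([static, d, a]))
```

Applying the same function to the delta stream gives the acceleration stream, and the three can be stacked into one observation vector per frame.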
HMM for isolated word recognition
Conventional method - concatenation of isolated words
The recognizer maps between sequences of speech vectors and symbol sequences
A one-to-one mapping is not possible, because different underlying symbols can produce similar sounds
Boundaries between symbols cannot be identified explicitly
The sequence of speech vectors corresponding to each word is assumed to be generated by a Markov model
A Markov model is a finite state machine which changes state once every time unit
A Markov Generation Model
Bayesian Interpretation - Finite State Bayesian model with Markovian prior
$$\Theta^{*} = \arg\max_{\Theta}\; P(\Theta) \sum_{s \in S} P(s \mid \Theta)\, P(Y \mid s, \Theta) \tag{4}$$
HMM for isolated word recognition
Modeling of HMM - HTK
Six state model moves through the state sequence X = 1, 2, 2, 3, 4, 4, 5, 6 to generate the sequence o1 to o6
$$P(O, X \mid M) = a_{12}\, b_2(o_1)\, a_{22}\, b_2(o_2)\, a_{23}\, b_3(o_3) \ldots$$
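The joint probability of an observation sequence and one state path is a product of alternating transition and output terms. The sketch below assumes, as in HTK-style models, that the first and last states are non-emitting entry and exit states; all probability values are made up for illustration and the function name is hypothetical.

```python
def path_likelihood(a, b, path):
    """P(O, X | M) for one state path.

    a[(i, j)] - transition probability from state i to state j
    b[(j, t)] - likelihood of observation o_t in emitting state j (t starts at 1)
    path      - state sequence including the non-emitting entry and exit states
    """
    n_obs = len(path) - 2
    prob = 1.0
    for t in range(1, n_obs + 1):
        prev_state, state = path[t - 1], path[t]
        prob *= a[(prev_state, state)] * b[(state, t)]
    prob *= a[(path[n_obs], path[n_obs + 1])]   # final move into the exit state
    return prob

if __name__ == "__main__":
    path = [1, 2, 2, 3, 4, 4, 5, 6]
    # Hypothetical transition probabilities for the moves used by this path.
    a = {(1, 2): 1.0, (2, 2): 0.6, (2, 3): 0.4, (3, 4): 0.7,
         (4, 4): 0.5, (4, 5): 0.5, (5, 6): 0.3}
    # Hypothetical output likelihoods: 0.5 for every state and time step.
    b = {(j, t): 0.5 for j in range(2, 6) for t in range(1, 7)}
    print(path_likelihood(a, b, path))
```

In practice the state path is hidden, so a recognizer either sums this quantity over all paths or keeps only the most likely path.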
What is HTK?
HMM Toolkit
Developed at Cambridge University Engineering Department; rights later held by Microsoft (MS)
Used for OCR, WSN and Speech Recognition
39 tools and customized tools ...
Variety of options - because of time limitations, only the defaults were used
Portable
HTK Familiarisation
Tool        Function
HParse      Parses a grammar written in Backus-Naur form
HDMan       Creates the dictionary in HTK format
HLEd        MLF file manipulation
HCopy       Feature extraction - acoustic analysis
HCompV      HMM prototype initialisation
HRest       Training - Baum-Welch
HHEd        HMM manipulation
HVite       Viterbi decoding
HResults    Gives the result
Analysis and Result
Database of 50 speakers - split 25 training and 25 testing
Also split 35 training and 15 testing
Speaker dependent - 90%
Speaker Independent - 83%
Confusions
Number       Confusion 1
0            3
1            3
2            -
3            1
4            -
5            -
6            -
7            8
8            7
9            -
START SIL    -
END SIL      -
Future Work
Extended word recognition system
MS SDK
Acoustic unstable field
System can be easily adapted to continuous speech
Real time recognition
Blocksets to existing tools
Conclusion
Speech - Technical approach
ASR
Approaches
Challenges
Feature Extraction
HTK Familiarization
Inaccuracy - Lack of Database
Extended digit recognition
Appendix
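Mel Scale and Cepstrum
Convert Hz to Mel
$$m = 2595 \log_{10}\left(1 + \frac{f}{700}\right) = 1127 \ln\left(1 + \frac{f}{700}\right) \tag{5}$$
Gunnar Fant proposed
$$m = \frac{1000}{\log_{10} 2} \log_{10}\left(1 + \frac{f}{1000}\right) \tag{6}$$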
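Both Hz-to-mel conversions can be written as short functions. This is a sketch with hypothetical function names; it uses the widely quoted constant 2595 (equivalently 1127 with the natural logarithm) for the first formula. Both formulas map 1000 Hz to approximately 1000 mel.

```python
import math

def hz_to_mel(f_hz):
    """Common mel formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_mel_fant(f_hz):
    """Fant's formula: (1000 / log 2) * log(1 + f / 1000)."""
    return 1000.0 / math.log10(2.0) * math.log10(1.0 + f_hz / 1000.0)

if __name__ == "__main__":
    for f in (100.0, 1000.0, 4000.0):
        print(f, hz_to_mel(f), hz_to_mel_fant(f))
```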
Real time understanding of HMM
Evaluate - Forward Algorithm
Decode - Viterbi
Train - Baum Welch
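The first two problems, evaluation and decoding, can be sketched for a small discrete HMM. The two-state model and its probabilities below are made-up toy values, not the digit models from this project, and the function names are hypothetical. Training with Baum-Welch is left out to keep the sketch short.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Evaluate: P(O | model), summed over all state paths."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())

def viterbi(pi, A, B, obs):
    """Decode: most likely state path and its probability."""
    delta = pi * B[:, obs[0]]
    back = []
    for o in obs[1:]:
        scores = delta[:, None] * A          # scores[i, j] = delta_i * a_ij
        back.append(scores.argmax(axis=0))   # best predecessor of each state
        delta = scores.max(axis=0) * B[:, o]
    path = [int(delta.argmax())]
    for ptr in reversed(back):               # trace the back-pointers
        path.append(int(ptr[path[-1]]))
    return path[::-1], float(delta.max())

if __name__ == "__main__":
    pi = np.array([0.6, 0.4])                # toy initial probabilities
    A = np.array([[0.7, 0.3],                # toy transition matrix
                  [0.4, 0.6]])
    B = np.array([[0.5, 0.4, 0.1],           # toy output probabilities
                  [0.1, 0.3, 0.6]])
    obs = [0, 1, 2]
    print(forward(pi, A, B, obs))
    print(viterbi(pi, A, B, obs))
```

For long observation sequences both functions should work with log probabilities or scaling to avoid numerical underflow.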
HMM Example
Viterbi Example
Baum Welch
Generalized Expectation-Maximization (GEM) algorithm
Maximum Likelihood Estimates
Posterior Mode Estimate
Transition and Emission probabilities
Transition estimate: the expected number of transitions from Si to Sj divided by the expected number of transitions out of Si
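Written out, the transition re-estimate takes the following standard form. The symbols are assumptions introduced here for illustration: $\xi_t(i,j)$ is the posterior probability of being in $S_i$ at time $t$ and in $S_j$ at time $t+1$ given the observations, and $\gamma_t(i)$ is the posterior probability of being in $S_i$ at time $t$.

```latex
\hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)},
\qquad
\gamma_t(i) = \sum_{j} \xi_t(i,j)
```

Both posteriors are obtained from the forward and backward probabilities in the expectation step, and the ratio above is the maximization step for the transition probabilities.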
Frequency Warping
Performed by applying the unitary warping operator U
A spectral representation on one frequency scale, with a certain frequency resolution, is transformed to another representation on a new frequency scale
Resolution is uniform on the new scale and non-uniform with respect to the old scale
Scale transform of a function
$$D_X(c) = \int_0^{\infty} X(f)\, \frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\, df \tag{7}$$
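Inverse scale-transform
Scaling a function by α changes its scale transform only by a phase factor. With $X_\alpha(f) = \sqrt{\alpha}\, X(\alpha f)$:
$$D_{X_\alpha}(c) = \int_0^{\infty} \sqrt{\alpha}\, X(\alpha f)\, \frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\, df = e^{j 2\pi c \ln \alpha}\, D_X(c) \tag{8}$$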
MLE
Maximum Likelihood Estimation
Value of parameter vector maximizing the probability
Searching the multi-dimensional parameter space
MLE Estimate
“A technology is a real progress when it is available to anyone” - Henry Ford
THANK YOU