Introduction
                              Implementation Methodology
                                     HTK Implementation
                                      Analysis and Result
                                             Future Work
                                               Conclusion




                     Robust ASR system : Malayalam

                                             Carrol Xavier,
                                           Mohammed Musfir,
                                             Rahmathulla,
                                               Supriya,
                                                 Yasif
                                                   Guided By :
                                                 Mr. Edet Bijoy K
                                                Assistant Professor
                                        Department of ECE
                                      MES College of Engineering

                                                 May 3, 2012


 Objective




      To implement a prototype recognizer for the spoken Malayalam
      digits 0-9 using hidden Markov models (HMMs) of speech






 Contents
      1    Introduction
              Speech
              Automatic Speech Recognition
              Approaches of ASR
      2    Implementation Methodology
              Implementation Challenges
              Database Preparation
              Feature Extraction
      3    HTK Implementation
              What is HTK?
              HTK Familiarisation
      4    Analysis and Result
      5    Future Work
      6    Conclusion


 What is Speech?
           Produced when air from the lungs passes through the
           glottis, throat and mouth
           Excitation in three ways:
                   Voiced excitation
                   Unvoiced excitation
                   Transient excitation
           Some sounds combine the three excitation types
           Spectral changes are shaped by the vocal tract


 Pictorial Representation of “SHOP”






 Characteristics of Speech

               Bandwidth - 4 kHz
               Fundamental frequency - depends on the type of
               articulation
               Peaks in the spectrum -
                       Voiced excitation - P(f) - triangular pulse
                       Unvoiced excitation - N(f) - white noise generator
               Pitch extraction:
                       Gold-Rabiner pitch tracker
                       Autocorrelation pitch tracker
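
The autocorrelation pitch tracker named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `autocorr_pitch`, the 60-400 Hz search range, the 8 kHz sampling rate and the synthetic test frame are all assumptions chosen for the example.

```python
import numpy as np

def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame."""
    frame = frame - np.mean(frame)                 # remove any DC offset
    # Full autocorrelation; keep the non-negative lags only.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)                       # shortest pitch period searched
    lag_max = min(int(fs / fmin), len(r) - 1)      # longest pitch period searched
    # The strongest peak inside the allowed lag range marks the pitch period.
    lag = lag_min + int(np.argmax(r[lag_min:lag_max + 1]))
    return fs / lag

fs = 8000                                          # assumed rate; covers a 4 kHz bandwidth
t = np.arange(int(0.04 * fs)) / fs                 # one 40 ms frame
# Synthetic "voiced" frame: 125 Hz fundamental plus its second harmonic.
frame = np.sin(2 * np.pi * 125 * t) + 0.5 * np.sin(2 * np.pi * 250 * t)
f0 = autocorr_pitch(frame, fs)
print(f0)
```

Restricting the lag range keeps the search away from the zero-lag maximum and from implausibly low pitch values.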




 Pitch Extraction - Autocorrelation






 Formant Frequency



               Concentration of acoustic energy around a particular frequency
               Formants occur at roughly 1000 Hz intervals
               Caused by resonances of the vocal tract
               Spectrogram - darkness indicates the strength of a formant






 Spectrogram






 Speech Production Model




      S(f) = (v P(f) + u N(f)) H(f) R(f) = X(f) H(f) R(f)
           The mixture between voiced and unvoiced excitation is
           determined by v and u
           The fundamental frequency is determined by P(f)
           The spectral shaping is determined by H(f)
           The signal amplitude depends on v and u
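
A small numerical sketch of the production model may help show how v and u switch between excitation types. Everything below the function is an assumption made for illustration: the harmonic comb standing in for P(f), the flat N(f), the single-resonance H(f) and the rising R(f) (taken here to be a lip-radiation term, which the slide does not spell out) are toy shapes, not measured spectra.

```python
import numpy as np

def speech_spectrum(P, N, H, R, v, u):
    """Evaluate S(f) = (v P(f) + u N(f)) H(f) R(f) on a shared frequency grid."""
    X = v * P + u * N                      # mixed excitation X(f)
    return X * H * R

idx = np.arange(401)
f = 10.0 * idx                             # 0 to 4000 Hz in 10 Hz steps
P = (idx % 10 == 0).astype(float)          # toy harmonic comb, f0 = 100 Hz (assumed)
N = np.ones_like(f)                        # flat magnitude for white noise (assumed)
H = 1.0 / np.sqrt(1.0 + ((f - 500.0) / 100.0) ** 2)   # hypothetical resonance at 500 Hz
R = f / f.max()                            # hypothetical rising radiation term

S_voiced = speech_spectrum(P, N, H, R, v=1.0, u=0.0)     # purely voiced sound
S_unvoiced = speech_spectrum(P, N, H, R, v=0.0, u=1.0)   # purely unvoiced sound
```

With v = 1 and u = 0 the spectrum is non-zero only at multiples of the fundamental, while v = 0 and u = 1 leaves the smooth envelope H(f) R(f).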


 About Automatic Speech Recognition


               Automatic Speech Recognition - an advancing and
               challenging field
               Most research work - English, Arabic, Mandarin
               Native Indian languages - minimal work
               Industry - AT&T, Nuance, IBM
               Open source - VoxForge






 Classifying ASR system




               The system contains two subsystems:
                       ASR - transcribes natural speech
                       SU (Speech Understanding) - understands the meaning
                       of the transcribed speech
               ASR systems are classified as:
                       DVI - Direct Voice Input
                       LVCSR - Large Vocabulary Continuous Speech
                       Recognition


 Block Diagram of ASR




               Acoustic Properties - Linguistic representation
               Initial acquisition - Signal transduction or Recording
               Feature extraction - Spectral Analysis
               Segmentation - Phoneme Boundary Recognition


 Approaches of ASR


               Template Based Approach
               Knowledge Based Approach
               Statistical Approach
               Conversational Recognition
               Recognition using Learning Approach
               Artificial Intelligence in Recognition






 Implementation Challenges



               Successive recognition - artificial pauses
               Continuous speech recognition - coarticulation
               Physiological parameters
               Prosody and Temporal features






 Database Preparation

               Most important phase for training and recognition
               accuracy
               50 people - 25 males and 25 females
               10 words repeated 20 times each
               10000 utterances in total (50 x 10 x 20)
               35 speakers used for training and 15 reserved for
               recognition
               Utterances converted to the cepstral domain
               Optimization for HMM parameter determination
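
The corpus sizes follow from simple arithmetic, sketched below. It assumes every speaker recorded all 10 words 20 times, which the slide implies but does not state outright; under that assumption the 10000 figure is the size of the whole corpus, and the 35/15 speaker split divides it as shown.

```python
speakers, words, repetitions = 50, 10, 20
train_speakers, test_speakers = 35, 15

per_speaker = words * repetitions          # utterances recorded by one speaker
total = speakers * per_speaker             # whole corpus
train = train_speakers * per_speaker       # utterances from the training speakers
test = test_speakers * per_speaker         # utterances held out for recognition

print(total, train, test)                  # prints: 10000 7000 3000
```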



 Feature Extraction
               Temporal - SPEAKER Recognition
               Spectral - SPEECH Recognition
                       Critical band filter
                       Cepstral Analysis


               S(k) = \sum_{n=0}^{N-1} s(n) \exp(-j 2\pi nk / N)                            (1)

               \hat{S}(k) = \log S(k)                                                        (2)

               \hat{s}(n) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{S}(k) \exp(-j 2\pi nk / N)     (3)
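
Equations (1) to (3) can be tried out with a few lines of NumPy. This sketch makes two assumptions that go beyond the slide: it takes the logarithm of the magnitude |S(k)| (the real cepstrum) rather than of the complex spectrum, and it uses `np.fft.ifft` for step (3). Because the log magnitude spectrum of a real signal is symmetric in k, the sign of the exponent in (3) does not change the result. The echo signal is a made-up test input.

```python
import numpy as np

def real_cepstrum(s):
    """Real cepstrum of a real-valued frame s(n)."""
    S = np.fft.fft(s)                          # equation (1): DFT of the frame
    log_S = np.log(np.abs(S) + 1e-12)          # equation (2), on the magnitude; small floor avoids log(0)
    return np.fft.ifft(log_S).real             # equation (3): back to the quefrency domain

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)                  # synthetic wideband source
delay = 100
s = x + 0.5 * np.roll(x, delay)                # add a circular echo 100 samples later
c = real_cepstrum(s)
peak = 20 + int(np.argmax(c[20:512]))          # skip the low-quefrency envelope region
print(peak)
```

The echo multiplies the spectrum by a periodic ripple, and the logarithm turns that product into a sum, so the ripple shows up as an isolated cepstral peak at the echo delay. The same separation is what lets cepstral analysis split excitation from vocal tract shaping.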


 MFCC


               Fourier transform of a windowed signal
               Map the powers of the spectrum onto the mel scale
               Take the log of the power at each mel frequency
               DCT of the mel log powers
               Amplitudes of the resulting spectrum - MFCCs
               Normalising
               Raising log mel amplitudes to higher powers
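
The first five steps above can be sketched for a single frame using NumPy only. This is an illustrative sketch rather than the HTK front end: the mel formula 2595 log10(1 + f/700), the 20 triangular filters, the 12 output coefficients, the Hamming window and all function names are assumptions, and the optional normalising and power-raising steps are left out.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_edges) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):               # rising edge of the triangle
            fbank[i, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):               # falling edge of the triangle
            fbank[i, k] = (hi - k) / (hi - mid)
    return fbank

def mfcc_frame(frame, fs, n_filters=20, n_ceps=12):
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)                 # windowed excerpt
    power = np.abs(np.fft.rfft(windowed)) ** 2           # power spectrum
    mel_power = mel_filterbank(n_filters, n_fft, fs) @ power
    log_mel = np.log(mel_power + 1e-10)                  # log power in each mel band
    # DCT-II of the log mel powers; its amplitudes are the MFCCs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps + 1), n + 0.5) / n_filters)
    return (basis @ log_mel)[1:]                         # drop c0, keep c1..c12

rng = np.random.default_rng(0)
t = np.arange(256) / 8000.0
frame = np.sin(2 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(256)
coeffs = mfcc_frame(frame, fs=8000)
```

Dropping c0 makes the remaining coefficients insensitive to overall gain, since scaling the frame only adds a constant to every log mel power.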





 MFCC for HTK


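               Basic MFCCs are static
               Performance improves when time derivatives are added
               Delta coefficients (qualifier _D)
               Acceleration coefficients (qualifier _A)
               Third differential coefficients (qualifier _T)
               Absolute energy can optionally be suppressed (qualifier _N)
               Vocal Tract Length Normalisation (VTLN)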
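
Delta and acceleration coefficients are commonly computed with a regression over neighbouring frames, d_t = sum_{th=1}^{W} th (c_{t+th} - c_{t-th}) / (2 sum_{th=1}^{W} th^2), with the first and last frames repeated at the edges. The sketch below assumes that formula with a window of W = 2; the function name and the ramp test data are hypothetical.

```python
import numpy as np

def deltas(features, window=2):
    """Regression-based time derivative of a (frames, coeffs) feature array."""
    T = len(features)
    # Repeat the first and last frames so every frame has `window` neighbours.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(th * th for th in range(1, window + 1))
    out = np.zeros_like(features, dtype=float)
    for th in range(1, window + 1):
        ahead = padded[window + th: window + th + T]     # c_{t+th}
        behind = padded[window - th: window - th + T]    # c_{t-th}
        out += th * (ahead - behind)
    return out / denom

static = 3.0 * np.arange(10.0).reshape(-1, 1)   # toy coefficient rising by 3 per frame
delta = deltas(static)                          # first time derivative
accel = deltas(delta)                           # acceleration: deltas of the deltas
observation = np.hstack([static, delta, accel]) # static + delta + acceleration vector
```

Away from the edges, the delta of a steadily rising coefficient equals its slope per frame, which is a quick sanity check on the regression weights.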





 HMM for isolated word recognition
               In normal method - Isolated word concatenation
               The recognizer maps sequences of speech vectors to
               symbol sequences
               The mapping is not one-to-one, since different underlying
               symbols can produce similar sounds
               Boundaries between symbols cannot be identified
               explicitly
               The sequence of speech vectors corresponding to each word
               is assumed to be generated by a Markov model
               A Markov model is a finite state machine which changes
               state once every time unit


 A Markov Generation Model
               Bayesian Interpretation - Finite State Bayesian model
               with Markovian prior

               \Theta^{*} = \arg\max_{\Theta} P(\Theta) \sum_{s \in S} P(s \mid \Theta) P(Y \mid s, \Theta)      (4)






 HMM for isolated word recognition



               Modeling of HMM - HTK
               A six-state model moves through the state sequence
               X = 1, 2, 2, 3, 4, 4, 5, 6 to generate the observation
               sequence o_1 to o_6
               P(O, X | M) = a_{12} b_2(o_1) a_{22} b_2(o_2) a_{23} b_3(o_3) ...
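
The joint probability above can be evaluated directly for the stated path. In this sketch states 1 and 6 are treated as non-emitting entry and exit states, so the product ends with a final transition out of state 5. The transition probabilities, the one-dimensional Gaussian output densities and the observation values are invented for illustration and are not taken from the trained models.

```python
import math

def gaussian(x, mean, var):
    """One-dimensional Gaussian density, standing in for b_j(o_t)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def joint_log_prob(obs, path, trans, means, variances):
    """log P(O, X | M) for a path that starts and ends in non-emitting states."""
    log_p = 0.0
    for t in range(1, len(path)):
        prev, cur = path[t - 1], path[t]
        log_p += math.log(trans[prev][cur])              # transition a_{prev,cur}
        if t <= len(obs):                                # emitting state: b_cur(o_t)
            log_p += math.log(gaussian(obs[t - 1], means[cur], variances[cur]))
    return log_p

# Hypothetical left-to-right model with four emitting states (2 to 5).
trans = {1: {2: 1.0}, 2: {2: 0.6, 3: 0.4}, 3: {3: 0.6, 4: 0.4},
         4: {4: 0.6, 5: 0.4}, 5: {5: 0.6, 6: 0.4}}
means = {2: 0.0, 3: 1.0, 4: 2.0, 5: 3.0}
variances = {2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0}

path = [1, 2, 2, 3, 4, 4, 5, 6]                          # state sequence X
obs = [0.1, -0.2, 1.1, 2.0, 1.9, 3.2]                    # observations o_1 .. o_6
log_p = joint_log_prob(obs, path, trans, means, variances)
```

Working in the log domain avoids numerical underflow, which matters once sequences contain hundreds of frames.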






 What is HTK?


               Hidden Markov Model Toolkit (HTK)
               Developed at Cambridge University; rights later held by Microsoft
               Used for OCR, WSN and Speech Recognition
               39 tools and customized tools ...
               Wide variety of options; owing to time limits, only the
               defaults were used
               Portable






 HTK Familiarisation

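                  Tool                Function
                 HParse              Converts a Backus-Naur Form grammar into a word network
                 HDMan               Creates the pronunciation dictionary in HTK format
                 HLEd                Edits label files, including master label files (MLF)
                 HCopy               Feature extraction (acoustic analysis)
                 HCompV              Flat-start initialisation of the HMM prototype (global mean and variance)
                 HRest               Training by Baum-Welch re-estimation
                 HHEd                HMM definition editing
                 HVite               Viterbi decoding
                 HResults            Scores the recognition output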
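
The order in which these tools are typically chained can be sketched as a list of command lines. This is only an illustration, not the project's actual scripts: every file and directory name (`gram`, `wdnet`, `config`, `proto`, `hmm0`, and so on) is a hypothetical placeholder, and the options shown follow the tutorial conventions of the HTK Book, so they should be checked against the installed HTK version before use. The code only assembles the commands; it does not run them.

```python
def htk_pipeline(word):
    """Return HTK command lines, in tool-chain order, for one whole-word model.

    Every file and directory name below is a hypothetical placeholder.
    """
    return [
        # Grammar in BNF-style notation -> word network used by the decoder
        ["HParse", "gram", "wdnet"],
        # Waveforms -> feature files, as listed in the script file
        ["HCopy", "-C", "config", "-S", "codetrain.scp"],
        # Flat start: global mean and variance written into the prototype
        ["HCompV", "-C", "config", "-m", "-S", "train.scp", "-M", "hmm0", "proto"],
        # Baum-Welch re-estimation on the segments labelled `word`
        ["HRest", "-C", "config", "-I", "labels.mlf", "-l", word,
         "-S", "train.scp", "-M", "hmm1", "hmm0/" + word],
        # Viterbi decoding of the test files (one -H per trained model file)
        ["HVite", "-C", "config", "-H", "hmm1/" + word, "-S", "test.scp",
         "-i", "recout.mlf", "-w", "wdnet", "dict", "hmmlist"],
        # Compare the recognised labels with the reference transcriptions
        ["HResults", "-I", "testref.mlf", "hmmlist", "recout.mlf"],
    ]


if __name__ == "__main__":
    for argv in htk_pipeline("one"):
        print(" ".join(argv))
```

Keeping each command as an argument list makes it straightforward to pass to `subprocess.run` later without shell quoting issues.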




 Analysis and Result



               Database of 50 sets, evaluated with two splits:
                       25 training and 25 testing
                       35 training and 15 testing
               Speaker-dependent recognition accuracy: 90%
               Speaker-independent recognition accuracy: 83%






 Confusions
                                       Numbers  Confusion 1
                                          0          3
                                          1          3
                                          2          -
                                          3          1
                                          4          -
                                          5          -
                                          6          -
                                          7          8
                                          8          7
                                          9          -
                                      START SIL      -
                                       END SIL       -
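
A small sketch of how the table above can be inspected programmatically, for example to find digits that are confused with each other in both directions. The dictionary simply transcribes the rows that have an entry; the function name `mutual_confusions` is an illustrative choice, not part of the original work.

```python
# Digit -> digit it was confused with (rows marked "-" are omitted).
CONFUSIONS = {0: 3, 1: 3, 3: 1, 7: 8, 8: 7}


def mutual_confusions(confusions):
    """Return sorted pairs (a, b) where a is confused with b and b with a."""
    pairs = set()
    for a, b in confusions.items():
        if confusions.get(b) == a:
            pairs.add((min(a, b), max(a, b)))
    return sorted(pairs)


if __name__ == "__main__":
    print(mutual_confusions(CONFUSIONS))
```

Pairs that are confused in both directions are natural candidates for extra training data or more discriminative features.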


 Future Work


               Extended word recognition system
                       MS SDK
                       Acoustic unstable field
               System can be easily adapted to continuous speech
                       Real time recognition
               Blocksets to existing tools






 Conclusion

               Speech - Technical approach
               ASR
                       Approaches
                       Challenges
               Feature Extraction
               HTK Familiarisation
               Inaccuracy due to the lack of a larger database
               Extended digit recognition






                                            Appendix





 Mel Scale and Cepstrum

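      Convert Hz to Mel

      \[ m = 2595 \log_{10}\left(1 + \frac{f}{700}\right) = 1127 \ln\left(1 + \frac{f}{700}\right) \tag{5} \]

      Gunnar Fant proposed

      \[ m = \frac{1000}{\log_{10} 2} \log_{10}\left(1 + \frac{f}{1000}\right) \tag{6} \]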
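
The two conversions can be checked numerically with a short sketch. The function names are illustrative, and the inverse `mel_to_hz` is an addition obtained by solving the first formula for f; both scales are designed so that 1000 Hz maps to roughly 1000 mel.

```python
import math


def hz_to_mel(f_hz):
    """Mel value from the 700 Hz break-frequency formula (equation 5)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_mel_fant(f_hz):
    """Mel value from Fant's formula (equation 6); any log base works
    as long as numerator and denominator use the same one."""
    return 1000.0 / math.log(2.0) * math.log(1.0 + f_hz / 1000.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel, obtained by solving equation 5 for f."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


if __name__ == "__main__":
    for f in (100.0, 440.0, 1000.0, 4000.0):
        print(f, round(hz_to_mel(f), 1), round(hz_to_mel_fant(f), 1))
```

The constant 1127 in the natural-logarithm form is simply 2595 / ln(10), rounded.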





 Real time understanding of HMM



               Evaluate - Forward Algorithm
               Decode - Viterbi
               Train - Baum Welch
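
The first two problems can be sketched on a small discrete HMM. This is a minimal illustration, not the HTK implementation: the two-state model and its probabilities are made-up toy values, and plain probabilities are used instead of the log or scaled arithmetic that long utterances would need.

```python
# Toy model (hypothetical values): 2 hidden states, 3 observation symbols.
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]


def forward(obs, start, trans, emit):
    """Evaluate: P(obs | model), summed over all state sequences."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
    return sum(alpha)


def viterbi(obs, start, trans, emit):
    """Decode: most likely state sequence and its joint probability."""
    n = len(start)
    delta = [start[i] * emit[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor of each state j at this time step
        ptr = [max(range(n), key=lambda i: prev[i] * trans[i][j])
               for j in range(n)]
        delta = [prev[ptr[j]] * trans[ptr[j]][j] * emit[j][o] for j in range(n)]
        backpointers.append(ptr)
    state = max(range(n), key=lambda i: delta[i])
    best = delta[state]
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best


if __name__ == "__main__":
    observations = [0, 1, 2]
    print(forward(observations, START, TRANS, EMIT))
    print(viterbi(observations, START, TRANS, EMIT))
```

The forward pass replaces the `max` of Viterbi with a sum, so the total likelihood is never smaller than the probability of the single best path.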






 HMM Example






 Viterbi Example






 Baum Welch


               A generalized expectation-maximization (GEM) algorithm
               Computes maximum likelihood estimates and posterior mode
               estimates of the transition and emission probabilities
               Transition probability from Si to Sj is re-estimated by dividing
               the expected number of transitions from Si to Sj by the
               expected number of transitions out of Si
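
The transition update described in the last point can be sketched for a single observation sequence of a discrete HMM. This is a simplified illustration under stated assumptions: the toy model values are hypothetical, only the transition matrix is re-estimated, and unscaled probabilities are used, which is adequate only for short sequences.

```python
# Toy model (hypothetical values): 2 hidden states, 3 observation symbols.
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
OBS = [0, 1, 2, 0, 0, 1]


def forward_backward(obs, start, trans, emit):
    """Return the forward (alpha) and backward (beta) tables."""
    n, T = len(start), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = start[i] * emit[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * trans[i][j]
                              for i in range(n)) * emit[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n))
    return alpha, beta


def reestimate_transitions(obs, start, trans, emit):
    """One Baum-Welch update of the transition matrix."""
    n, T = len(start), len(obs)
    alpha, beta = forward_backward(obs, start, trans, emit)
    likelihood = sum(alpha[T - 1])
    new_trans = []
    for i in range(n):
        # Expected number of transitions out of state i
        out_of_i = sum(alpha[t][i] * beta[t][i] for t in range(T - 1)) / likelihood
        row = []
        for j in range(n):
            # Expected number of transitions from state i to state j
            i_to_j = sum(alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]]
                         * beta[t + 1][j] for t in range(T - 1)) / likelihood
            row.append(i_to_j / out_of_i)
        new_trans.append(row)
    return new_trans


if __name__ == "__main__":
    print(reestimate_transitions(OBS, START, TRANS, EMIT))
```

Each row of the updated matrix sums to one, and repeating the update never decreases the likelihood of the training sequence.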






 Frequency Warping

               Performed by applying the unitary warping operator U
               A spectral representation on one frequency scale, with a given
               frequency resolution, is transformed into another representation
               on a new frequency scale
               Resolution is uniform on the new scale and non-uniform with
               respect to the old scale
               Scale transform of a function X(f):
               \[ D_X(c) = \int_0^{\infty} X(f)\, \frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\, df \tag{7} \]



 Inverse scale-transform




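               For the scaled spectrum X_\alpha(f) = \sqrt{\alpha}\, X(\alpha f):
               \[ D_{X_\alpha}(c) = \int_0^{\infty} \sqrt{\alpha}\, X(\alpha f)\, \frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\, df = e^{j 2\pi c \ln \alpha}\, D_X(c) \tag{8} \]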
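
The second equality in (8) follows from a change of variable. A short derivation, assuming \alpha > 0 and taking the scale transform to be D_X(c) = \int_0^\infty X(f) e^{-j 2\pi c \ln f} / \sqrt{f}\, df; the substitution variable u is introduced here only for illustration:

```latex
\begin{align*}
\int_0^{\infty} \sqrt{\alpha}\, X(\alpha f)\, \frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\, df
  &= \int_0^{\infty} \sqrt{\alpha}\, X(u)\,
     \frac{e^{-j 2\pi c \ln (u/\alpha)}}{\sqrt{u/\alpha}}\, \frac{du}{\alpha}
     && (u = \alpha f) \\
  &= e^{j 2\pi c \ln \alpha} \int_0^{\infty} X(u)\,
     \frac{e^{-j 2\pi c \ln u}}{\sqrt{u}}\, du \\
  &= e^{j 2\pi c \ln \alpha}\, D_X(c).
\end{align*}
```

Scaling the frequency axis therefore changes only the phase of the scale transform, so its magnitude |D_X(c)| is unchanged.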






 MLE
               Maximum Likelihood Estimation
               Finds the value of the parameter vector that maximizes the
               probability (likelihood) of the observed data
               Found by searching the multi-dimensional parameter space
               The maximizing value is the MLE estimate
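
A minimal sketch of the idea for a one-dimensional Gaussian, where the search has a closed-form answer: the sample mean and the (biased) sample variance maximize the likelihood. The data values are made up for illustration and the function names are not taken from the original work.

```python
import math


def gaussian_log_likelihood(data, mean, var):
    """Log-likelihood of the data under a Gaussian with the given mean and variance."""
    n = len(data)
    squared_error = sum((x - mean) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * var) - squared_error / (2.0 * var)


def gaussian_mle(data):
    """Closed-form maximum likelihood estimates of mean and variance."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n  # divides by n, not n - 1
    return mean, var


if __name__ == "__main__":
    samples = [1.0, 2.0, 4.0, 7.0]  # hypothetical observations
    mean, var = gaussian_mle(samples)
    print(mean, var, gaussian_log_likelihood(samples, mean, var))
```

Any other choice of mean or variance gives a lower log-likelihood on the same data, which is what "maximizing the probability" means here.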










        “A technology is a real progress when it is available to
                         anyone”- Henry Ford

                                    THANK YOU



The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Destaque (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Robust ASR system : Malayalam

  • 1. Robust ASR system : Malayalam. Outline: Introduction, Implementation Methodology, HTK Implementation, Analysis and Result, Future Work, Conclusion. Carrol Xavier, Mohammed Musfir, Rahmathulla, Supriya, Yasif. Guided by: Mr. Edet Bijoy K, Assistant Professor, Department of ECE, MES College of Engineering. May 3, 2012.
  • 2. Objective: to implement a digit-recognizing prototype for the Malayalam language (digits 0-9) using the HMM model of speech.
  • 3. Contents. 1 Introduction (Speech; Automatic Speech Recognition; Approaches of ASR). 2 Implementation Methodology (Implementation Challenges; Database Preparation; Feature Extraction). 3 HTK Implementation (What is HTK?; HTK Familiarisation). 4 Analysis and Result. 5 Future Work. 6 Conclusion.
  • 10. What is Speech? Produced when air from the lungs passes through the glottis, throat and mouth. Excitation occurs in three ways: voiced excitation, unvoiced excitation and transient excitation. Some sounds are combinations of the three excitations. Spectral changes: shaped by the vocal tract.
  • 14. Pictorial Representation of “SHOP” (figure)
  • 15. Characteristics of Speech. Bandwidth: 4 kHz. Fundamental frequency: depends on the type of articulation. Peaks in the spectrum. Voiced excitation: P(f), a triangular pulse. Unvoiced excitation: a white noise generator. Pitch extraction: Rabiner-Gold pitch tracker; autocorrelation pitch tracker.
  • 16. Pitch Extraction - Autocorrelation (figure)
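The autocorrelation pitch tracker named on these slides can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name, the 60-400 Hz search range and the synthetic test tone are all assumptions made for the example.

```python
import numpy as np

def autocorrelation_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    f_min and f_max are assumed search limits, not values from the slides.
    """
    frame = frame - np.mean(frame)                      # remove DC offset
    # Keep only non-negative lags of the full autocorrelation.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f_max)                  # shortest period searched
    lag_max = min(int(sample_rate / f_min), len(corr) - 1)
    # The strongest peak inside the search range gives the pitch period.
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max + 1]))
    return sample_rate / best_lag

# Hypothetical test signal: a 200 Hz tone sampled at 8 kHz (40 ms frame).
sample_rate = 8000
t = np.arange(320) / sample_rate
estimated_f0 = autocorrelation_pitch(np.sin(2 * np.pi * 200.0 * t), sample_rate)
```

Restricting the lag search to a plausible pitch range avoids picking the trivial peak at lag zero.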
  • 17. Formant Frequency. Concentration of acoustic energy around a particular frequency. At 1000 Hz intervals. Resonance in the vocal tract. Spectrogram: darkness shows the strength of a formant.
  • 18. Spectrogram (figure)
  • 19. Speech Production Model: $S(f) = (v\,P(f) + u\,N(f))\,H(f)\,R(f) = X(f)\,H(f)\,R(f)$. The mixture between voiced and unvoiced excitation is determined by $v$ and $u$. The fundamental frequency is determined by $P(f)$. The spectral shaping is determined by $H(f)$. The signal amplitude depends on $v$ and $u$.
  • 20. About Automatic Speech Recognition. Automatic speech recognition is an advancing and challenging field. Most research work targets English, Arabic and Mandarin. Native Indian languages have seen minimal work. Industry: AT&T, Nuance, IBM. Open source: VoxForge.
  • 25. Classifying ASR system. The system contains two subsystems: ASR, which transcribes natural speech, and SU, which understands the meaning of the transcribed speech. ASR systems are classified as DVI (Direct Voice Input) or LVCSR (Large Vocabulary Continuous Speech Recognition).
  • 26. Block Diagram of ASR. Acoustic properties to linguistic representation. Initial acquisition: signal transduction or recording. Feature extraction: spectral analysis. Segmentation: phoneme boundary recognition.
  • 27. Approaches of ASR. Template based approach. Knowledge based approach. Statistical approach. Conversational recognition. Recognition using learning approach. Artificial intelligence in recognition.
  • 29. Implementation Challenges. Successive recognition: artificial pauses. Continuous speech recognition: co-articulation. Physiological parameters. Prosody and temporal features.
  • 30. Database Preparation. Most important phase for training and recognition accuracy. 50 people: 25 males and 25 females. 10 words repeated 20 times each. 10000 words for training. 35 speakers used for training and 15 reserved for recognition. Utterances converted to the cepstral domain. Optimization for HMM parameter determination.
  • 31. Feature Extraction. Temporal features: speaker recognition. Spectral features: speech recognition. Critical band filter. Cepstral analysis: $S(k) = \sum_{n=0}^{N-1} s(n) \exp\left(-j\frac{2\pi}{N} nk\right)$ (1), $\hat{S}(k) = \log S(k)$ (2), $\hat{s}(n) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{S}(k) \exp\left(j\frac{2\pi}{N} nk\right)$ (3).
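The three cepstral analysis steps (DFT, logarithm, inverse DFT) can be sketched with NumPy. This is a hedged illustration only: it takes the logarithm of the spectral magnitude (the real cepstrum), which is an assumption, since the slide writes the logarithm of the spectrum without saying how the complex values are handled. The function name and the small floor constant are also assumptions.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum of one frame: DFT, log magnitude, inverse DFT."""
    spectrum = np.fft.fft(frame)                         # step (1): S(k)
    log_spectrum = np.log(np.abs(spectrum) + 1e-12)      # step (2), floor avoids log(0)
    return np.real(np.fft.ifft(log_spectrum))            # step (3): back to the "quefrency" axis

# Hypothetical input: a reproducible random frame of 256 samples.
rng = np.random.default_rng(0)
cepstrum = real_cepstrum(rng.standard_normal(256))

# A unit impulse has a flat magnitude spectrum, so its cepstrum is (numerically) zero.
impulse = np.zeros(64)
impulse[0] = 1.0
impulse_cepstrum = real_cepstrum(impulse)
```

Because the log magnitude spectrum of a real signal is even, the resulting cepstrum is real and symmetric.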
  • 32. MFCC. Take the Fourier transform of a windowed signal. Map the powers of the spectrum onto the mel scale. Take the logs of the powers at each mel frequency. Take the DCT of the log powers; the amplitudes of the result are the MFCCs. Normalising: raising log mel amplitudes to higher powers.
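The MFCC steps listed on this slide can be sketched for a single frame as below. This is not HTK's HCopy implementation: the Hamming window, the 26 triangular filters, the 12 output coefficients and all function names are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced uniformly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):                    # rising edge
            fbank[i - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):                   # falling edge
            fbank[i - 1, k] = (right - k) / max(right - centre, 1)
    return fbank

def mfcc_frame(frame, sample_rate, n_filters=26, n_ceps=12):
    """MFCCs of one frame; index 0 of the result is the zeroth coefficient."""
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)                 # windowed signal
    power = np.abs(np.fft.rfft(windowed)) ** 2           # power spectrum
    mel_energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_mel = np.log(mel_energies + 1e-10)               # log power per mel band
    # DCT-II written out as a cosine basis so only NumPy is needed.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return basis @ log_mel

# Hypothetical input: 50 ms of a 440 Hz tone at 8 kHz.
sample_rate = 8000
tone = np.sin(2 * np.pi * 440.0 * np.arange(400) / sample_rate)
coefficients = mfcc_frame(tone, sample_rate)
```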
  • 33. MFCC for HTK. Usually static. Performance is improved by adding time derivatives: delta (D), acceleration (A) and third differential. Absolute energy can optionally be suppressed. Vocal Tract Length Normalisation (VTLN).
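The delta (time derivative) coefficients can be illustrated with the usual regression formula, d_t = sum over θ of θ(c_{t+θ} - c_{t-θ}) divided by 2 times the sum of θ². The sketch below is an assumption-laden illustration: the window size of 2, the edge padding by repetition and the function name are choices made here, not settings taken from the slides.

```python
import numpy as np

def delta(features, window=2):
    """Regression-based deltas of a (frames x coefficients) feature matrix."""
    n_frames = len(features)
    # Repeat the first and last frame so every frame has neighbours.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denominator = 2 * sum(theta * theta for theta in range(1, window + 1))
    out = np.zeros_like(features, dtype=float)
    for theta in range(1, window + 1):
        ahead = padded[window + theta: window + theta + n_frames]
        behind = padded[window - theta: window - theta + n_frames]
        out += theta * (ahead - behind)
    return out / denominator

# Hypothetical static features: one coefficient rising by 1 per frame.
static = np.arange(10.0).reshape(-1, 1)
deltas = delta(static)          # first-order derivative (D)
accelerations = delta(deltas)   # second-order derivative (A)
```

Applying the same function to the deltas gives the acceleration coefficients.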
  • 34. HMM for isolated word recognition. In the normal method, isolated words are concatenated. The recognizer maps between sequences of speech vectors and symbol sequences. A one-to-one mapping is complex because different underlying sequences produce similar sounds. Boundaries between symbols cannot be identified explicitly. The sequence of speech vectors corresponding to each word is generated by a Markov model. A Markov model is a finite state machine which changes state once every time unit.
  • 35. A Markov Generation Model. Bayesian interpretation: a finite state Bayesian model with Markovian prior. $\Theta^* = \arg\max_{\Theta} P(\Theta) \sum_{s \in S} P(s \mid \Theta)\, P(Y \mid s, \Theta)$ (4)
  • 36. HMM for isolated word recognition. Modeling of HMM in HTK: a six-state model moves through the state sequence $X = 1, 2, 2, 3, 4, 4, 5, 6$ to generate the sequence $o_1$ to $o_6$. $P(O, X \mid M) = a_{12}\, b_2(o_1)\, a_{22}\, b_2(o_2)\, a_{23}\, b_3(o_3) \ldots$
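The product of transition and output probabilities along a known state sequence can be sketched as follows. The example is hypothetical: it uses a small four-state model with discrete output symbols (HTK models normally use continuous densities), and the probability values and function name are invented for illustration. The first and last states are treated as non-emitting entry and exit states, as in the state sequence above.

```python
import numpy as np

def joint_likelihood(a, b, states, observations):
    """P(O, X | M) for a state path whose first and last states emit nothing."""
    emitting = states[1:-1]
    assert len(emitting) == len(observations)
    probability = 1.0
    previous = states[0]
    for state, symbol in zip(emitting, observations):
        # One transition into the state, then one output from it.
        probability *= a[previous, state] * b[state, symbol]
        previous = state
    return probability * a[previous, states[-1]]        # final transition to exit

# Hypothetical model: state 0 = entry, states 1 and 2 emit, state 3 = exit.
a = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.6, 0.4, 0.0],
              [0.0, 0.0, 0.7, 0.3],
              [0.0, 0.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0],
              [0.9, 0.1],
              [0.2, 0.8],
              [0.0, 0.0]])
p = joint_likelihood(a, b, states=[0, 1, 1, 2, 3], observations=[0, 0, 1])
```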
  • 38. What is HTK? HMM Toolkit. Cambridge University; initially by MS. Used for OCR, WSN and speech recognition. 39 tools and customized tools ... Variety of options: because of time limitation, only the defaults were used. Portable.
  • 39. HTK Familiarisation
      Tool       Function
      HParse     Parsing using Backus-Naur form
      HDMan      Dictionary creation in HTK format
      HLEd       MLF file manipulation
      HCopy      Feature extraction (acoustic analysis)
      HCompV     HMM prototype creation
      HRest      Training (Baum-Welch)
      HHEd       HMM manipulation
      HVite      Viterbi decoding
      HResults   Gives the result
  • 40. Analysis and Result. Database of 50 speakers, split two ways: 25 training and 25 testing; 35 training and 15 testing. Speaker dependent: 90%. Speaker independent: 83%.
  • 41. Confusions. Numbers / Confusion: 1 0 3 1 3 2 - 3 1 4 - 5 - 6 - 7 8 8 7 9 - START SIL - END SIL -
  • 42. Future Work. Extended word recognition system. MS SDK. Acoustic unstable field. System can be easily adapted to continuous speech. Real time recognition. Blocksets to existing tools.
  • 48. Conclusion. Speech: technical approach. ASR approaches. Challenges. Feature extraction. HTK familiarization. Inaccuracy: lack of database. Extended digit recognition.
  • 56. Appendix
  • 57. Mel Scale and Cepstrum. Convert Hz to mel: $m = 2595 \log_{10}\left(1 + \frac{f}{700}\right) = 1127 \log_e\left(1 + \frac{f}{700}\right)$ (5). Gunnar Fant proposed $m = \frac{1000}{\log_{10} 2} \log_{10}\left(1 + \frac{f}{1000}\right)$ (6).
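The Hz-to-mel conversions can be compared numerically with a short sketch. The function names are assumptions; the constants are the commonly quoted ones (2595 with a base-10 logarithm, equivalently 1127 with the natural logarithm), and Fant's variant is written with both logarithms in base 10 so that 1000 Hz maps to 1000 mel.

```python
import math

def hz_to_mel(f_hz):
    """Mel value using the base-10 form of the formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_mel_ln(f_hz):
    """The same mapping written with the natural logarithm."""
    return 1127.0 * math.log(1.0 + f_hz / 700.0)

def hz_to_mel_fant(f_hz):
    """Fant's variant, anchored so that 1000 Hz maps to 1000 mel."""
    return 1000.0 / math.log10(2.0) * math.log10(1.0 + f_hz / 1000.0)

# Both forms of the first formula agree closely across the 4 kHz speech band.
largest_gap = max(abs(hz_to_mel(f) - hz_to_mel_ln(f)) for f in range(0, 4001, 100))
```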
  • 58. Real time understanding of HMM. Evaluate: forward algorithm. Decode: Viterbi. Train: Baum-Welch.
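The evaluation problem (how likely is an observation sequence under a given model) can be sketched with the forward algorithm. The two-state model with discrete output symbols below is a hypothetical example, not the digit models trained in this work, and the function name is an assumption.

```python
import numpy as np

def forward(pi, A, B, observations):
    """Likelihood of an observation sequence under a discrete-output HMM."""
    # alpha[i] = P(observations so far, current state = i)
    alpha = pi * B[:, observations[0]]
    for symbol in observations[1:]:
        # Propagate one step through the transitions, then weight by the output.
        alpha = (alpha @ A) * B[:, symbol]
    return float(alpha.sum())

# Hypothetical two-state model with two output symbols.
pi = np.array([0.6, 0.4])                  # initial state probabilities
A = np.array([[0.7, 0.3],                  # A[i, j] = P(next state j | state i)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],                  # B[i, k] = P(symbol k | state i)
              [0.2, 0.8]])
likelihood = forward(pi, A, B, [0, 0, 1, 1])
```

For long sequences a practical implementation would scale alpha or work in the log domain to avoid underflow.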
  • 59. HMM Example
  • 60. Viterbi Example
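Since the slide's worked figure is not available in this transcript, here is a small stand-in sketch of Viterbi decoding on a hypothetical 2-state, 3-symbol HMM. The model values and the function name `viterbi` are assumptions for illustration only.

```python
def viterbi(pi, A, B, obs):
    """Return (most likely state path, its joint probability) for a discrete HMM."""
    n = len(pi)
    # delta[i]: probability of the best path that ends in state i so far
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backptrs = []
    for o in obs[1:]:
        prev = delta
        # For each target state j, remember the best predecessor state
        ptr = [max(range(n), key=lambda i: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[ptr[j]] * A[ptr[j]][j] * B[j][o] for j in range(n)]
        backptrs.append(ptr)
    # Backtrack from the best final state
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


if __name__ == "__main__":
    pi = [0.6, 0.4]
    A = [[0.7, 0.3], [0.4, 0.6]]
    B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
    print(viterbi(pi, A, B, [0, 1, 2]))
```

The only difference from the forward recursion is that the sum over predecessor states is replaced by a maximum, with back-pointers kept so the winning path can be recovered.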
  • 61. Baum-Welch: Generalized Expectation-Maximization (GEM) algorithm; Maximum Likelihood Estimates; Posterior Mode Estimate; Transition and Emission probabilities; the transition probability is re-estimated by dividing the expected number of transitions from Si to Sj by the expected number of transitions out of Si
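The transition re-estimation rule stated above (expected transitions from Si to Sj divided by expected transitions out of Si) can be sketched for a single observation sequence as follows. This is a simplified illustration under stated assumptions: a discrete HMM in plain lists, no probability scaling, and hypothetical function names `forward_backward` and `reestimate_transitions`. It updates only the transition matrix.

```python
def forward_backward(pi, A, B, obs):
    """Return the forward (alpha) and backward (beta) tables, each T x n."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    return alpha, beta


def reestimate_transitions(pi, A, B, obs):
    """One Baum-Welch update of the transition matrix A."""
    n, T = len(pi), len(obs)
    alpha, beta = forward_backward(pi, A, B, obs)
    likelihood = sum(alpha[T - 1])
    new_A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Expected number of transitions out of state i
        out_i = sum(alpha[t][i] * beta[t][i] for t in range(T - 1)) / likelihood
        for j in range(n):
            # Expected number of transitions from state i to state j
            i_to_j = sum(
                alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                for t in range(T - 1)
            ) / likelihood
            new_A[i][j] = i_to_j / out_i
    return new_A


if __name__ == "__main__":
    pi = [0.6, 0.4]
    A = [[0.7, 0.3], [0.4, 0.6]]
    B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
    print(reestimate_transitions(pi, A, B, [0, 1, 2, 2, 1, 0]))
```

Each row of the updated matrix still sums to one, and repeating the update does not decrease the likelihood of the training sequence. A full trainer would also re-estimate the initial and emission probabilities and pool counts over many utterances.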
  • 62. Frequency Warping: Performed by applying the unitary warping operator U; One spectral representation on a certain frequency scale and with a certain frequency resolution is transformed to another representation on a new frequency scale; Resolution uniform on the new scale - Non-uniform with respect to the old scale; Scale transform of a function: $D_X(c) = \int_0^{\infty} X(f)\,\frac{e^{-j2\pi c\ln f}}{\sqrt{f}}\,df$ (7)
  • 63. Inverse scale-transform: $D_{X_\alpha}(c) = \int_0^{\infty} \sqrt{\alpha}\,X(\alpha f)\,\frac{e^{-j2\pi c\ln f}}{\sqrt{f}}\,df = e^{j2\pi c\ln\alpha}\,D_X(c)$ (8)
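The step from the integral to the phase factor in equation (8) can be filled in with a change of variable. The derivation below is a sketch that assumes the scale transform is defined as $D_X(c) = \int_0^\infty X(f)\,e^{-j2\pi c\ln f}/\sqrt{f}\,df$ and that $\alpha > 0$; the substitution variable $u$ is introduced here for illustration.

```latex
\begin{align*}
\int_0^{\infty} \sqrt{\alpha}\,X(\alpha f)\,\frac{e^{-j2\pi c\ln f}}{\sqrt{f}}\,df
  &= \int_0^{\infty} \sqrt{\alpha}\,X(u)\,
     \frac{e^{-j2\pi c(\ln u - \ln\alpha)}}{\sqrt{u/\alpha}}\,\frac{du}{\alpha}
     && (u = \alpha f,\ df = du/\alpha) \\
  &= e^{j2\pi c\ln\alpha}\int_0^{\infty} X(u)\,\frac{e^{-j2\pi c\ln u}}{\sqrt{u}}\,du
     && \left(\tfrac{\sqrt{\alpha}\,\sqrt{\alpha}}{\alpha} = 1\right) \\
  &= e^{j2\pi c\ln\alpha}\,D_X(c).
\end{align*}
```

Scaling the frequency axis by $\alpha$ therefore changes only the phase of the scale transform, so its magnitude $|D_X(c)|$ is unchanged by such a scaling.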
  • 64. MLE: Maximum Likelihood Estimation; Value of the parameter vector that maximizes the probability; Searching the multi-dimensional parameter space; MLE Estimate
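Written out, the estimate described on this slide is the parameter vector that maximizes the likelihood of the observed data. The notation below is an assumption for illustration: $O$ denotes the observations and $\theta$ the parameter vector. The coin-toss case is a standard textbook example, not a result from this project.

```latex
\begin{align*}
\hat{\theta}_{\mathrm{MLE}} &= \arg\max_{\theta}\; P(O \mid \theta)
  = \arg\max_{\theta}\; \log P(O \mid \theta) \\[4pt]
\text{Example: } & k \text{ heads in } n \text{ independent tosses, } P(\text{head}) = p: \\
\log P(O \mid p) &= k\log p + (n-k)\log(1-p), \\
\frac{d}{dp}\log P(O \mid p) &= \frac{k}{p} - \frac{n-k}{1-p} = 0
  \;\;\Longrightarrow\;\; \hat{p} = \frac{k}{n}.
\end{align*}
```

For an HMM the parameter vector collects the initial, transition, and emission probabilities, and no closed-form maximizer exists, which is why an iterative search such as Baum-Welch is used.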
  • 66. “A technology is a real progress when it is available to anyone” - Henry Ford. THANK YOU