Tribhuvan University
                 Institute of Engineering
                    Pulchowk Campus
   Department of Electronics and Computer Engineering


         MAJOR PROJECT FINAL PRESENTATION :

  TEXT PROMPTED REMOTE
  SPEAKER AUTHENTICATION
Project Supervisor:
Dr. Subarna Shakya, Associate Professor

Project Members:
Ganesh Tiwari (75010)
Madhav Pandey (75014)
Manoj Shrestha (75018)

Internal Examiner: Er. Manoj Ghimire
External Examiner: Er. Bimal Acharya
INTRODUCTION


   Voice biometric system
       User login


   Text-Prompted system
       Claimant is asked to speak a prompted (random) text
       Both speech and speaker recognition are performed

   Why text-prompted?
       Guards against playback attacks
OUR SYSTEM




   Feature : MFCC

   Modeling and classification : both statistical

       GMM : speaker modeling

       HMM/VQ : speech modeling
PROPERTIES OF SPEECH SIGNAL

   Carries both Speech Content and Speaker identity

   What makes a speech signal unique?
       Each phoneme resonates at its own fundamental frequency
        and its harmonics
       Studied over a short period : short-time spectral analysis


   What is speaker-dependent information?
       Fundamental frequency, primarily a function of
            the dimensions and tension of the vocal cords
            the size and shape of the mouth, throat, nose, and teeth

       Studied over a long period : all the variations from that speaker
UNIQUENESS IN PHONEME
   [Figure: waveforms of phoneme /ah/ and phoneme /i:/, amplitude versus samples (0 to 2500)]
Pre-Processing and Feature Extraction
PREPROCESSING : STEPS

   1) Silence Removal

   [Figure: two waveform panels, "Silence Signal" (original recording) and "Silence Removed"]
PREPROCESSING : STEPS (CONTD.)

   2) Pre-Emphasis

   [Figure: magnitude spectrum |Y(f)| versus frequency (Hz), two panels: "Suppressed high frequencies" and "Boosted high frequencies"]
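
A minimal sketch of pre-emphasis, assuming the common first-order high-pass filter y[n] = x[n] - a*x[n-1]. The slides do not state which filter or coefficient the project used, so the function name `pre_emphasis` and the default `alpha=0.97` are illustrative assumptions.

```python
def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through.

    alpha=0.97 is a typical textbook value, assumed here (not from the slides).
    """
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


# A constant (zero-frequency) input is almost cancelled after the first sample,
# which is why the filter lifts high frequencies relative to low ones.
print(pre_emphasis([1.0, 1.0, 1.0, 1.0]))
```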
PREPROCESSING : STEPS (CONTD.)

   3) Framing

      23 ms frames, 50% overlap
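
The framing step can be sketched as below. The 23 ms length and 50% overlap come from the slide; the sampling rate is not stated there, so `sample_rate` is left as a parameter, and the name `frame_signal` is hypothetical.

```python
def frame_signal(signal, sample_rate, frame_ms=23.0, overlap=0.5):
    """Split a signal into fixed-length frames that overlap by `overlap`.

    Trailing samples that do not fill a whole frame are dropped
    (an assumption; padding the last frame is another common choice).
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


# Toy usage: 10 samples at 1 kHz, 4 ms frames, so 4-sample frames with a hop of 2.
for frame in frame_signal(list(range(10)), sample_rate=1000, frame_ms=4.0):
    print(frame)
```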
PREPROCESSING : STEPS (CONTD.)

   4) Windowing

   [Figure: a signal frame before windowing, the Hamming window, and the windowed signal]
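
As an illustration of the windowing step, the sketch below builds a Hamming window with the standard definition w[n] = 0.54 - 0.46*cos(2*pi*n/(N-1)) and multiplies it into a frame. The function names are hypothetical.

```python
import math


def hamming(n_points):
    """Standard symmetric Hamming window of length n_points."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def apply_window(frame):
    """Multiply a frame sample-by-sample with a Hamming window of the same length."""
    return [x * w for x, w in zip(frame, hamming(len(frame)))]


# The window tapers the frame edges towards 0.08 and leaves the centre near 1.
print(apply_window([1.0] * 5))
```

Tapering the frame edges reduces the spectral leakage that an abrupt (rectangular) cut would introduce before the FFT.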
FEATURE EXTRACTION


   MFCC : Mel-Frequency Cepstral Coefficients

       Perceptual approach
           The human ear perceives frequency approximately on the Mel scale


       Mel scale : approximately linear up to 1 kHz and logarithmic
        above 1 kHz
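
One widely used analytic form of the Mel scale is mel = 2595 * log10(1 + f/700). The slides do not say which formula the project used, so the sketch below is an assumption that simply shows the behaviour described above (1000 Hz maps to roughly 1000 mel, and higher frequencies are compressed).

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mel (common 2595*log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


for f in (250.0, 500.0, 1000.0, 4000.0, 8000.0):
    print(f, round(hz_to_mel(f), 1))
```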
MFCC EXTRACTION (CONTD.)
   Steps :

    FFT → Mel Filter → Log → DCT → CMS (cepstral mean subtraction)

   [Figure: Mel filter bank]

   Mel Filter : 12 filters
       Absolute FFT coefficients are filtered with a triangular filter bank
        spaced on the Mel scale

   MFCC gives the distribution of energy across the filters in the Mel
    frequency bands
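
The chain FFT → Mel filter → log → DCT → CMS can be sketched end to end in plain Python. This is an illustrative sketch, not the project's code: the naive DFT, the mel formula, the unnormalised DCT-II, the inclusion of coefficient 0, and all function names are assumptions. Only the 12 triangular filters and the order of the steps come from the slides.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def magnitude_spectrum(frame):
    """|DFT| for bins 0..N/2 (naive O(N^2) DFT, adequate for a short demo frame)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    edges = [hz * (n_bins - 1) / nyquist for hz in edges_hz]  # fractional bin index
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=12):
    """FFT magnitude -> mel filter energies -> log -> DCT-II for one frame."""
    spec = magnitude_spectrum(frame)
    bank = mel_filterbank(n_filters, len(spec), sample_rate)
    log_e = [math.log(sum(w * s for w, s in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for c in range(n_filters)]


def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of an utterance."""
    means = [sum(col) / len(frames) for col in zip(*frames)]
    return [[v - mu for v, mu in zip(f, means)] for f in frames]


# Toy usage: two 64-sample sine frames at an assumed 8 kHz sampling rate.
toy = [[math.sin(2 * math.pi * f * t / 8000.0) for t in range(64)]
       for f in (300.0, 1200.0)]
features = cepstral_mean_subtraction([mfcc_frame(fr, 8000) for fr in toy])
print(len(features), len(features[0]))
```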
EXTRA FEATURES : ENERGY AND DELTAS


   Added to achieve a higher recognition rate

   An energy feature

   Delta and delta-delta features (capture co-articulation)

       Delta : velocity feature
       Delta-delta : acceleration feature
COMPOSITION OF FEATURE VECTOR

  12 MFCC Features
  12 Δ MFCC
  12 Δ Δ MFCC
   1 Energy Feature
   1 Δ Energy
   1 Δ Δ Energy

 39 Features from each frame
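
A hedged sketch of how the 39-dimensional vector could be assembled. The slides do not give the delta formula, so a simple central difference with edge replication is assumed; the helper names are hypothetical, and the ordering (statics, then deltas, then delta-deltas) is one possible layout of the same 39 values.

```python
def deltas(frames):
    """delta[t] = (x[t+1] - x[t-1]) / 2, repeating the first/last frame at the edges."""
    n = len(frames)
    out = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return out


def stack_features(mfcc, energy):
    """Build [12 MFCC + energy | 13 deltas | 13 delta-deltas] = 39 values per frame."""
    static = [list(m) + [e] for m, e in zip(mfcc, energy)]
    d1 = deltas(static)   # velocity
    d2 = deltas(d1)       # acceleration
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Toy usage: 5 frames whose 12 coefficients ramp linearly over time.
toy_mfcc = [[float(t)] * 12 for t in range(5)]
toy_energy = [0.0] * 5
vectors = stack_features(toy_mfcc, toy_energy)
print(len(vectors), len(vectors[0]))
```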
Speech Recognition/Verification by

           HMM/VQ
HIDDEN MARKOV MODEL (HMM)

   HMM is an extension of the Markov process

   A Markov process consists of observable states

   An HMM has hidden states and observable symbols
    per state

   HMM is a stochastic model
HMM (CONTD…)

   Parameters

    1) The initial state distribution (π)
    2) State transition probability distribution (A)
    3) Observation symbol probability distribution (B)


   The HMM model : λ = (A, B, π)
EXAMPLE:
PRONUNCIATION MODEL OF THE WORD "TOMATO"

   [Figure: HMM for the word "tomato", λ = (A, B, π)]
HMM IMPLEMENTATION

   Feature vectors → observation symbols (256)

   Phonemes → hidden states (6)

   Left to right HMM

   Discrete Hidden Markov Model (DHMM) with
    Vector Quantization (VQ) technique
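
To make the discrete HMM concrete, the sketch below scores a sequence of observation symbols with the forward algorithm, which computes P(O | λ) for λ = (A, B, π). The toy model has 2 states and 2 symbols rather than the project's 6 states and 256 symbols, its numbers are invented for illustration, and no scaling is applied, so long sequences would underflow.

```python
def forward_likelihood(pi, A, B, observations):
    """P(observations | model) for a discrete HMM via the forward algorithm.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    """
    n_states = len(pi)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    for symbol in observations[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][symbol]
                 for j in range(n_states)]
    return sum(alpha)


# Toy left-to-right model: it starts in state 0 and can never move back from state 1.
pi = [1.0, 0.0]
A = [[0.5, 0.5],
     [0.0, 1.0]]
B = [[0.9, 0.1],
     [0.2, 0.8]]
print(forward_likelihood(pi, A, B, [0, 1]))  # approximately 0.405
```

In a word recognizer of this kind, one such model would typically be trained per word and the word whose model gives the highest likelihood would be chosen.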
SPEECH RECOGNITION SYSTEM
VECTOR QUANTIZATION
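
Vector quantization maps each feature vector to the index of its nearest codeword, and that index becomes the observation symbol for the discrete HMM. A minimal sketch, assuming squared Euclidean distance and a codebook that has already been trained (the slides do not say how the 256-entry codebook was built):

```python
def quantize(vector, codebook):
    """Return the index of the codeword closest to `vector` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


# Toy 3-entry codebook in 2 dimensions (the project used 256 entries).
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
symbols = [quantize(v, toy_codebook) for v in ([0.9, 1.2], [4.0, 4.5], [0.1, -0.2])]
print(symbols)
```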
Speaker Recognition/Verification by

              GMM
SPEAKER VERIFICATION SYSTEM
SPEAKER MODELING (GMM)


 Gaussian Mixture Model
     Parametric probability density function
     Based on soft clustering technique
     Mixture of Gaussian components

      λ = (w_i, μ_i, Σ_i)   (mixture weights, mean vectors, covariance matrices)
SPEAKER MODEL TRAINING


 Estimate the model parameters
 Expectation Maximization algorithm
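
One EM iteration can be sketched for a one-dimensional mixture. The real features are 39-dimensional, so this is a deliberately simplified illustration; the variance floor of 1e-6 and the function name `em_step` are assumptions.

```python
import math


def em_step(data, weights, means, variances):
    """One Expectation-Maximization update for a 1-D Gaussian mixture."""
    k = len(weights)
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        p = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
             for w, m, v in zip(weights, means, variances)]
        total = sum(p)
        resp.append([value / total for value in p])
    # M-step: re-estimate weights, means and variances from the responsibilities.
    n_k = [sum(r[j] for r in resp) for j in range(k)]
    new_w = [n / len(data) for n in n_k]
    new_m = [sum(r[j] * x for r, x in zip(resp, data)) / n_k[j] for j in range(k)]
    new_v = [max(sum(r[j] * (x - new_m[j]) ** 2 for r, x in zip(resp, data)) / n_k[j],
                 1e-6)  # floor keeps a component from collapsing to zero variance
             for j in range(k)]
    return new_w, new_m, new_v


# Toy usage: two well-separated clusters around -2 and +2.
data = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]
w, m, v = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
for _ in range(20):
    w, m, v = em_step(data, w, m, v)
print(w, m, v)
```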
SPEAKER VERIFICATION


   Based on likelihood ratio

                        likelihood that the utterance comes from the speaker's model
   Likelihood ratio = ----------------------------------------------------------------
                        likelihood that the utterance comes from an impostor's model
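
In practice the ratio is usually evaluated in the log domain and compared with a threshold. The sketch below assumes per-frame log-likelihoods are already available from the claimed speaker's model and from an impostor model; the averaging over frames and the threshold value are illustrative assumptions, not figures from the project.

```python
def verify(log_lik_speaker, log_lik_impostor, threshold=0.0):
    """Accept the claim when the average per-frame log-likelihood ratio exceeds threshold.

    Both arguments are lists of per-frame log-likelihoods of the same utterance.
    Returns (average log-likelihood ratio, accept decision).
    """
    n_frames = len(log_lik_speaker)
    llr = sum(s - b for s, b in zip(log_lik_speaker, log_lik_impostor)) / n_frames
    return llr, llr > threshold


# Toy usage with made-up per-frame scores.
score, accepted = verify([-10.0, -12.0], [-11.0, -15.0])
print(score, accepted)
```

Raising the threshold lowers the chance of accepting an impostor at the cost of rejecting more genuine speakers.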
TOOLS USED

 Languages and frameworks:
     Adobe Flex
     Java
     BlazeDS for RPC


 Servers:
     Apache Tomcat
     MySQL


 Versioning:
     TortoiseSVN
OUTPUT : SNAPSHOT (GUI)
APPLICATION AREAS


   Telephone transactions
     Telephone credit card purchases
     Telephone stock trading


   Access control
       Physical facilities
       Computer networks
       Information retrieval
       Customer information


   Forensics
       Voice sample matching
LIMITATION AND FUTURE ENHANCEMENT


   Noise reduction

   Training on more data

   Combine with
     other features
     other classification methods
Thanks


   Any queries ?

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System Final Presentation Slide

  • 1. Tribhuvan University, Institute of Engineering, Pulchowk Campus, Department of Electronics and Computer Engineering
      MAJOR PROJECT FINAL PRESENTATION: TEXT PROMPTED REMOTE SPEAKER AUTHENTICATION
      - Project Supervisor: Dr. Subarna Shakya, Associate Professor
      - Project Members: Ganesh Tiwari (75010), Madhav Pandey (75014), Manoj Shrestha (75018)
      - Internal Examiner: Er. Manoj Ghimire
      - External Examiner: Er. Bimal Acharya
  • 2. INTRODUCTION
      - Voice biometric system
          - User login
      - Text-prompted system
          - Claimant is asked to speak a prompted (random) text
          - Speech and speaker recognition
      - Why text-prompted?
          - Playback attack
  • 3. OUR SYSTEM
      - Feature: MFCC
      - Modeling and classification: both statistical
          - GMM: speaker modeling
          - HMM/VQ: speech modeling
  • 4. PROPERTIES OF SPEECH SIGNAL
      - Carries both speech content and speaker identity
      - What makes a speech signal unique?
          - Each phoneme resonates at its own fundamental frequency and its harmonics
          - Studied over a short period: short-time spectral analysis
      - What is speaker-dependent information?
          - Primarily the fundamental frequency, which is a function of
              - the dimensions and tension of the vocal cords
              - the size and shape of the mouth, throat, nose, and teeth
          - Studied over a long period: all the variations from that speaker
  • 5. UNIQUENESS IN PHONEME
      [Figure: waveforms of phoneme /ah/ and phoneme /i:/; amplitude versus samples]
  • 7. PREPROCESSING: STEPS
      1) Silence removal
      [Figure: signal with silence and signal with silence removed; amplitude versus samples]
  • 8. PREPROCESSING: STEPS (CONTD.)
      1) Silence removal
      2) Pre-emphasis
      [Figure: magnitude spectrum |Y(f)| versus frequency (Hz), with suppressed high frequencies before pre-emphasis and boosted high frequencies after]
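The high-frequency boost from pre-emphasis can be sketched with a first-order difference filter. This is a minimal sketch, not the project's code: the filter form y[n] = x[n] - alpha * x[n-1] and the value alpha = 0.95 are assumptions, since the slides do not state the filter used.

```python
def pre_emphasize(signal, alpha=0.95):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha=0.95 is an assumed, commonly used value (not given in the slides).
    The first sample is passed through unchanged.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


# A constant (low-frequency) input is attenuated, while a rapidly
# alternating (high-frequency) input is amplified.
flat = pre_emphasize([1.0, 1.0, 1.0])
alternating = pre_emphasize([1.0, -1.0, 1.0])
```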
  • 9. PREPROCESSING: STEPS (CONTD.)
      1) Silence removal
      2) Pre-emphasis
      3) Framing
          - 23 ms frames, 50% overlap
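Framing with 23 ms frames and 50% overlap could look like the sketch below. The sampling rate of 22050 Hz is an assumption (the slides do not state it), and trailing samples that do not fill a whole frame are simply dropped here.

```python
def frame_signal(signal, sample_rate=22050, frame_ms=23.0, overlap=0.5):
    """Split a signal into fixed-length, overlapping frames.

    sample_rate is an assumed value; frame_ms and overlap follow the slide.
    Samples after the last complete frame are discarded.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = max(1, int(frame_len * (1.0 - overlap)))
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append(signal[start:start + frame_len])
    return frames


# 2000 samples at 22050 Hz: frames of 507 samples, advancing by 253 samples.
frames = frame_signal(list(range(2000)))
```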
  • 10. PREPROCESSING: STEPS (CONTD.)
      1) Silence removal
      2) Pre-emphasis
      3) Framing
      4) Windowing
      [Figure: a signal frame, the Hamming window, and the windowed signal]
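The Hamming window tapers each frame toward its edges before the FFT. A minimal sketch using the standard definition w[n] = 0.54 - 0.46 cos(2πn / (N - 1)); the function names are illustrative only.

```python
import math


def hamming(n_points):
    """Symmetric Hamming window of n_points samples."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def apply_window(frame):
    """Multiply a frame sample-by-sample with a Hamming window of equal length."""
    window = hamming(len(frame))
    return [s * w for s, w in zip(frame, window)]


w = hamming(5)
windowed = apply_window([1.0, 1.0, 1.0, 1.0, 1.0])
```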
  • 11. FEATURE EXTRACTION
      - MFCC: Mel-Frequency Cepstral Coefficients
      - Perceptual approach
          - The human ear processes audio signals on the Mel scale
          - Mel scale: approximately linear up to 1 kHz and logarithmic above 1 kHz
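The shape of the Mel scale can be illustrated with one widely used approximation, mel = 2595 log10(1 + f / 700). The slides do not say which formula the project used, so this particular mapping is an assumption.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to Mel (a common approximation, assumed here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# By construction 1000 Hz maps to roughly 1000 Mel; doubling the frequency
# above that point adds far less than 1000 Mel, showing the compression.
mel_1k = hz_to_mel(1000.0)
mel_2k = hz_to_mel(2000.0)
```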
  • 12. MFCC EXTRACTION (CONTD.)
      - Steps: FFT, Mel filter, log, DCT, CMS (cepstral mean subtraction)
      - Mel filters: 12
      - The absolute FFT coefficients are filtered with a triangular filter bank on the Mel scale
      - MFCC gives the distribution of energy according to the filters in the Mel frequency bands
      [Figure: Mel filter bank]
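The chain Mel filter, log, DCT, CMS can be sketched as follows, starting from the FFT magnitudes of one frame. This is an illustration under assumptions, not the project's implementation: the Mel formula, the bin layout (n_bins magnitudes covering 0 Hz to half the sampling rate), the 22050 Hz rate, and keeping DCT outputs from index 0 are all choices made here. The slides do not say whether the zeroth coefficient was kept.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_filters + 2 edge points, expressed as fractional FFT-bin positions.
    edges = [mel_to_hz(max_mel * i / (n_filters + 1)) / nyquist * (n_bins - 1)
             for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_from_spectrum(magnitudes, bank, n_ceps):
    """Filter-bank energies -> log -> DCT-II, keeping n_ceps coefficients."""
    energies = [sum(w * m for w, m in zip(row, magnitudes)) for row in bank]
    logs = [math.log(e + 1e-10) for e in energies]  # small offset avoids log(0)
    n = len(logs)
    return [sum(logs[j] * math.cos(math.pi * i * (j + 0.5) / n) for j in range(n))
            for i in range(n_ceps)]


def cepstral_mean_subtract(frames):
    """CMS: subtract the per-coefficient mean taken over all frames."""
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


bank = mel_filterbank(12, 129, 22050)
coeffs = mfcc_from_spectrum([1.0] * 129, bank, 12)
```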
  • 13. EXTRA FEATURES: ENERGY AND DELTAS
      - For achieving a high recognition rate
      - An energy feature
      - Delta and delta-delta (co-articulation)
          - Delta: velocity feature
          - Double delta: acceleration feature
  • 14. COMPOSITION OF FEATURE VECTOR
      - 12 MFCC features
      - 12 Δ MFCC
      - 12 ΔΔ MFCC
      - 1 energy feature
      - 1 Δ energy
      - 1 ΔΔ energy
      - Total: 39 features from each frame
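One common way to obtain the delta (velocity) and delta-delta (acceleration) features is a regression over neighbouring frames, after which the 13 static values (12 MFCC plus energy) expand to 39. The regression formula, the window of 2 frames, and the edge handling below are assumptions; the slides only give the resulting composition.

```python
def deltas(frames, window=2):
    """Regression deltas: d_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2).

    Frames beyond either end are replaced by the first or last frame.
    window=2 is an assumed value.
    """
    n_frames = len(frames)
    dims = len(frames[0])
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(n_frames):
        row = []
        for d in range(dims):
            num = 0.0
            for n in range(1, window + 1):
                nxt = frames[min(t + n, n_frames - 1)][d]
                prv = frames[max(t - n, 0)][d]
                num += n * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out


def compose_features(static_frames):
    """Concatenate static, delta and delta-delta features per frame."""
    d1 = deltas(static_frames)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static_frames, d1, d2)]


# Six frames of 13 static values (12 MFCC + 1 energy), rising by 1 per frame.
static = [[float(t)] * 13 for t in range(6)]
full = compose_features(static)
```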
  • 16. HIDDEN MARKOV MODEL (HMM)
      - An HMM is an extension of the Markov process
      - A Markov process consists of observable states
      - An HMM has hidden states and observable symbols per state
      - An HMM is a stochastic model
  • 17. HMM (CONTD.)
      - Parameters
          1) The initial state distribution (π)
          2) The state transition probability distribution (A)
          3) The observation symbol probability distribution (B)
      - The HMM model: $\lambda = (A, B, \pi)$
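To show how the three parameters π, A and B work together, the sketch below evaluates the probability of a discrete observation sequence with the forward algorithm. It is a minimal illustration, not the project's code; it omits the scaling that long sequences need to avoid underflow, and the two-state numbers are made up for the example.

```python
def forward_probability(obs, pi, A, B):
    """P(obs | lambda) for a discrete HMM with lambda = (A, B, pi).

    pi[i]   : probability of starting in state i
    A[i][j] : probability of moving from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    """
    n_states = len(pi)
    # Initialisation with the first observation.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    # Induction over the remaining observations.
    for symbol in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][symbol]
                 for j in range(n_states)]
    return sum(alpha)


# Hypothetical two-state, two-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
p = forward_probability([0, 1], pi, A, B)
```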
  • 18. EXAMPLE: PRONUNCIATION MODEL OF WORD TOMATO
      - $\lambda = (A, B, \pi)$
  • 19. HMM IMPLEMENTATION
      - Feature vectors: observation symbols, 256
      - Phonemes: hidden states, 6
      - Left-to-right HMM
      - Discrete Hidden Markov Model (DHMM) with the Vector Quantization (VQ) technique
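In a discrete HMM, vector quantization turns each feature vector into the index of its nearest codebook entry, and a left-to-right topology only allows a state to repeat or advance. The sketch below illustrates both ideas. The Euclidean distance, the tiny codebook, the no-skip topology and the initial stay probability of 0.5 are assumptions; the slides only give the sizes (256 symbols, 6 states).

```python
def quantize(vector, codebook):
    """Return the index of the codeword closest to vector (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def left_to_right_transitions(n_states, stay=0.5):
    """Initial transition matrix: each state may repeat or move to the next one."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        A[i][i] = stay
        A[i][i + 1] = 1.0 - stay
    A[-1][-1] = 1.0  # the final state can only repeat
    return A


# Toy codebook with 3 entries; the system on the slide uses 256.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
symbol = quantize([0.9, 1.2], codebook)
A = left_to_right_transitions(6)
```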
  • 24. SPEAKER MODELING (GMM)
      - Gaussian Mixture Model
      - Parametric probability density function
      - Based on a soft clustering technique
      - Mixture of Gaussian components: $\lambda = (w_i, \mu_i, \Sigma_i)$
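A GMM scores a feature vector as a weighted sum of Gaussian densities. The sketch below evaluates the log of that sum for diagonal covariances. Restricting Σ_i to a diagonal is an assumption made here for brevity, since the slides do not state the covariance type.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """log p(x | lambda) for a diagonal-covariance Gaussian mixture.

    weights[i]   : mixture weight w_i (weights sum to 1)
    means[i]     : mean vector mu_i
    variances[i] : diagonal of Sigma_i (assumed diagonal in this sketch)
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xd, md, vd in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
        log_terms.append(log_p)
    # Log-sum-exp keeps the sum stable when the terms are very small.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


# One standard normal component evaluated at its mean.
single = gmm_log_likelihood([0.0], [1.0], [[0.0]], [[1.0]])
```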
  • 25. SPEAKER MODEL TRAINING
      - Estimate the model parameters
      - Expectation Maximization (EM) algorithm
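One EM iteration alternates between computing how responsible each component is for each sample (E-step) and re-estimating the weights, means and variances from those responsibilities (M-step). The sketch below does this for a one-dimensional mixture to keep it short. The initial values, the variance floor and the fixed iteration count are illustrative assumptions.

```python
import math


def gaussian(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_step(data, weights, means, variances, var_floor=1e-6):
    """One EM iteration for a 1-D Gaussian mixture; returns updated parameters."""
    k = len(weights)
    # E-step: responsibility of component i for each sample.
    resp = []
    for x in data:
        p = [weights[i] * gaussian(x, means[i], variances[i]) for i in range(k)]
        total = sum(p)
        resp.append([v / total for v in p])
    # M-step: responsibility-weighted re-estimation.
    new_w, new_m, new_v = [], [], []
    for i in range(k):
        n_i = sum(r[i] for r in resp)
        mean = sum(r[i] * x for r, x in zip(resp, data)) / n_i
        var = sum(r[i] * (x - mean) ** 2 for r, x in zip(resp, data)) / n_i
        new_w.append(n_i / len(data))
        new_m.append(mean)
        new_v.append(max(var, var_floor))  # floor guards against collapse
    return new_w, new_m, new_v


# Two well-separated clusters, rough initial guesses.
data = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]
w, m, v = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
for _ in range(20):
    w, m, v = em_step(data, w, m, v)
```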
  • 26. SPEAKER VERIFICATION
      - Based on the likelihood ratio:
        $\text{Likelihood ratio} = \dfrac{\text{likelihood that } S \text{ comes from the claimed speaker's model}}{\text{likelihood that } S \text{ comes from the impostor's model}}$
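In practice the ratio is often evaluated in the log domain, where it becomes a difference of log-likelihoods that is compared with a threshold. The sketch below assumes per-frame log-likelihoods have already been computed under the claimed speaker's model and an impostor model. The per-frame averaging and the threshold value are assumptions, not details from the slides.

```python
def verify(claimant_frame_ll, impostor_frame_ll, threshold=0.0):
    """Accept or reject a claim from per-frame log-likelihoods.

    claimant_frame_ll : log p(frame | claimed speaker's model), one per frame
    impostor_frame_ll : log p(frame | impostor model), one per frame
    threshold         : assumed decision threshold on the average log ratio
    Returns (score, accepted).
    """
    n_frames = len(claimant_frame_ll)
    score = (sum(claimant_frame_ll) - sum(impostor_frame_ll)) / n_frames
    return score, score >= threshold


# Frames fit the claimed speaker better than the impostor model: accepted.
score, accepted = verify([-1.0, -1.0], [-3.0, -3.0])
```

Raising the threshold makes the system reject more impostors at the cost of rejecting more genuine speakers, so it would normally be tuned on held-out data.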
  • 27. TOOLS USED
      - Languages:
          - Adobe Flex
          - Java
          - BlazeDS for RPC
      - Servers:
          - Apache Tomcat
          - MySQL
      - Versioning:
          - TortoiseSVN
  • 29. APPLICATION AREAS
      - Telephone transactions
          - Telephone credit card purchases
          - Telephone stock trading
      - Access control
          - Physical facilities
          - Computer networks
      - Information retrieval
          - Customer information
      - Forensics
          - Voice sample matching
  • 30. LIMITATIONS AND FUTURE ENHANCEMENTS
      - Noise reduction
      - Training on more data
      - Combining with
          - other features
          - other classification methods
  • 31. Thanks. Any queries?