ISSN: 2277 – 9043
International Journal of Advanced Research in Computer Science and Electronics Engineering
Volume 1, Issue 4, June 2012

SPEAKER RECOGNITION IN NOISY ENVIRONMENT

Mr. Mohammed Imdad N1, Dr. Shameem Akhtar N1, Prof. Mohammad Imran Akhtar2
1 Computer Science and Engineering Department, KBN College of Engineering, Gulbarga, India
2 Electronics and Communication Department, AITM, Bhatkal

Abstract--- This paper investigates the problem of speaker identification and verification in noisy conditions, assuming that speech signals are corrupted by noise. It describes a method that combines multi-condition model training and missing-feature theory to model noise with unknown temporal-spectral characteristics. Such a technique is very useful since it eases the problem of recognizing a voice, and it can readily be deployed because the user is not required to remember a login password, and hence there is no chance of a password being stolen.

Index Terms— Cepstrum, Missing-feature method, Multi-condition model training, Vector quantization

I. INTRODUCTION

Spoken language is the most natural way for humans to communicate information. The speech signal conveys several types of information. From the speech production point of view, it conveys linguistic information (e.g., message and language) and speaker information (e.g., emotional, regional, and physiological characteristics). From the speech perception point of view, it also conveys information about the environment in which the speech was produced and transmitted. Even though this wide range of information is encoded in a complex form in the speech signal, humans can easily decode most of it. Speech technology has found wide application in automatic dictation, voice command control, audio archive indexing and retrieval, and similar tasks.

The speech signal thus conveys several levels of information. Primarily, it conveys the words or message being spoken; on a secondary level, it also conveys the identity of the speaker. The area of speaker recognition is concerned with extracting the identity of the person speaking an utterance. As speech interaction with computers becomes more pervasive in activities such as telephone transactions and information retrieval from speech databases, the utility of automatically recognizing a speaker from his vocal characteristics increases.

Speaker recognition covers two fields: Speaker Identification (SI) and Speaker Verification (SV). In speaker identification, the goal is to determine which one of a group of known voices best matches the input voice sample. There are two tasks: text-dependent and text-independent speaker identification. In text-dependent identification the spoken phrase is known to the system, whereas in the text-independent case the spoken phrase is unknown. Success in both identification tasks depends on extracting and modeling the speaker-dependent characteristics of the speech signal, which can effectively distinguish between talkers.

II. WORKING OF A SPEAKER RECOGNITION SYSTEM

Like most pattern recognition problems, a speaker recognition system can be partitioned into two modules: feature extraction and classification. The classification module has two components: pattern matching and decision. The feature extraction module estimates a set of features from the speech signal that represent speaker-specific information. This speaker-specific information is the result of complex transformations occurring at different levels of speech production: semantic, phonologic, phonetic, and acoustic.

Figure 1: Generic speaker recognition system

The pattern matching module is responsible for comparing the estimated features to the speaker models.
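As an illustration of this two-module partition, the minimal sketch below separates feature extraction, pattern matching, and decision into distinct functions. It is an illustrative assumption on our part, not the paper's implementation: it uses a toy log-energy feature and NumPy in place of the MFCC front end and VQ matching described later.

```python
import numpy as np

def extract_features(signal, frame_len=256, hop=128):
    """Feature extraction module: cut the signal into short frames
    and summarize each frame (here, a toy log-energy feature)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([[np.log(np.sum(f ** 2) + 1e-10)] for f in frames])

def match_score(features, model):
    """Pattern-matching component: average distance between the frame
    features and a stored speaker model (a set of reference vectors)."""
    dists = np.min(np.linalg.norm(features[:, None, :] - model[None, :, :],
                                  axis=2), axis=1)
    return float(np.mean(dists))

def decide(features, models):
    """Decision component: pick the enrolled speaker whose model
    gives the smallest average distortion."""
    return min(models, key=lambda name: match_score(features, models[name]))
```

Here each speaker model is simply a small set of reference feature vectors; the decision step picks the enrolled speaker with the lowest average distance, mirroring the feature-extraction/classification split described above.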
There are many types of pattern matching methods, and correspondingly many types of models, used in speaker recognition [13]. Some of the methods include hidden Markov models (HMM), dynamic time warping (DTW), and vector quantization (VQ).

III. SPEAKER RECOGNITION PRINCIPLES

Depending on the application, the general area of speaker recognition can be divided into three specific tasks: identification, detection/verification, and segmentation and clustering. The goal of the speaker identification task is to determine which speaker out of a group of known speakers produced the input voice sample. There are two modes of operation related to the set of known voices: closed-set mode and open-set mode. In closed-set mode, the system assumes that the voice to be identified must come from the set of known voices; otherwise, the system is in open-set mode. Closed-set speaker identification can be considered a multiple-class classification problem. In open-set mode, speakers that do not belong to the set of known voices are referred to as impostors. This task can be used for forensic applications; e.g., speech evidence can be used to recognize the perpetrator's identity among several known suspects.

In speaker verification, the goal is to determine whether a person is who he or she claims to be according to his or her voice sample. This task is also known as voice verification or authentication, speaker authentication, talker verification or authentication, and speaker detection.

Speaker segmentation and clustering techniques are also used in multiple-speaker recognition scenarios. In many speech recognition applications it is often assumed that the speech from a particular individual is available for processing. When this is not the case, and the speech from the desired speaker is intermixed with that of other speakers, it is desirable to segregate the speech into per-speaker segments before the recognition process commences. The goal of this task is therefore to divide the input audio into homogeneous segments and then label them by speaker identity. Recently, this task has received more attention due to the increased inclusion of multiple-speaker audio, such as recorded news shows or meetings, in commonly used web searches and consumer electronic devices. Speaker segmentation and clustering is one way to index audio archives so as to make retrieval easier. According to the constraints placed on the speech used to train and test the system, automatic speaker recognition can be further classified into text-dependent and text-independent tasks.

IV. SPEECH FEATURE EXTRACTION

The purpose of this module is to convert the speech waveform to some type of parametric representation (at a considerably lower information rate) for further analysis and processing. This is often referred to as the signal-processing front end. The speech signal is a slowly time-varying signal (it is called quasi-stationary). An example of a speech signal is shown in Figure 2. When examined over a sufficiently short period of time (between 5 and 100 msec), its characteristics are fairly stationary. However, over long periods of time (on the order of 1/5 second or more) the signal characteristics change to reflect the different speech sounds being spoken. Therefore, short-time spectral analysis is the most common way to characterize the speech signal.

Figure 2: An example of speech signal.

A wide range of possibilities exists for parametrically representing the speech signal for the speaker recognition task, such as Linear Prediction Coding (LPC), Mel-Frequency Cepstrum Coefficients (MFCC), and others. MFCC is perhaps the best known and most popular, and it is used in this project. The MFCC feature extraction technique is based on the known variation of the human ear's critical bandwidths with frequency: filters spaced linearly at low frequencies and logarithmically at high frequencies are used to capture the phonetically important characteristics of speech. This is expressed in the mel-frequency scale, which has linear frequency spacing below 1000 Hz and logarithmic spacing above 1000 Hz. The process of computing MFCCs is described in more detail next.

V. Mel-Frequency Cepstrum Coefficients Processor

A block diagram of the structure of an MFCC processor is given in Figure 3. The speech input is typically recorded at a sampling rate above 16000 Hz. This sampling frequency was chosen to minimize the effects of aliasing in the analog-to-digital conversion.

Figure 3: MFCC Processor.
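The mel warping and short-time framing just described can be sketched numerically. The formula mel(f) = 2595 · log10(1 + f/700) is a commonly used realization of the mel scale (approximately linear below 1 kHz, logarithmic above); the frame and hop sizes below (25 ms / 10 ms) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    """Common mel-scale formula: roughly linear below 1 kHz,
    logarithmic above."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def frame_signal(x, fs, frame_ms=25, hop_ms=10):
    """Short-time analysis: split x into overlapping frames over which
    the signal can be treated as quasi-stationary."""
    n, hop = int(fs * frame_ms / 1000), int(fs * hop_ms / 1000)
    return np.stack([x[i:i + n] for i in range(0, len(x) - n + 1, hop)])

def mel_filter_edges(fs, n_filters=20):
    """Edge frequencies of triangular filters spaced evenly on the
    mel scale from 0 Hz up to the Nyquist frequency."""
    mels = np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2)
    return mel_to_hz(mels)
```

For a 16 kHz input, `frame_signal` yields 400-sample frames every 160 samples, and `mel_filter_edges` places the filter bank edges densely at low frequencies and sparsely at high ones, exactly the behavior the mel scale is meant to capture.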
VI. Vector Quantization

Vector quantization is a feature matching technique used in speaker recognition. Here the VQ approach is used, due to its ease of implementation and high accuracy. VQ is a process of mapping vectors from a large vector space to a finite number of regions in that space. Each region is called a cluster and can be represented by its center, called a codeword. The collection of all codewords is called a codebook. Figure 4 shows a conceptual diagram illustrating this recognition process. In the figure, only two speakers and two dimensions of the acoustic space are shown.

Figure 4: Conceptual diagram illustrating vector quantization codebook formation.

One speaker can be discriminated from another based on the location of the centroids. In the training phase, a speaker-specific VQ codebook is generated for each known speaker by clustering his or her training acoustic vectors. The resulting codewords (centroids) are shown in the figure by black circles and black triangles for speakers 1 and 2, respectively. The distance from a vector to the closest codeword of a codebook is called the VQ distortion. In the recognition phase, an input utterance of an unknown voice is "vector-quantized" using each trained codebook and the total VQ distortion is computed. The speaker corresponding to the VQ codebook with the smallest total distortion is identified.

After the enrolment session, the acoustic vectors extracted from a speaker's input speech provide a set of training vectors. As described above, the next important step is to build a speaker-specific VQ codebook for this speaker using those training vectors. There is a well-known algorithm, namely the LBG algorithm [Linde, Buzo and Gray, 1980], for clustering a set of L training vectors into a set of M codebook vectors.

VII. SPEAKER MODELLING

Speaker modelling deals with designing the speaker models used for voice recognition. It mainly consists of two phases, training and testing, and both phases depend mainly on feature extraction and parameter matching.

Let ø0 denote the training data set, containing clean speech data, for speaker S, and let p(X | S, ø0) represent the likelihood function of frame feature vector X associated with speaker S trained on data set ø0. In this paper, we assume that each frame vector X consists of N subband features: X = (x1, x2, …, xN), where xn represents the feature for the nth subband. These are obtained by dividing the whole speech frequency band into N subbands and then calculating the feature coefficients for each subband independently of the other subbands. The subband feature framework has been used in speech recognition to keep local frequency-band corruption from spreading into the features of the other bands.

The proposed approach for modeling noise includes two steps. The first step is to generate multiple copies of training set ø0 by introducing corruption of different characteristics into ø0. Primarily, we could add white noise at various signal-to-noise ratios (SNRs) to the clean training data to simulate the corruption. Assume that this leads to augmented training sets ø0, ø1, …, øL, where øl denotes the lth training set derived from ø0 with the inclusion of a certain noise condition. Then a new likelihood function for the test frame vector can be formed by combining the likelihood functions trained on the individual training sets:

p(X | S) = Σ_{l=0}^{L} p(X | S, øl) P(øl | S)    (1)

where p(X | S, øl) is the likelihood function of frame vector X trained on set øl, and P(øl | S) is the prior probability of the occurrence of noise condition øl for speaker S. Equation (1) is a multicondition model. A recognition system based on (1) should have improved robustness to the noise conditions seen in the training sets øl, as compared to a system based on p(X | S, ø0).

The second step of the new approach is to make (1) robust to noise conditions not fully matched by the training sets øl, without assuming extra noise information. One way to do this is to ignore the heavily mismatched subbands and base the score only on the matching subbands. Let X = (x1, x2, …, xN) be a test frame vector and let Xl ⊂ X be the subset containing all the subband features matched to noise condition øl. Then, using Xl in place of X as the test vector for each training noise condition, (1) can be redefined as

p(X | S) = Σ_{l=0}^{L} p(Xl | S, øl) P(øl | S)    (2)

where p(Xl | S, øl) is the marginal likelihood of the matching feature subset Xl, derived from p(X | S, øl) with the mismatched subband features ignored, to improve mismatch robustness between the test frame X and the training noise condition.

VIII. SPEAKER VERIFICATION

Speaker verification is the process of automatically verifying who is speaking on the basis of individual information included in the speech waves. This technique makes it possible to use the speaker's voice to verify their identity and control
ISSN: 2277 – 9043, International Journal of Advanced Research in Computer Science and Electronics Engineering, Volume 1, Issue 4, June 2012

access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers. Speaker recognition can be classified into identification and verification. Speaker identification is the process of determining which registered speaker provides a given speech. Speaker verification, on the other hand, is the process of accepting or rejecting the identity claim of a speaker.

At the highest level, all speaker recognition systems contain two main modules: feature extraction and feature matching. Feature extraction is the process that extracts a small amount of data from the voice signal that can later be used to represent each speaker. Feature matching involves the actual procedure of identifying the unknown speaker by comparing the features extracted from his/her voice input with those from a set of known speakers.

All speaker recognition systems have to serve two distinct phases. The first is referred to as the enrollment sessions or training phase, while the second is referred to as the operation sessions or testing phase. In the training phase, each registered speaker has to provide samples of their speech so that the system can build or train a reference model for that speaker. In the case of speaker verification systems, in addition, a speaker-specific threshold is also computed from the training samples.

IX. RESULTS

The experiment was conducted using three voice signals from each person, under different levels of environmental noise. After the input speech was captured through the microphone, feature vector transformation of the input voice took place for the purposes of testing and training. Snapshots of the corresponding experiment running, and of the decision making for the corresponding speaker identification and verification, are shown below.

Snapshot 1. Four push buttons are provided, named Add, Remove, Recognize, and Exit. The Add push button performs add-to-database; similarly, the Remove push button performs remove-from-database.

Snapshot 2. An example of adding a voice named IMRAN1 using the top push button, to add the voice sample of the respective user. After this, click the Record File push button.

Snapshot 3. A prompt for recording the voice signal is displayed, asking permission to record the voice of the concerned user. Click the Yes push button to record the voice.

Snapshot 4. After recording the voice, a prompt for playing the voice signal is displayed. Click the Yes push button to play back the recorded voice.
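The enrollment flow just shown (add a user, record a voice sample, store its model in the database) amounts to training a VQ codebook from the recorded acoustic vectors, as described in the modelling section. The following is a minimal sketch of LBG codebook training; the toy 12-dimensional feature vectors and the codebook size M = 4 are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def lbg_codebook(vectors, m=4, eps=0.01, iters=10):
    """LBG clustering [Linde, Buzo and Gray, 1980]: start from the global
    centroid, repeatedly split every codeword, then refine with k-means steps."""
    codebook = vectors.mean(axis=0, keepdims=True)
    while len(codebook) < m:
        # split each codeword into a perturbed pair
        codebook = np.concatenate([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(iters):
            # assign each training vector to its nearest codeword
            dists = np.linalg.norm(vectors[:, None] - codebook[None], axis=2)
            nearest = dists.argmin(axis=1)
            # move each codeword to the centroid of its assigned vectors
            for k in range(len(codebook)):
                if (nearest == k).any():
                    codebook[k] = vectors[nearest == k].mean(axis=0)
    return codebook

# Toy stand-in for the acoustic vectors extracted from one user's recording.
training_vectors = rng.normal(size=(200, 12))
codebook = lbg_codebook(training_vectors)
print(codebook.shape)   # (4, 12): M = 4 codewords of dimension 12
```

In the application, one such codebook per enrolled speaker would be stored in the database by the Add push button.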
Snapshot 5. The time graph of the voice signal and the spectrum of the noise signal then appear as two separate figures, one showing the speech signal varying with time and the other the concerned noise added to it.

Snapshot 6. Click the Recognize push button to recognize the speaker and compare the frequency template of the speaker in the database with the present input speech signal.

Snapshot 7. A Record Voice Signal prompt containing the push button Speak Now will appear. Click the Yes push button to record your voice for further comparison.

Snapshot 8. A prompt containing Playing Voice Signal will appear. Click the Yes push button to play back your recorded voice.

Snapshot 9. Two separate figures then appear: the time graph of the voice signal and the spectrum signal.

Snapshot 10. A figure showing the match between the calculated codebook and the best-matching stored codebook (MFCC) appears.
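The recognition step compared in Snapshot 10 picks the stored codebook with the smallest total VQ distortion, and Snapshots 11 and 12 then report a match or a rejection. A self-contained sketch of that decision follows; the randomly generated stand-in codebooks, the speaker names, and the acceptance threshold are all illustrative, not taken from the experiments.

```python
import numpy as np

rng = np.random.default_rng(1)

def total_distortion(vectors, codebook):
    """Sum over frames of the distance to the closest codeword (total VQ distortion)."""
    dists = np.linalg.norm(vectors[:, None] - codebook[None], axis=2)
    return dists.min(axis=1).sum()

# Stand-in database: one codebook (4 codewords x 12 dims) per enrolled speaker.
database = {
    "IMRAN1": rng.normal(loc=0.0, size=(4, 12)),
    "IMDAD1": rng.normal(loc=5.0, size=(4, 12)),
}

# Test utterance: frames scattered tightly around IMRAN1's codewords.
frames = (database["IMRAN1"][rng.integers(0, 4, size=50)]
          + 0.1 * rng.normal(size=(50, 12)))

# Identification: the codebook with the smallest total distortion wins.
scores = {name: total_distortion(frames, cb) for name, cb in database.items()}
best = min(scores, key=scores.get)
print(best)   # -> "IMRAN1" for this toy data

# Verification: accept only if the per-frame distortion beats a threshold
# (a speaker-specific value computed from training samples in a real system).
threshold = 1.0   # illustrative
print("recognized" if scores[best] / len(frames) < threshold else "not recognized")
```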
Snapshot 11. Shows whom the voice matches and the time taken, in seconds, to match the two voices.

Snapshot 12. Shows the decision that the input voice is not present in the database, and hence the speaker is not recognized.

X. CONCLUSION

Speaker recognition can be used to verify one's identity when the interface favors the use of a telephone or microphone. With proper expectations, planning, and education, speaker verification has already proven to be a natural yet very secure solution for verifying one's identity. Voice analysis technology has been around for years.
Applying it used to be tougher than rocket science. Now you can get all the benefits of advanced technology without the complexity and overhead of managing gigabytes of voice reference data, dealing with advanced speech technology, and worrying about the legal issues involved.

1. This technique has been used for speaker recognition, to identify the user from the speaker's voice.
2. This technique makes it possible to use the speaker's voice to verify their identity and control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, and security control for confidential information areas.

Mohammed Imdad N received the B.E. in Electronics and Communication from VTU, Belgaum. He is presently pursuing the M.Tech in Computer Science and Engineering from VTU, Belgaum, and is working on a project to develop speaker recognition in noisy environments for his PG thesis under the guidance of Dr. Shameem Akhtar N.

Mohammad Imran Akhtar received the B.E. in Information Technology from MG University, Kottayam. He completed the M.Tech in Digital Communication and Networking from UBDT College of Engineering. He is an Assistant Professor in the Electronics and Communication department of AITM Bhatkal. His main research interests include speech processing and image processing.

Dr. Shameem Akhtar N received the B.E. in Computer Science and Engineering from Gulbarga University, the M.Tech in Computer Science and Engineering from VTU, Belgaum, and the Ph.D. in digital image processing from Gitam University. She has more than 10 years of experience in teaching and research. She is a life member of the Indian Society for Technical Education. She is an Assistant Professor in the department of Computer Science and Engineering at KBN College of Engineering.
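As a closing illustration, the multicondition scoring of equations (1) and (2) from the speaker-modelling section can be sketched numerically. The diagonal-Gaussian condition models p(X / S, øl), the priors P(øl / S), and the subband masks below are all illustrative assumptions, not values from the experiments.

```python
import numpy as np

rng = np.random.default_rng(2)

# n = 6 subband features; L + 1 = 3 training noise conditions ph0..ph2,
# each modelled here by an illustrative diagonal Gaussian p(X | S, phi_l).
n = 6
means = rng.normal(size=(3, n))
var = 0.5
priors = np.array([0.5, 0.25, 0.25])   # P(phi_l | S), sums to one

def cond_density(x, mean):
    """Per-feature Gaussian densities of one condition model."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def eq1(x):
    """Equation (1): p(X|S) = sum_l p(X | S, phi_l) P(phi_l | S)."""
    return sum(p * cond_density(x, m).prod() for p, m in zip(priors, means))

def eq2(x, masks):
    """Equation (2): replace X by the matching subset X_l for each condition,
    i.e. marginalize out the mismatched subband features."""
    return sum(p * cond_density(x, m)[mask].prod()
               for p, m, mask in zip(priors, means, masks))

x = rng.normal(size=n)   # a test frame vector
# Hypothetical masks marking which subbands match each noise condition.
masks = [np.array([1, 1, 1, 1, 0, 0], dtype=bool),
         np.array([1, 1, 0, 0, 1, 1], dtype=bool),
         np.array([0, 0, 1, 1, 1, 1], dtype=bool)]

print(eq1(x) > 0, eq2(x, masks) > 0)   # both likelihoods are positive
# With var = 0.5 every per-feature density is below one (peak 1/sqrt(pi)),
# so dropping mismatched features can only raise each term of the sum:
print(eq2(x, masks) >= eq1(x))
```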