2. INTRODUCTION This voice recognition project consists of two major components: a speech recognition module and a motorized robot. The speech recognition module is programmable, and its software is written in Visual DSP++ (the programming environment for the ADSP 2181 architecture). The motorized robot uses two DC motors to drive it in the forward and backward directions. DEPARTMENT OF ECE
3. PROJECT DESCRIPTION Speaker recognition is carried out in two phases: (1) the Training Phase and (2) the Testing Phase.
4. Training Phase. In the Training Phase, the frequency components of the given speech signal are extracted. Each registered speaker has to provide samples of their speech (the given words) so that the system can build, or train, a reference model for that speaker.
9. FEATURES OF ADSP 2181 PROCESSOR 25 ns instruction cycle time from a 20 MHz crystal at 5.0 V; single-cycle instruction execution; multifunction instructions; low power dissipation in idle mode; 16K words of on-chip program memory RAM; 16K words of on-chip data memory RAM; independent ALU, multiplier/accumulator, and barrel shifter units; 3-bus architecture that allows dual operand fetches in every instruction cycle.
10. ALU and MAC The ALU performs a standard set of arithmetic and logic operations in addition to division primitives. The MAC performs single-cycle multiply, multiply/add and multiply/subtract operations.
11. SHIFTER The shifter performs logical and arithmetic shifts, normalization, de-normalization, and derive exponent operations. The shifter implements numeric format control including multiword floating-point representations.
12. SPEECH The input speech consists of spoken numbers such as 1, 2, 3. The frequency range of the human voice is taken as 4 kHz, so the sampling frequency is chosen as 8 kHz, twice that value. Only 2000 samples are processed per character, because each spoken character is assumed to last 0.25 s (8000 samples/s × 0.25 s = 2000 samples).
14. Block Diagram [Figure: block diagram containing the blocks: input speech via mic, CODEC, ADSP 2181, framing, windowing, FFT, mel frequency wrapping, mel spectrum, mel cepstrum, DC motor]
15. FRAMING The speech signal is blocked into frames of N samples (N = 256). Adjacent frames are separated by M samples (M = 100), so consecutive frames overlap by N - M = 156 samples. Frame 1 covers samples 0 to 255, frame 2 covers samples 100 to 355, and so on. 18 such frames are required for the 2000 samples of one character.
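The frame count on this slide can be checked with a short sketch. The function names and the placeholder signal below are illustrative assumptions, not part of the project code, which runs on the ADSP 2181.

```python
def frame_starts(total_samples, frame_len=256, frame_shift=100):
    """Return the start index of every full frame that fits in the signal."""
    if total_samples < frame_len:
        return []
    count = (total_samples - frame_len) // frame_shift + 1
    return [i * frame_shift for i in range(count)]


def split_into_frames(samples, frame_len=256, frame_shift=100):
    """Cut the sample list into overlapping frames of frame_len samples."""
    starts = frame_starts(len(samples), frame_len, frame_shift)
    return [samples[s:s + frame_len] for s in starts]


signal = list(range(2000))           # placeholder data, not real speech
frames = split_into_frames(signal)
print(len(frames))                   # 18
print(frames[1][0], frames[1][-1])   # 100 355
```

Only full frames are kept in this sketch, so the last frame starts at sample 1700 and the final 44 samples are not used.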
17. Windowing Windowing minimizes the signal discontinuities at the beginning and end of each frame and thereby reduces spectral distortion. The windowed signal is obtained as y1(n) = x1(n) · w(n), 0 <= n <= N-1, where w(n) is the Hamming window, given by w(n) = 0.54 - 0.46 cos(2πn/(N-1)), 0 <= n <= N-1.
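A minimal floating-point sketch of the Hamming window applied to one frame follows. The 1 kHz test tone and the function names are assumptions for illustration; the project itself works with 16-bit fixed-point values on the DSP.

```python
import math


def hamming_window(n_points=256):
    """w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)) for n = 0 .. N-1."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def apply_window(frame, window):
    """Multiply the frame by the window, sample by sample."""
    return [x * w for x, w in zip(frame, window)]


window = hamming_window(256)
# Hypothetical test frame: 1 kHz sine sampled at 8 kHz.
frame = [math.sin(2.0 * math.pi * 1000.0 * n / 8000.0) for n in range(256)]
windowed = apply_window(frame, window)
print(round(window[0], 2), round(window[-1], 2))  # 0.08 0.08
```

The window tapers to 0.08 at both ends of the frame, which is what suppresses the discontinuity at the frame boundaries.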
19. Result of Windowing This process outputs 256 values per frame, which are given as input to the FFT. Some windowed values for a 1 kHz input are shown: 0x0000 0x0826 0x0BE6 0x08B7 0x000F 0xF6C7 0xF26C 0xF5FC 0xFFE8 0x0AA9 0x0FC7
20. Fast Fourier Transform The FFT converts the time-domain signal into a frequency-domain signal. The power spectrum is obtained from the real and imaginary parts of the frequency-domain signal X(k) as P(k) = Re(X(k))^2 + Im(X(k))^2.
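The step from a frame to its power spectrum can be sketched with a plain recursive radix-2 FFT. This is an illustrative floating-point version under the assumption of a 256-sample frame and a 1 kHz test tone; it is not the DSP routine used in the project.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectrum(x):
    """Re(X(k))^2 + Im(X(k))^2 for each frequency bin k."""
    return [c.real ** 2 + c.imag ** 2 for c in fft(x)]


# Hypothetical test frame: 1 kHz sine sampled at 8 kHz, 256 samples.
frame = [math.sin(2.0 * math.pi * 1000.0 * n / 8000.0) for n in range(256)]
spectrum = power_spectrum(frame)
peak_bin = max(range(128), key=lambda k: spectrum[k])
print(peak_bin, peak_bin * 8000.0 / 256)  # 32 1000.0
```

With 256 points at 8 kHz each bin is 31.25 Hz wide, so the 1 kHz tone falls exactly on bin 32.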
21. Wrapping A subjective pitch for each frequency is computed using the mel scale. The mel frequency scale is given by mel(f) = 2595 · log10(1 + f/700), where f is the frequency in Hz.
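The mel formula on this slide can be evaluated directly. The inverse helper `mel_to_hz` is an addition for illustration, obtained by solving the same formula for f; both function names are assumptions.

```python
import math


def hz_to_mel(f_hz):
    """mel(f) = 2595 * log10(1 + f / 700), with f in Hz."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel, obtained by solving the formula for f."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


for f in (0.0, 500.0, 1000.0, 2000.0, 4000.0):
    print(f, round(hz_to_mel(f), 1))
```

By construction of the scale, 1000 Hz maps to roughly 1000 mel, and frequencies above that are compressed progressively.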
22. Mel Frequency Coefficients
23. MFCC MFCC stands for Mel Frequency Cepstrum Coefficients, a set of coefficients that describe the frequency content of the speech signal. The computation involves two parts: the mel spectrum (frequency domain) and the mel cepstrum (time domain).
24. SPECTRUM The samples are convolved with the mel filter bank to obtain the mel frequency spectrum, which is given by s(n) = y(n) * f(n) (here * denotes convolution), where s(n) is the mel frequency spectrum, y(n) are the samples, and f(n) are the filter coefficients.
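Reading the relation s(n) = y(n) * f(n) on this slide as a discrete convolution, the operation can be sketched as below. The sample and coefficient values are made-up placeholders, not real mel filter coefficients, and the function name is an assumption.

```python
def convolve(y, f):
    """Full discrete convolution: s[n] = sum over m of y[m] * f[n - m]."""
    s = [0.0] * (len(y) + len(f) - 1)
    for i, y_val in enumerate(y):
        for j, f_val in enumerate(f):
            s[i + j] += y_val * f_val
    return s


y = [1.0, 2.0, 3.0]    # placeholder samples
f = [1.0, 1.0]         # placeholder filter coefficients
print(convolve(y, f))  # [1.0, 3.0, 5.0, 3.0]
```

Many MFCC implementations instead weight the FFT power spectrum with triangular filters spaced on the mel scale; the slide does not say which form the project's filter bank takes.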
25. Inverse Discrete Cosine Transformation The mel frequency power spectrum is a frequency-domain function. To obtain a time-domain function, it undergoes the inverse discrete cosine transform (IDCT), which converts the mel frequency spectrum into the mel frequency cepstrum.
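26. CEPSTRUM The log mel spectrum coefficients are real numbers and are converted to the time domain using the IDCT. The resulting time-domain coefficients are called mel frequency cepstrum coefficients (MFCC). The MFCC are given by c(n) = Σ_{k=1..K} log(S_k) · cos(n (k - 0.5) π / K), where S_k is the k-th mel spectrum coefficient and K is the number of mel spectrum coefficients.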
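The cepstrum sum can be sketched as follows, assuming the standard form c(n) = sum over k = 1..K of log(S_k) · cos(n (k - 0.5) π / K), the natural logarithm, and K equal to the number of mel spectrum values. The function name and the input values are illustrative assumptions.

```python
import math


def mel_cepstrum(mel_spectrum, num_coeffs):
    """c(n) = sum_{k=1..K} log(S_k) * cos(n * (k - 0.5) * pi / K), n = 1..num_coeffs."""
    K = len(mel_spectrum)
    log_s = [math.log(s) for s in mel_spectrum]  # every S_k must be > 0
    return [sum(log_s[k - 1] * math.cos(n * (k - 0.5) * math.pi / K)
                for k in range(1, K + 1))
            for n in range(1, num_coeffs + 1)]


# Made-up mel spectrum values, for illustration only.
example_spectrum = [4.0, 9.0, 16.0, 9.0, 4.0, 2.0, 1.5, 1.2]
coeffs = mel_cepstrum(example_spectrum, 4)
print([round(c, 3) for c in coeffs])
```

A perfectly flat mel spectrum gives coefficients of zero for n >= 1, since the cosine terms cancel; only the shape of the spectrum contributes.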
27. LEAST MEAN SQUARE ALGORITHM (LMS) This algorithm is used to find the minimum deviation between the input values and the stored reference values. During the testing phase, the input speech is compared with the 4 stored values, and the one with the least deviation is sent.
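The comparison step can be sketched as a nearest-reference search, assuming that "deviation" means the mean squared difference between the input feature vector and each stored one. The labels, vectors, and function names below are made up for illustration and are not the project's trained values.

```python
def mean_squared_deviation(a, b):
    """Mean of the squared differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def best_match(features, references):
    """Return the label of the stored reference with the least deviation."""
    return min(references,
               key=lambda label: mean_squared_deviation(features, references[label]))


# Four hypothetical stored reference vectors, one per trained word.
references = {
    "1": [1.0, 0.0, 0.0],
    "2": [0.0, 1.0, 0.0],
    "3": [0.0, 0.0, 1.0],
    "4": [1.0, 1.0, 1.0],
}
test_features = [0.1, 0.2, 0.9]   # made-up input from the testing phase
print(best_match(test_features, references))  # 3
```

This is a template comparison under the stated assumption; it is not the adaptive-filter algorithm that also goes by the name LMS.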
28. INTERFACING PC WITH KIT [Figure: the PC and the DSP processor kit connected by an RS-232 serial cable]
34. It causes a current flow in the DC Motor.
35. Details of DC motor Speed: 300 rpm; Current: 750 mA; Voltage: 7.5 V
36. Advantages The robot is controlled by speech; processing time is short; it is easy to use and efficient; it is useful for physically disabled people; cost is low; maintenance is easy.
37. Limitations A frequency mismatch may affect compatibility with the hardware. The system must be trained on each speaker's voice before that speaker can be tested.
38. APPLICATIONS A device friendly to physically and visually impaired users, since only the user's speech is required. In acute situations such as system crashes, this method can be used in an emergency.
39. CONCLUSION and FUTURE MODIFICATIONS Speech recognition is still an active research area. Speech recognition enables communication between human and machine. This project recognizes the given speech signal and displays the recognized word on the PC.