Speech and Language Processing

SUBMITTED TO: MR. ABHISHEK SRIVASTAVA
SUBMITTED BY: VIKALP MAHENDRA (EC-11)

 Introduction
 Block Diagram
 Linguistic Levels Of Analysis
 Phonetics
 Organs Of Speech And Articulation
 Acoustic Model
 Circuit Diagram
 Components Used
 Features Of HM2007
 Working
 Extracting Phonemes In Frequency Domain
 Markov Model
 Advantages
 Applications
 Conclusion

 Analyses sound and converts spoken words into text
 Uses knowledge of spoken English
 Programs are available for voice recognition
 Systems work best on Windows XP & Windows Vista

[Diagram labels: Computers, Databases, Algorithms, Robotics, Natural Language Processing, Search, Information Retrieval, Machine Translation, Language Analysis, Semantics]

 Speech
 Written language
 Phonology: sounds / letters / pronunciation
 Morphology: the structure of words
 Syntax: how these sequences are structured
 Semantics: meaning of the strings

 Phonetics: the study of the way humans make, transmit, and receive sounds
 Phonology: the study of the sound systems of languages
 A typical word such as moon can be broken down into three phonemes: m, ue, n.
 Phonemes represent all the vowels and consonants of spoken speech

 Most vowel sounds are modified by the shape of
the lips (rounded / spread / neutral)
 Sounds are made by vibrating the vocal cords
(voicing)
 Vowels can be :-
 Single sounds – Monophthongs or pure vowels
 Double sounds - Diphthongs
 Triple sounds - Triphthongs
 Pure vowels usually come in pairs consisting of
long and short sounds

Long sound: found in the word tea. The lips are spread.
Short sound: found in the word hip. The lips are slightly spread.
The tongue tip is raised slightly at the front towards the alveolar ridge. In the longer sound the tongue is raised higher.

 This sound is made by relaxing the mouth and
keeping your lips in a neutral position and making
a short sound. It is found in words like paper,
over, about, and common in weak verbs in spoken
English.

The long sound – you, too & blue
The short sound – good, would & wool
The lips are rounded and the centre and back of the tongue are raised towards the soft palate. For the longer sound the tongue is raised higher and the lips are more rounded.
This sound is made with the mouth spread wide open. It is found in cat, man, apple & ran.

 Here we have three sounds: the vowel sounds from 1) for 2) tour 3) go
 Diphthongs are combinations of two sounds.
 Triphthongs are combinations of three sounds (a diphthong + a schwa sound). English has 1 triphthong.

Diphthongs are combinations of pure vowels.
• a: + I = ‘aI’ - tie, buy, height & night
• e + I = ‘eI’ - way, paid & gate
• o: + I = ‘oI’ - boy, coin & coy
• e + ə = ‘eə’ - where, hair & care
• I + ə = ‘Iə’ - here, hear & beer

 An acoustic model is built from audio recordings of speech to create a statistical representation of sounds.
 To create a speech recognition engine, a large database of models is created to match each phoneme.
 These database models hold the stored phonemes.
 The language model holds the grammar of the sentence, used to decode our spoken words into text.

 HM 2007 IC
 SRAM 8K*8
 LATCH 74LS373
 INPUT BUFFER 7448
 XTAL 3.57MHz
 PCB
 KEYPAD
 PC MOUNTED SWITCHES
 7 SEGMENT DISPLAY
 MICROPHONE
 22K RESISTOR
 100K RESISTOR
 0.0047 µF CAPACITOR

 A single-chip voice recognition system in a 48-pin package
 Manufactured by Hualon
 Recognises a maximum of 40 words, with a maximum word length of 1.92 s
 Microphone support
 5V power supply

 How does a computer convert spoken speech into data?
 When we speak, a microphone picks up the analog signal of our voice, which is converted into digital chunks of data that the computer analyzes.
 From this data the computer extracts enough information to confidently guess the word being spoken.
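
The conversion from an analog signal to digital chunks can be sketched roughly as below. This is only an illustration, not the HM2007's internal method: the 8 kHz rate, 8-bit depth, 20 ms chunk size and all function names are assumptions made for the example, and a sine tone stands in for the microphone signal.

```python
import math

def sample_tone(freq_hz, duration_s, rate_hz=8000):
    """Sample a sine tone, standing in for the microphone's analog signal."""
    n = int(duration_s * rate_hz)
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

def quantize(samples, bits=8):
    """Map samples in [-1.0, 1.0] to signed integers of the given bit depth."""
    top = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return [round(max(-1.0, min(1.0, s)) * top) for s in samples]

def split_frames(samples, frame_len):
    """Cut the sample stream into consecutive fixed-size chunks (frames)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

signal = quantize(sample_tone(440, 0.1))      # 0.1 s at 8 kHz gives 800 samples
frames = split_frames(signal, frame_len=160)  # 160 samples = 20 ms per chunk
```

Each chunk in `frames` is what a later analysis stage would work on, one at a time.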

 The goal is to extract phonemes
 Phonemes are linguistic units
 They are the sounds that group together to form words
 How a phoneme is realised as sound depends on many factors

 aa - father
 ae - cat
 ah - cut
 ao - dog
 aw - foul
 ng - sing
 t - talk
 th - thin
 uh - book
[Figure: waveform showing the frequency characteristics of phonemes]

 Phonemes are extracted by running the waveform through a Fourier transform
 They are easily visible in the frequency domain
 They can be made out by looking at a spectrogram
 A spectrogram is a 3-D plot of the waveform's frequency and amplitude versus time, with amplitude shown as shades of grey
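
A spectrogram of this kind can be sketched with a plain discrete Fourier transform applied to one chunk of samples at a time. The code below is a minimal illustration using only the standard library; the sampling rate, frame length, test tones and function names are assumptions for the example, and a real system would use a fast Fourier transform with overlapping, windowed frames.

```python
import cmath
import math

RATE = 8000   # assumed sampling rate in Hz
FRAME = 64    # assumed frame length in samples (8 ms at 8 kHz)

def tone(freq_hz, n):
    """Generate n samples of a sine tone at the assumed sampling rate."""
    return [math.sin(2 * math.pi * freq_hz * t / RATE) for t in range(n)]

def dft_magnitudes(frame):
    """Magnitude of the discrete Fourier transform for bins 0..N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total))
    return mags

def spectrogram(samples, frame_len=FRAME):
    """One magnitude spectrum per consecutive frame (rows = time, columns = frequency)."""
    return [dft_magnitudes(samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

# Synthetic input: a 1000 Hz tone followed by a 2000 Hz tone
samples = tone(1000, 128) + tone(2000, 128)
spec = spectrogram(samples)

# Strongest frequency in each frame, converted from bin index to Hz
peaks_hz = [max(range(len(m)), key=m.__getitem__) * RATE / FRAME for m in spec]
```

Here `peaks_hz` traces how the dominant frequency changes over time, which is the information a grey-scale spectrogram shows visually.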

 The computer generates a list of phonemes
 These phonemes have to be converted into words and then sentences, so a Markov model is used
 It compares the observed phonemes with the stored phonemes

 In this example, the word tomato is modelled with both its British English and American English pronunciations
 The same idea is extended up to the level of sentences, which improves recognition
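
A word model with two pronunciations can be sketched as a left-to-right chain in which one position branches. This is a simplified illustration: the phoneme labels (including `ey`), the single branch point and the 0.5 probabilities are assumptions for the example, not values taken from the slide.

```python
# Each position lists the phonemes allowed there with a transition probability.
# Hypothetical values: the two pronunciations are treated as equally likely.
TOMATO_MODEL = [
    {"t": 1.0},
    {"ow": 1.0},
    {"m": 1.0},
    {"ey": 0.5, "aa": 0.5},  # "to-MAY-to" versus "to-MAH-to"
    {"t": 1.0},
    {"ow": 1.0},
]

def sequence_probability(phonemes, model=TOMATO_MODEL):
    """Probability that the word model produces this exact phoneme sequence."""
    if len(phonemes) != len(model):
        return 0.0
    prob = 1.0
    for phoneme, choices in zip(phonemes, model):
        prob *= choices.get(phoneme, 0.0)  # 0.0 if the phoneme is not allowed here
    return prob

may_to = sequence_probability(["t", "ow", "m", "ey", "t", "ow"])
mah_to = sequence_probability(["t", "ow", "m", "aa", "t", "ow"])
other = sequence_probability(["p", "ow", "t", "ey", "t", "ow"])
```

Both pronunciations score above zero against the same word model, while an unrelated sequence scores zero.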

 It is used to translate different forms of language
 It is used in telephones
 The standard landline telephone has a bit rate of 64 kb/s
 Sampling rate of 8 kHz
 In a standard desktop PC, the limiting factor is the sound card, which can record at sampling rates between 16 kHz and 48 kHz
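
The 64 kb/s and 8 kHz figures are related by simple arithmetic, sketched below. The 8 bits per sample is an assumption (it is the value that makes the two telephone figures consistent), and the function names are made up for the example.

```python
def bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def nyquist_limit(sample_rate_hz):
    """Highest frequency (Hz) that a given sampling rate can represent."""
    return sample_rate_hz / 2

telephone_bps = bit_rate(8000, 8)        # 8 kHz x 8 bits (assumed) = 64 kb/s
telephone_max_hz = nyquist_limit(8000)   # upper frequency limit at 8 kHz
pc_max_hz = nyquist_limit(16000)         # upper frequency limit at 16 kHz
```

A higher sampling rate on the PC sound card therefore captures a wider range of speech frequencies than the telephone line does.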

 MILITARY
 HELICOPTERS
 IN MOBILE SMARTPHONES
 SPEECH-CONTROLLED APPLIANCES
 VOICE RECOGNITION SECURITY

 Speech recognition is one of the latest technologies.
 It reduces costs such as that of training
 Steps:
 Fourier transform of the signal
 Extraction of phonemes
 Formation of words on the basis of Markov models
 Charm of simplicity
 With the advent of this technology, we will hopefully see a new era of human-computer interaction.

 From: Chapter 1 of An Introduction to Natural Language
Processing, Computational Linguistics, and Speech
Recognition, by Daniel Jurafsky and James H. Martin
 http://en.wikipedia.org/wiki/acoustic model
 http://en.wikipedia.org/wiki/speech recognition
 www.wikpedia.org
 www.slideshare.net
 Natural Language Processing by Rada Mihalcea
 www.youtube.com
