Speech Recognition




                     1
Introduction
•   What is Speech Recognition?
    - Voice Recognition?
•   Where can it be used?
    - Dictation
    - System control/navigation
    - Commercial/Industrial applications
    - Hand held digital recorders
                                   2
Contents:
•   Continuous/Discrete
•   How does it work?
•   Recent improvements
•   Current software options
•   Future of SR



                          3
Continuous or Discrete?
    • Continuous speech
       - dictation
    • Discrete speech
       - system controls




                           4
How does SR work?
  •   Recognition
  •   Training
  •   Correction
  •   Command/Control




                        5
Recognition (1)
[Flow diagram: Voice Input, Analog to Digital, Acoustic Model, Language Model, Speech Engine, Display, Feedback]



                                           6
Recognition (2)
Acoustic Modeling
• Spoken words: “I think there are…”
• Phonemes: ‘ay th-in-nk-kd dh-eh-r aa-r’
• HMMs (hidden Markov models): 5-state representation
• Speech Engine
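
The 5-state idea can be sketched as follows, assuming a left-to-right HMM in which each state either repeats or advances to the next one. The transition and emission probabilities, the three-symbol acoustic alphabet, and the function name `viterbi` are all illustrative assumptions, not values taken from any of the products discussed in these slides.

```python
import math

# Hypothetical 5-state left-to-right HMM for a single phoneme.
# Each state either stays put (self-loop) or advances to the next state.
N_STATES = 5
STAY, ADVANCE = 0.6, 0.4  # assumed transition probabilities

# Assumed emission probabilities over 3 discrete acoustic symbols (0, 1, 2).
EMIT = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.3, 0.6],
    [0.1, 0.1, 0.8],
]


def viterbi(observations):
    """Return (log_prob, path) of the best path from state 0 to the last state.

    Needs at least N_STATES observations for the last state to be reachable.
    """
    neg_inf = float("-inf")
    # score[s] = best log-probability of any path ending in state s so far
    score = [neg_inf] * N_STATES
    score[0] = math.log(EMIT[0][observations[0]])
    back = []  # one list of predecessor states per time step after the first
    for obs in observations[1:]:
        new_score = [neg_inf] * N_STATES
        pointers = [0] * N_STATES
        for s in range(N_STATES):
            stay = score[s] + math.log(STAY)
            move = score[s - 1] + math.log(ADVANCE) if s > 0 else neg_inf
            if stay >= move:
                best, pointers[s] = stay, s
            else:
                best, pointers[s] = move, s - 1
            if best > neg_inf:
                new_score[s] = best + math.log(EMIT[s][obs])
        score = new_score
        back.append(pointers)
    # Trace back from the final state to recover the state sequence.
    state = N_STATES - 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return score[-1], path


log_prob, path = viterbi([0, 0, 1, 2, 2])
```

A speech engine would score the same stretch of audio against the models of many phonemes and keep the best-scoring ones; this sketch scores only one model.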


                               7
Recognition (3)
Language Modeling
• Word context
• Word frequency
• Transition possibilities
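
One simple way to picture word frequency and transition possibilities is a bigram model, sketched below. The tiny corpus and the helper names (`train_bigrams`, `transition_prob`, `pick_candidate`) are assumptions made up for illustration; commercial engines use far larger vocabularies and smoothed estimates.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count word frequencies and word-to-word transitions."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        unigrams.update(words)
        for prev, word in zip(words, words[1:]):
            bigrams[prev][word] += 1
    return unigrams, bigrams


def transition_prob(bigrams, prev, word):
    """Estimate P(word | prev) by relative frequency (no smoothing)."""
    total = sum(bigrams[prev].values())
    return bigrams[prev][word] / total if total else 0.0


def pick_candidate(bigrams, prev, candidates):
    """Among similar-sounding candidates, pick the one the context favors."""
    return max(candidates, key=lambda w: transition_prob(bigrams, prev, w))


# Hypothetical training text.
corpus = [
    "I think there are two",
    "there are many",
    "their house is big",
    "I think their dog barks",
]
unigrams, bigrams = train_bigrams(corpus)

# "there" and "their" sound alike; the following word helps decide.
best = pick_candidate(bigrams, "their", ["house", "are"])
```

Here the acoustic model alone cannot separate "there" from "their", so the transition counts do the work.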




                         8
Voice Training (1)
Can be done by:
• Predetermined text segments
• Individual words
Compares new acoustic data with the old and combines them
• More training = better recognition
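
The slides do not say how the new and old acoustic data are combined. As one possible illustration, the sketch below assumes the voice file stores an average feature vector plus the number of samples behind it, and merges new training samples by a count-weighted average. The function name `adapt_mean` and the two-dimensional features are hypothetical.

```python
def adapt_mean(old_mean, old_count, new_samples):
    """Combine a stored feature mean with new training samples.

    Returns (updated_mean, updated_count). The more new samples there are,
    the further the stored mean moves toward the user's own voice.
    """
    n_new = len(new_samples)
    if n_new == 0:
        return list(old_mean), old_count
    dims = len(old_mean)
    # Average of the new samples, one value per feature dimension.
    new_mean = [sum(s[d] for s in new_samples) / n_new for d in range(dims)]
    total = old_count + n_new
    # Count-weighted average of the old and new means.
    updated = [
        (old_count * old_mean[d] + n_new * new_mean[d]) / total
        for d in range(dims)
    ]
    return updated, total


mean, count = adapt_mean([1.0, 2.0], 2, [[3.0, 4.0], [3.0, 4.0]])
```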



                                9
Voice Training (2)
User specific Voice file
• Voice qualities
• Pronunciation
• Patterns of word use
• Preferred vocabulary



                           10
Making Corrections
•   Move cursor by voice command
•   Memorize edit commands
•   List of possible alternatives
•   Make correction manually




                            11
Command/Control
•   Desktop grid
•   Program or Link name/number
•   URL name
•   Memorized commands




                          12
Recent Improvements in SR
  •   Faster training ~10 min.
  •   Better recognition ~95%
  •   More compatible software
  •   Better system control/command
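
The slides do not state how the ~95% figure was measured. One common convention, assumed here, is word accuracy: one minus the number of word-level edits (substitutions, insertions, deletions) divided by the number of words actually spoken. The function names below are illustrative.

```python
def word_errors(reference, hypothesis):
    """Minimum word substitutions, insertions and deletions between texts."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the processed ref prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # word missing from the hypothesis
                cur[j - 1] + 1,          # extra word in the hypothesis
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - errors / number of reference words."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / n_words


accuracy = word_accuracy("I think there are two", "I think their are two")
```

Under this convention, 95% accuracy still means about one wrong word in every twenty, which is why correction tools remain part of the workflow.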




                              13
Current Software Options for PC
•   Dragon Systems – NaturallySpeaking
•   Philips – FreeSpeech
•   IBM – ViaVoice
•   Lernout & Hauspie – Voice Xpress




                                  14
How well do they work?
           Training    Dictation    App.          Command/
                       Correct.     Integration   Control
Dragon     Excellent   Excellent    Good          Good
Philips    Fair        Fair         Good          Good
IBM        Excellent   Good         Good          Excellent
L&H        Good        Good         Good          Good

                                       15
Future of SR
• SUI – Speech-based User Interface
• Improvements needed:
  - Greater accuracy
  - Greater system control/command
  - More compatible software



                                 16
Conclusion
•   SR Uses
•   How does it work?
•   Current Software
•   Problems of SR
•   More SR coming soon….



                        17
References
• 1. Alwang, Greg. “Speech Recognition,” PC Magazine, December 1, 1999.
• 2. Hauptmann, Alexander G., and Jang, Photina Jaeyun (Carnegie Mellon
  University). “Learning to Recognize Speech by Watching Television,”
  IEEE Intelligent Systems, September/October 1999.
• 3. Miastkowski, Stan. “Latest Speech Software Gets You Up and
  Running Faster,” PC World, November 1999.




                                                     18
