SlideShare uma empresa Scribd logo
1 de 40
JavaScript
Speech
Recognition
Applicationsand maybe some other HTML5 goodness
Who is this guy?
core contributor
nutty about
IBM'er
@macdonst
macdonst on Github
simonmacdonald.com
PhoneGap
speech recognition
Why do I care about speech rec?
Here's a conversation between
two Cape Bretoners
P1: jeet?
P2: naw, jew?
P1: naw, t'rly t'eet bye.
And here's the translation
P1: jeet?
P1: Did you eat?
P2: naw, jew?
P2: No, did you?
P1: naw, t'rly t'eet bye.
P1: No, it's too early to eat buddy.
What is
speech
recognition?
Speech
recognition is
the process of
translating
the spoken
word into text.
The process of speech
rec includes
1. Record and digitize the audio data
2. Split data into phonemes
3. Apply the phonemes to the recognition model
4. Analyze the results against the grammar
5. Return a confidence weighted result
Basically...
So how do we
add speech
rec to our
app?
You may look at the W3C
Speech API Specification
but only Chrome on the
desktop has
implemented that spec
But that's okay!
The spec looks like this:
interface SpeechRecognition : EventTarget {
// recognition parameters
attribute SpeechGrammarList grammars;
attribute DOMString lang;
attribute boolean continuous;
attribute boolean interimResults;
attribute unsigned long maxAlternatives;
attribute DOMString serviceURI;
// methods to drive the speech interaction
void start();
void stop();
void abort();
};
With additional event
methods to control
behaviour:
attribute EventHandler onaudiostart;
attribute EventHandler onsoundstart;
attribute EventHandler onspeechstart;
attribute EventHandler onspeechend;
attribute EventHandler onsoundend;
attribute EventHandler onaudioend;
attribute EventHandler onresult;
attribute EventHandler onnomatch;
attribute EventHandler onerror;
attribute EventHandler onstart;
attribute EventHandler onend;
Let's recognize some
speech
Click to Speak
hello world
var recognition = new SpeechRecognition();
recognition.onresult = function(event) {
if (event.results.length > 0) {
var test1 = document.getElementById("test1");
test1.innerHTML = event.results[0][0].transcript;
}
};
recognition.start();
So that's
pretty cool...
...if taking
dictation gets
you going
But I want to
do something
more exciting
with the
result
Let's do something a little
less trivial
Click to Speak
recognition.onresult = function(event) {
var result = event.results[0][0].transcript;
var music = document.getElementById("music");
switch(result) {
case "jazz":
music.src="jazz.mp3";
music.play();
break;
case "rock":
music.src="rock.mp3";
music.play();
break;
case "stop":
default:
music.pause();
}
};
Which seems
much cooler
to me
Let's ask the web a
question
Click to Speak
Q: what day is it today
A: Friday July 19th, 2013
Works pretty
good...
...but ugly!
Let's style our
button with
some CSS
+
=
<a class="speechinput">
<img src="images/mic.png">
</a>
#speechinput input {
cursor:pointer;
margin:auto;
margin:15px;
color:transparent;
background-color:transparent;
border:5px;
width:15px;
-webkit-transform: scale(3.0, 3.0);
}
And we'll add some color
using
by Nicholas Gallagher
Speech
Bubbles
Pure-CSS-Speech-Bubbles
Then pull it all
together!
what is steve jobs middle name
Steven Paul Jobs
But wait, why
am I using my
eyes like a
sucker
We'll output the answer
using SpeechSynthesis
The SpeechSynthesis
spec looks like this:
interface SpeechSynthesis {
readonly attribute boolean pending;
readonly attribute boolean speaking;
readonly attribute boolean paused;
void speak(SpeechSynthesisUtterance utterance);
void cancel();
void pause();
void resume();
SpeechSynthesisVoiceList getVoices();
};
The
SpeechSynthesisUtterance
spec looks like this:
interface SpeechSynthesisUtterance : EventTarget {
attribute DOMString text;
attribute DOMString lang;
attribute DOMString voiceURI;
attribute float volume;
attribute float rate;
attribute float pitch;
};
With additional event
methods to control
behaviour:
attribute EventHandler onstart;
attribute EventHandler onend;
attribute EventHandler onerror;
attribute EventHandler onpause;
attribute EventHandler onresume;
attribute EventHandler onmark;
attribute EventHandler onboundary;
who won the stanley cup this year
Chicago Blackhawks
Plugin repo's
SpeechRecognitionPlugin -
https://github.com/macdonst/SpeechRecognitionPlugin
SpeechSynthesisPlugin -
https://github.com/macdonst/SpeechSynthesisPlugin
THE END

Mais conteúdo relacionado

Semelhante a JavaScript Speech Recognition and Speech Synthesis Applications

JavaScript Speech Recognition
JavaScript Speech RecognitionJavaScript Speech Recognition
JavaScript Speech RecognitionFITC
 
Voicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.comVoicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.comVoxeo Corp
 
Device Emulation with OSGi and Flash
Device Emulation with OSGi and FlashDevice Emulation with OSGi and Flash
Device Emulation with OSGi and Flashgeorgemesesan
 
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...Rachel Evans
 
NDC 2011 - The FLUID Principles
NDC 2011 - The FLUID PrinciplesNDC 2011 - The FLUID Principles
NDC 2011 - The FLUID Principlesanoras
 
通往測試最高殿堂的旅程 - GTAC 2016
通往測試最高殿堂的旅程 - GTAC 2016通往測試最高殿堂的旅程 - GTAC 2016
通往測試最高殿堂的旅程 - GTAC 2016Chloe Chen
 
Prophet - Beijing Perl Workshop
Prophet - Beijing Perl WorkshopProphet - Beijing Perl Workshop
Prophet - Beijing Perl WorkshopJesse Vincent
 
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schillerscottschiller
 
Passing The Joel Test In The PHP World
Passing The Joel Test In The PHP WorldPassing The Joel Test In The PHP World
Passing The Joel Test In The PHP WorldLorna Mitchell
 
Spring, CDI, Jakarta EE good parts
Spring, CDI, Jakarta EE good partsSpring, CDI, Jakarta EE good parts
Spring, CDI, Jakarta EE good partsJarek Ratajski
 
Design and Evolution of cyber-dojo
Design and Evolution of cyber-dojoDesign and Evolution of cyber-dojo
Design and Evolution of cyber-dojoJon Jagger
 
London Web Performance Meetup: Performance for mortal companies
London Web Performance Meetup: Performance for mortal companiesLondon Web Performance Meetup: Performance for mortal companies
London Web Performance Meetup: Performance for mortal companiesStrangeloop
 
Programming For Designers V3
Programming For Designers V3Programming For Designers V3
Programming For Designers V3sqoo
 
DevDay.lk - Bare Knuckle Web Development
DevDay.lk - Bare Knuckle Web DevelopmentDevDay.lk - Bare Knuckle Web Development
DevDay.lk - Bare Knuckle Web DevelopmentJohannes Brodwall
 
Hey man, can I get a clue?
Hey man, can I get a clue?Hey man, can I get a clue?
Hey man, can I get a clue?Voxeo Corp
 

Semelhante a JavaScript Speech Recognition and Speech Synthesis Applications (20)

JavaScript Speech Recognition
JavaScript Speech RecognitionJavaScript Speech Recognition
JavaScript Speech Recognition
 
Voicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.comVoicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.com
 
Device Emulation with OSGi and Flash
Device Emulation with OSGi and FlashDevice Emulation with OSGi and Flash
Device Emulation with OSGi and Flash
 
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
 
NDC 2011 - The FLUID Principles
NDC 2011 - The FLUID PrinciplesNDC 2011 - The FLUID Principles
NDC 2011 - The FLUID Principles
 
通往測試最高殿堂的旅程 - GTAC 2016
通往測試最高殿堂的旅程 - GTAC 2016通往測試最高殿堂的旅程 - GTAC 2016
通往測試最高殿堂的旅程 - GTAC 2016
 
What lies beneath
What lies beneathWhat lies beneath
What lies beneath
 
Prophet - Beijing Perl Workshop
Prophet - Beijing Perl WorkshopProphet - Beijing Perl Workshop
Prophet - Beijing Perl Workshop
 
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
 
Passing The Joel Test In The PHP World
Passing The Joel Test In The PHP WorldPassing The Joel Test In The PHP World
Passing The Joel Test In The PHP World
 
mri-bp2015
mri-bp2015mri-bp2015
mri-bp2015
 
Spring, CDI, Jakarta EE good parts
Spring, CDI, Jakarta EE good partsSpring, CDI, Jakarta EE good parts
Spring, CDI, Jakarta EE good parts
 
Design and Evolution of cyber-dojo
Design and Evolution of cyber-dojoDesign and Evolution of cyber-dojo
Design and Evolution of cyber-dojo
 
London Web Performance Meetup: Performance for mortal companies
London Web Performance Meetup: Performance for mortal companiesLondon Web Performance Meetup: Performance for mortal companies
London Web Performance Meetup: Performance for mortal companies
 
Programming For Designers V3
Programming For Designers V3Programming For Designers V3
Programming For Designers V3
 
Otto AI
Otto AIOtto AI
Otto AI
 
DevDay.lk - Bare Knuckle Web Development
DevDay.lk - Bare Knuckle Web DevelopmentDevDay.lk - Bare Knuckle Web Development
DevDay.lk - Bare Knuckle Web Development
 
Hey man, can I get a clue?
Hey man, can I get a clue?Hey man, can I get a clue?
Hey man, can I get a clue?
 
C 2
C 2C 2
C 2
 
BSides LA/PDX
BSides LA/PDXBSides LA/PDX
BSides LA/PDX
 

Último

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Último (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

JavaScript Speech Recognition and Speech Synthesis Applications