Let’s talk about voice
Hello world!
• Started in 2017
• Over 20 years’ experience in web
content and social media
• Social media content, strategy,
advertising, training and
campaign management
• Video editing and motion
graphics
• Chatbots and voice technology
What we’ll cover
• What is voice technology?
• How does voice technology
work?
• Why should we use it?
• Who’s using it well?
• Best practice in voice design
• Practical demonstrations
What is voice technology?
• Amazon Alexa
• Google Home
• Microsoft Cortana
• Apple Siri
• Samsung Bixby
Amazon Alexa devices
Google Home devices
Microsoft Cortana devices
Apple Siri
Samsung Bixby
A little bit of history
• Alan Turing was a pioneer of
modern computing
• He devised the Turing Test in
1950
MIT AI Laboratory
• Professor Marvin Minsky set
up the research group in the
early 1960s to explore
artificial intelligence, machine
learning and natural
language processing
ELIZA
• One major project that
emerged from the MIT AI
Laboratory was ELIZA in 1964
• Essentially this was an early
chatbot where individuals had
a conversation with a
computer
• They were not told they were
talking to a machine
ELIZA meet DOCTOR
• ELIZA simulated
conversations using pattern
matching and substitution
methodology, but did not
understand the context of
words
• One of the most popular
scripts ELIZA ran was
DOCTOR, which simulated a
psychotherapist
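The pattern matching and substitution approach can be illustrated with a minimal sketch. The rules, reflections and fallback below are hypothetical examples written for this illustration; they are not taken from the original DOCTOR script.

```python
import re

# Hypothetical rules: a regex pattern plus a reply template that reuses
# whatever text the pattern captured.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Word swaps applied to the captured text (the substitution step).
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "you": "I", "your": "my"}

FALLBACK = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the reply reads naturally."""
    words = fragment.rstrip(".!?").split()
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in words)


def respond(utterance: str) -> str:
    """Fill in the template of the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACK


print(respond("I need a holiday from my job"))
```

Nothing here models meaning: the reply is assembled purely from the matched text, which is why this kind of program cannot follow context.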
ELIZA is tested
• ELIZA attempted the Turing
Test
• It failed
Back to the future
• Bringing things back up to
date, AI and natural language
processing can now
understand context
• The Turing Test has still not
been passed, but we are
getting closer
Google Duplex
• Google recently
demonstrated their Duplex
technology that links voice
technology to cloud services
such as Google Calendar
• A recurrent neural network
built with TensorFlow
Extended can process
complex interactions and
understand context
Making progress
• Duplex is said to be effective
in 80% of situations so
doesn’t yet pass the Turing
Test
• Deep Learning expert
Andrew Ng predicts that
once speech recognition is
99% accurate, voice will be
the primary way we interact
with computers
The final 4%
• Estimates suggest we are at
around 95% currently
• The final 4% is very
challenging!
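A quick piece of arithmetic shows why the last few points matter so much. This sketch assumes the percentages refer to the share of words recognised correctly, which the slides do not state; the function names are made up for the illustration.

```python
def errors_per_hundred(accuracy_percent: int) -> int:
    """Misrecognised words per 100 spoken, for a whole-number accuracy."""
    return 100 - accuracy_percent


def relative_error_reduction(current_percent: int, target_percent: int) -> float:
    """Fraction of today's errors that must disappear to reach the target."""
    now = errors_per_hundred(current_percent)
    target = errors_per_hundred(target_percent)
    return (now - target) / now


print(errors_per_hundred(95), errors_per_hundred(99))
print(relative_error_reduction(95, 99))
```

Under that assumption, moving from 95% to 99% means going from 5 errors per 100 words to 1, so four out of every five remaining errors have to be eliminated.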
Adding functionality
• Amazon Alexa and Google
Home devices can add new
functionality via Skills and
Actions
• These give the devices new
capabilities, and anyone can
build them
Powerfully simple
• It is fairly quick and simple to
create content for these
devices
• There are now over 40,000
Alexa Skills available with an
active developer community
How does voice technology work?
• Voice technology uses
Natural Language Processing
to understand and interpret
voice commands
• This is underpinned by
machine learning techniques
Voice technology in action
• Device listens for invocation
• User gives wake word
• Device returns welcome message
• User gives intent
• Device returns response
Intents
• An intent is used to trigger a
response
• For example, a Skill or Action
could ask where you want to
go on holiday: New York,
Paris or Tokyo?
• Each of these choices would
be a separate intent and
produce different responses
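The holiday example can be sketched as a simple lookup from intent name to response. The intent names and reply texts are hypothetical; on a real platform the intent name is supplied by the voice service after it has interpreted the user's speech.

```python
# Hypothetical intent names and replies for the holiday example.
RESPONSES = {
    "NewYorkIntent": "New York it is. Shall I look for flights?",
    "ParisIntent": "Paris sounds lovely. Shall I look for trains or flights?",
    "TokyoIntent": "Tokyo, great choice. Shall I look for flights?",
}

REPROMPT = "Sorry, I didn't catch that. New York, Paris or Tokyo?"


def handle_intent(intent_name: str) -> str:
    """Return the reply for a recognised intent, or ask the question again."""
    return RESPONSES.get(intent_name, REPROMPT)


print(handle_intent("ParisIntent"))
```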
Synonyms
• Intents are really powerful
and can include synonyms,
so if users have a different
name for something this can
be handled gracefully
• E.g. pavement / sidewalk
• AI is used with NLP so
phrases don’t have to be
exact
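Synonym handling boils down to mapping several spoken words onto one canonical value. The sketch below shows only that idea with a hypothetical word list; on Alexa and Dialogflow the synonyms are normally declared in the interaction model rather than in your own code.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table: canonical value -> other words users might say.
SYNONYMS: Dict[str, List[str]] = {
    "pavement": ["sidewalk", "footpath"],
    "lift": ["elevator"],
}

# Invert the table once so each spoken word points at its canonical value.
LOOKUP: Dict[str, str] = {}
for canonical, alternatives in SYNONYMS.items():
    LOOKUP[canonical] = canonical
    for alternative in alternatives:
        LOOKUP[alternative] = canonical


def resolve(word: str) -> Optional[str]:
    """Return the canonical value for a spoken word, or None if unknown."""
    return LOOKUP.get(word.strip().lower())


print(resolve("Sidewalk"))
```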
Slots
• You can also add slots to
intents, which capture
specific pieces of data in a
set order
• This is particularly useful for
retail / ecommerce
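Capturing slots in a set order can be sketched as asking for the first slot that is still empty. The slot names and prompts are hypothetical, chosen to fit a coffee-ordering scenario; the platforms offer their own dialog management for this, which this sketch does not use.

```python
from typing import Dict, Optional

# Hypothetical slots for a coffee order, in the order they should be asked.
SLOT_PROMPTS = [
    ("drink", "What would you like to drink?"),
    ("size", "What size would you like?"),
    ("milk", "Which milk would you like?"),
]


def next_prompt(filled: Dict[str, str]) -> Optional[str]:
    """Return the prompt for the first unfilled slot, or None when all are filled."""
    for slot, prompt in SLOT_PROMPTS:
        if not filled.get(slot):
            return prompt
    return None


print(next_prompt({"drink": "latte"}))
```

Once `next_prompt` returns `None`, every slot has a value and the order can be confirmed.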
Explicit and implicit invocations
• Explicit invocation: “Alexa, open Coffee Wizard”
• Implicit invocation: “Alexa, recommend a coffee for a sunny day”
Discoverability
• It’s not always appropriate to
use explicit invocations, as
they can feel mechanistic and
less conversational
• Alexa uses HypRank, a
neural network, to rank Skills
against natural language
requests
HypRank
• Alexa uses HypRank, a
neural network that uses
contextual signals to rank
Skills using natural language
HypRank overview
A few stats
• Voice technology will be a $601
million industry by 2019
(Source: Technavio)
• Over 21 million smart speakers
in the US by 2020
(Source: Activate)
• Google Assistant is now available
on over 95% of Android devices
and the majority of iOS devices
(Source: Alpine AI)
Creating Skills and Actions
• Amazon and Google provide
developer-friendly tools for
building content
• AWS with Lambda
• Dialogflow with Firebase
• Work with a variety of
languages (Node.js, Java,
Python, Go, etc.)
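As a rough illustration of the Lambda route, here is a minimal Python handler that works directly with the request and response JSON of an Alexa custom skill. It deliberately avoids the official SDK to stay dependency-free; the intent name `RecommendCoffeeIntent` and the reply texts are hypothetical, and a production skill would more likely use the ASK SDK.

```python
def build_response(text: str, end_session: bool) -> dict:
    """Wrap plain text in the response envelope an Alexa custom skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event: dict, context: object) -> dict:
    """Entry point invoked by Lambda with the skill request as `event`."""
    request = event.get("request", {})
    request_type = request.get("type")

    # The user opened the skill without asking for anything specific.
    if request_type == "LaunchRequest":
        return build_response("Welcome to Coffee Wizard. What would you like?", False)

    # The user said something that the voice service mapped to an intent.
    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == "RecommendCoffeeIntent":  # hypothetical intent name
            return build_response("On a sunny day, try an iced latte.", True)

    return build_response("Sorry, I didn't understand that.", True)


launch = {"request": {"type": "LaunchRequest"}}
print(lambda_handler(launch, None)["response"]["outputSpeech"]["text"])
```

The launch branch corresponds to the welcome message and the intent branch to the response in the interaction flow described earlier.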
Using SSML
• SSML (Speech Synthesis
Markup Language) can be
used to control the
pronunciation, speed and
pitch of phrases
• For example, you can make
Alexa pause, whisper or
place emphasis on specific
words
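The pause, whisper and emphasis effects can be sketched with a few helper functions that build an SSML string. The helper names are made up for this example; the tags themselves (`break`, `amazon:effect name="whispered"` and `emphasis`) are ones Alexa supports, though support varies between platforms.

```python
from xml.sax.saxutils import escape


def say(text: str) -> str:
    """Plain speech, escaped so characters like & do not break the markup."""
    return escape(text)


def pause(milliseconds: int) -> str:
    """A silent break of the given length."""
    return f'<break time="{milliseconds}ms"/>'


def whisper(text: str) -> str:
    """Whispered speech (an Alexa-specific effect)."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


def emphasise(text: str, level: str = "strong") -> str:
    """Emphasised speech; level is typically strong, moderate or reduced."""
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def speak(*parts: str) -> str:
    """Join the parts and wrap them in the root <speak> element."""
    return "<speak>" + " ".join(parts) + "</speak>"


print(speak(say("Here is a secret."), pause(500), whisper("I like coffee"), emphasise("a lot")))
```

To use the result in an Alexa response, the output speech type would be `SSML` with the string placed in the `ssml` field instead of `text`.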
Analytics
• Both platforms offer detailed
performance measurement
tools to help monitor usage
Ambient computing
• Ubiquitous computing
• Physical interface less vital
• Cloud services
• 5G rollout
Entering a new era
Desktop era Mobile era Voice era
Who’s using it well
• BBC
• Netflix
• Ocado
• HMRC
Thanks for listening
• I hope you have found this
session helpful
• Please visit dotkumo.com to
learn more
