Let's talk about voice

Hello world!
• Started in 2017
• Over 20 years' experience in web
content and social media
• Social media content, strategy,
advertising, training and
campaign management
• Video editing and motion
graphics
• Chatbots and voice technology

What we’ll cover
• What is voice technology?
• How does voice technology
work?
• Why should we use it?
• Who’s using it well?
• Best practice in voice design
• Practical demonstrations

What is voice technology?
• Amazon Alexa
• Google Home
• Microsoft Cortana
• Apple Siri
• Samsung Bixby

A little bit of history
• Alan Turing was a pioneer of
modern computing
• He devised the Turing Test in
1950

MIT AI Laboratory
• Professor Marvin Minsky set
up the research group in the
early 1960s to explore
artificial intelligence, machine
learning and natural
language processing

ELIZA
• One major project that
emerged from the MIT AI
Laboratory was ELIZA in 1964
• Essentially this was an early
chatbot where individuals had
a conversation with a
computer
• They were not told they were
talking to a machine

ELIZA meet DOCTOR
• ELIZA simulated
conversations using pattern
matching and substitution
methodology, but did not
understand the context of
words
• One of the most popular
scripts ELIZA ran was
DOCTOR, which simulated a
psychotherapist
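
To make "pattern matching and substitution" concrete, here is a minimal sketch in that spirit. The rules, the pronoun table and the function names are invented for illustration and are not taken from the original DOCTOR script. Note that nothing here understands the words; it only rearranges them.

```python
import re

# Hypothetical rules in the spirit of DOCTOR; not Weizenbaum's originals.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun swaps applied to the captured fragment before it is echoed back.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are", "your": "my"}


def reflect(fragment):
    """Swap first- and second-person words in the captured text."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(utterance):
    """Return the reply for the first rule whose pattern matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."  # fallback when no pattern matches


print(respond("I am worried about my job"))
```

The substitution step is what makes the reply feel personal: "my job" comes back as "your job" without the program knowing what a job is.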

ELIZA is tested
• ELIZA attempted the Turing
Test
• It failed

Back to the future
• Bringing things back up to
date, AI, natural language
processing and technology
can now understand context
• The Turing Test has still not
been passed, but we are
getting closer

Google Duplex
• Google recently
demonstrated their Duplex
technology that links voice
technology to cloud services
such as Google Calendar
• A recurrent neural network
built with TensorFlow Extended
processes complex interactions
and uses context

Making progress
• Duplex is said to be effective
in 80% of situations, so it
doesn’t yet pass the Turing
Test
• Deep Learning expert
Andrew Ng predicts that
once speech recognition is
99% accurate voice will be
the primary way we interact
with computers

The final 4%
• Estimates suggest we are at
around 95% currently
• The final 4% is very
challenging!

Adding functionality
• Amazon Alexa and Google
Home devices can add new
functionality via Skills and
Actions
• These give the devices new
capabilities, and anyone can
build them

Powerfully simple
• It is fairly quick and simple to
create content for these
devices
• There are now over 40,000
Alexa Skills available with an
active developer community

How does voice technology work?
• Voice technology uses
Natural Language Processing
to understand and interpret
voice commands
• This is underpinned by
machine learning techniques

Voice technology in action
• Device listens for invocation
• User gives wake word
• Device returns welcome message
• User gives intent
• Device returns response
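
The steps above can be sketched as a small state machine. This is an illustration only: the state names, the `step` function and the reply strings are assumptions, and a real device detects the wake word from audio on the device itself, not from text.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()        # device waits for the wake word
    AWAITING_INTENT = auto()  # welcome message given, waiting for a request
    DONE = auto()             # response returned


WAKE_WORD = "alexa"  # example wake word, matched on text for simplicity


def step(state, heard):
    """Return (next_state, spoken_reply) for one user utterance."""
    if state is State.LISTENING:
        if heard.lower().startswith(WAKE_WORD):
            return State.AWAITING_INTENT, "Welcome. What would you like to do?"
        return State.LISTENING, None  # ignore speech until woken
    if state is State.AWAITING_INTENT:
        return State.DONE, f"Here is a response to: {heard}"
    return State.DONE, None


state = State.LISTENING
for utterance in ["Alexa", "recommend a coffee"]:
    state, reply = step(state, utterance)
    print(reply)
```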

Intents
• An intent is used to trigger a
response
• For example a Skill / Action
could ask where you want to
go on holiday - New York,
Paris or Tokyo?
• Each of these choices would
be a separate intent and
produce different responses
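
As a rough sketch of the holiday example, each destination can be treated as its own intent with its own reply. The intent names and the reply text below are made up for illustration; on a real platform the intent name arrives in the request after the platform has matched what the user said.

```python
# Hypothetical intent names and replies for the holiday example.
INTENT_RESPONSES = {
    "NewYorkIntent": "Great choice. New York is known for its skyline.",
    "ParisIntent": "Great choice. Paris is known for its cafes.",
    "TokyoIntent": "Great choice. Tokyo is known for its food.",
}

FALLBACK = "Sorry, I didn't catch that. New York, Paris or Tokyo?"


def handle_intent(intent_name):
    """Look up the reply for an intent, re-asking the question if unknown."""
    return INTENT_RESPONSES.get(intent_name, FALLBACK)


print(handle_intent("ParisIntent"))
```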

Synonyms
• Intents are really powerful
and can include synonyms,
so if users have a different
name for something this can
be handled gracefully
• E.g. pavement / sidewalk
• AI is used with NLP so
phrases don’t have to be
exact
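
A minimal sketch of synonym handling, assuming a simple table that maps each spoken variant to one canonical value. The table contents ("footpath" included) and the `resolve` function are illustrative. Real platforms also match inexact phrasing, which this exact lookup does not attempt.

```python
# Canonical value -> other names users might say (illustrative data).
SYNONYMS = {
    "pavement": ["sidewalk", "footpath"],
}

# Flatten into a lookup from any spoken form to its canonical value.
LOOKUP = {
    spoken: canonical
    for canonical, others in SYNONYMS.items()
    for spoken in [canonical, *others]
}


def resolve(spoken):
    """Return the canonical value for a spoken word, or None if unknown."""
    return LOOKUP.get(spoken.strip().lower())


print(resolve("Sidewalk"))
```

The handler then only needs to deal with "pavement", whichever word the user chose.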

Slots
• You can also add slots to
intents that request specific
data be captured in a set
order
• This is particularly useful for
retail / ecommerce
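
One way to picture slots being captured in a set order is a loop that asks for the first piece of data still missing. The slot names and prompts below are a hypothetical retail order, not taken from any real Skill or Action.

```python
# Hypothetical slots for a coffee order, in the order they should be asked.
REQUIRED_SLOTS = ["product", "size", "quantity"]

PROMPTS = {
    "product": "What would you like to order?",
    "size": "What size would you like?",
    "quantity": "How many would you like?",
}


def next_prompt(filled):
    """Return the question for the first unfilled slot, or None when complete."""
    for slot in REQUIRED_SLOTS:
        if slot not in filled:
            return PROMPTS[slot]
    return None


print(next_prompt({"product": "latte"}))
```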

Explicit and implicit invocations
• Explicit invocation:
"Alexa, open Coffee Wizard"
• Implicit invocation:
"Alexa, recommend a coffee
for a sunny day"

Discoverability
• It’s not always appropriate to
use explicit invocations, as they
can feel less conversational and
more mechanistic
• Alexa uses HypRank, a
neural network, to rank Skills
against natural language requests

HypRank
• Alexa uses HypRank, a
neural network that draws on
contextual signals to rank
Skills against a natural
language request

A few stats
• Voice technology will be a $601
million industry by 2019 
Source: Technavio
• Over 21 million smart speakers
in the US by 2020 
Source: Activate
• Google Assistant is now available
on over 95% of Android devices
and the majority of iOS devices
Source: Alpine AI

Creating Skills and Actions
• Amazon and Google provide
developer friendly tools for
building content
• AWS with Lambda
• Dialogflow with Firebase
• Work with a variety of
languages (Node.js, Java,
Python, Go, etc.)
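
As a rough idea of what the code behind a Skill looks like, here is a sketch of a Python handler of the kind AWS Lambda can run. The request and response fields follow the general shape of the Alexa custom skill JSON format as I understand it, so check them against the official reference before relying on them. The intent name `RecommendCoffeeIntent` and the reply text are hypothetical.

```python
def build_response(text, end_session=False):
    """Wrap plain text in a response envelope for Alexa."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point: route the incoming request by its type and intent name."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # User opened the Skill without asking for anything yet.
        return build_response("Welcome to Coffee Wizard. What would you like?")
    if request["type"] == "IntentRequest":
        if request["intent"]["name"] == "RecommendCoffeeIntent":  # hypothetical intent
            return build_response("Try an iced latte.", end_session=True)
    return build_response("Sorry, I can't help with that.", end_session=True)


# Local check with a hand-written event, no AWS account needed.
event = {"request": {"type": "LaunchRequest"}}
print(lambda_handler(event, None)["response"]["outputSpeech"]["text"])
```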

Using SSML
• SSML (Speech Synthesis
Markup Language) can be
used to control the
pronunciation, speed and
pitch of phrases
• For example you can make
Alexa pause, whisper or
place emphasis on specific
words
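
The helpers below sketch how SSML for a pause, a whisper, emphasis, or a change of speed and pitch might be assembled as a string. The helper names are my own. The tags are standard SSML apart from `amazon:effect`, which is specific to Alexa, so other platforms may need a different approach to whispering.

```python
from xml.sax.saxutils import escape


def say(text):
    """Plain speech, escaped so characters like & do not break the markup."""
    return escape(text)


def pause(seconds):
    return f'<break time="{seconds}s"/>'


def whisper(text):
    # Alexa-specific effect; not part of the core SSML standard.
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


def emphasise(text, level="strong"):
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def prosody(text, rate="slow", pitch="low"):
    """Control the speed and pitch of a phrase."""
    return f'<prosody rate="{rate}" pitch="{pitch}">{escape(text)}</prosody>'


def speak(*parts):
    """Join the parts and wrap them in the root <speak> element."""
    return "<speak>" + " ".join(parts) + "</speak>"


print(speak(say("Here is a secret."), pause(1), whisper("I love coffee")))
```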

Analytics
• Both platforms offer detailed
performance measurement
tools to help monitor usage

Ambient computing
• Ubiquitous computing
• Physical interface less vital
• Cloud services
• 5G rollout

Entering a new era
Desktop era Mobile era Voice era

Who’s using it well
• BBC
• Netflix
• Ocado
• HMRC

Thanks for listening
• I hope you have found this
session helpful
• Please visit dotkumo.com to
learn more
