© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:INVENT
Amazon Polly Tips and Tricks: How to
Bring Your Text-to-Speech Voices to Life
Remus Mois – Amazon Text-to-Speech
Amazon Polly
November 30, 2017
What to expect from this session
• What is Amazon Polly?
• Introducing Seoyeon, Matthew, Takumi, Aditi, and Vicki
• Controlling the output of text-to-speech
• Bringing Amazon Polly voices to life: The Magic Door
• Q&A
What is Amazon Polly?
• A service that converts text into lifelike speech
• Offers 52 lifelike voices across 25 languages
• Low-latency responses enable developers to build
real-time systems
• Developers can store, replay, and distribute
generated speech
How to use Amazon Polly
The Amazon Polly service provides API operations for synthesizing high-
quality speech from plain text and Speech Synthesis Markup Language
(SSML), along with managing pronunciation lexicons that enable you to
get the best results for your application domain.
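A minimal Python sketch of calling the synthesis operation described above. The helper names `build_synthesis_request` and `synthesize_to_file` are hypothetical and not from the slides; the sketch assumes the third-party `boto3` package and configured AWS credentials for the actual call, and uses its `synthesize_speech` method with the `Text`, `TextType`, `VoiceId`, and `OutputFormat` parameters.

```python
from typing import Any, Dict


def build_synthesis_request(text: str, voice_id: str = "Joanna",
                            output_format: str = "mp3") -> Dict[str, Any]:
    """Build keyword arguments for Polly's SynthesizeSpeech operation."""
    # Input wrapped in <speak> is sent as SSML, anything else as plain text.
    text_type = "ssml" if text.lstrip().startswith("<speak") else "text"
    return {
        "Text": text,
        "TextType": text_type,
        "VoiceId": voice_id,
        "OutputFormat": output_format,
    }


def synthesize_to_file(text: str, path: str, **kwargs: Any) -> None:
    """Call Amazon Polly and write the returned audio stream to `path`.

    Assumes boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported lazily so the builder above works without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_synthesis_request(text, **kwargs))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

Keeping the request builder separate from the network call makes it possible to inspect or test the request without contacting the service.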
Text-to-speech pipeline
Text
Text normalization
Grapheme-to-phoneme
conversion
Waveform
generation
Speech
100% Recycled – 8 ½ x 11 inch 20 lb Office Paper – 3,000 count
one hundred percent recycled, eight and a half by eleven inch…
ˈwʌn ˈhʌndrəd pɚˈsɛnt riːˈsaɪkəld ˈeɪt ənd ə ˈhæf ˈɪntʃ…
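To make the text normalization stage concrete, here is a toy Python sketch that expands the first part of the example above. The rule tables `NUMBERS` and `SYMBOLS` and the function `normalize` are illustrative assumptions; they do not reflect how Amazon Polly implements this stage, and a real front end covers far more cases.

```python
# Hypothetical, deliberately tiny rule tables for illustration only.
NUMBERS = {"8": "eight", "11": "eleven", "100": "one hundred"}
SYMBOLS = {"½": "and a half", "x": "by"}


def normalize(text: str) -> str:
    """Expand digits and symbols into words, token by token."""
    words = []
    for token in text.split():
        if token.endswith("%") and token[:-1] in NUMBERS:
            words.append(NUMBERS[token[:-1]] + " percent")
        elif token in NUMBERS:
            words.append(NUMBERS[token])
        elif token in SYMBOLS:
            words.append(SYMBOLS[token])
        elif any(ch.isalnum() for ch in token):
            words.append(token.lower())
        # Tokens that are pure punctuation (such as the dash) are dropped.
    return " ".join(words)


print(normalize("100% Recycled – 8 ½ x 11 inch"))
```

The output of this stage is what the grapheme-to-phoneme step would then convert into a phonetic transcription.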
Main challenges of text-to-speech
Amazon Polly TTS converts text into intelligible, accurate, and natural speech
• G2P: rough, though, through.
• Homographs: same spelling, different pronunciations.
I live in Poland
This presentation is broadcast live from Las Vegas
Context helps disambiguate 'live'. But... I read this book.
• Text normalization: disambiguation of abbreviations, acronyms, and units. ‘St.’ is expanded as
‘street’ or ‘saint’.
Example: St. Patrick St.
• Foreign words (déjà vu), proper names (Emmanuel Macron), social media lingo (ASAP, LOL),
and so on.
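The ‘St.’ ambiguity on this slide can be illustrated with a toy context rule in Python. The function `expand_st` is a hypothetical heuristic written for this example, not Amazon Polly's method: it reads ‘St.’ as ‘Saint’ when a capitalized word follows and as ‘Street’ otherwise.

```python
import re


def expand_st(text: str) -> str:
    """Expand 'St.' using the word that follows it as context."""
    # 'St.' directly before a capitalized word is read as a title.
    text = re.sub(r"\bSt\.(?=\s+[A-Z])", "Saint", text)
    # Any remaining 'St.' is treated as a street suffix.
    return re.sub(r"\bSt\.", "Street", text)


print(expand_st("St. Patrick St."))
```

A rule this simple fails as soon as a street name ends one sentence and a capitalized word starts the next, which is why the slide lists normalization among the main challenges.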
New Amazon Polly voices
Seoyeon
Matthew
Takumi
Vicki
Aditi
Controlling the output of TTS
Speech Synthesis Markup Language (SSML)
• W3C recommendation, XML-based markup language for speech
synthesis applications. Amazon Polly tags are compliant with the SSML 1.1
specification.
• A powerful tool that lets customers modify aspects of the TTS output,
such as the pronunciation of words and the expansion of abbreviations
and acronyms. It also enables changes to pitch, speech rate, volume,
and so on.
The <sub> tag
In-line aliasing – In many cases we do not want to change all instances of a
certain word.
<speak>
My favorite chemical element is
<sub alias="aluminum">Al</sub>, but Al prefers
<sub alias="magnesium">Mg</sub>.
</speak>
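When aliases are generated from data rather than typed by hand, a small helper keeps the markup well-formed. This Python sketch uses only the standard library; the helper name `sub` is a hypothetical choice for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def sub(written: str, spoken: str) -> str:
    """Wrap `written` in a <sub> tag so that it is read aloud as `spoken`."""
    return f"<sub alias={quoteattr(spoken)}>{escape(written)}</sub>"


ssml = (
    "<speak>My favorite chemical element is "
    f"{sub('Al', 'aluminum')}, but Al prefers {sub('Mg', 'magnesium')}.</speak>"
)

# Parsing the result confirms that the generated SSML is well-formed XML.
aliases = [element.get("alias") for element in ET.fromstring(ssml).iter("sub")]
print(aliases)
```

Only the first "Al" is aliased, so the second one is still read as the name, which is the point of in-line aliasing.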
The <lang> tag
Foreign words and phrases – Foreign phrases are rendered better if they
are enclosed inside the <lang> tag, as in the following example.
German in English
<speak>
<lang xml:lang="de-DE">Sebastian Kurz,</lang> Austrian
conservative set to become world's youngest leader.
</speak>
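Because SSML is XML, a malformed attribute quote is enough to make a request fail. A quick local check such as the following Python sketch can catch that before the text is sent for synthesis. The function `languages_used` is a hypothetical helper; it relies only on the standard library parser, which exposes `xml:lang` under the XML namespace URI.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def languages_used(ssml: str) -> list:
    """Return the xml:lang value of every <lang> element, in document order.

    Raises xml.etree.ElementTree.ParseError if the SSML is not well-formed.
    """
    root = ET.fromstring(ssml)
    return [element.get(XML_LANG) for element in root.iter("lang")]


example = (
    '<speak><lang xml:lang="de-DE">Sebastian Kurz,</lang> Austrian '
    "conservative set to become world's youngest leader.</speak>"
)
print(languages_used(example))
```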
<say-as interpret-as="expletive">
Bleeping undesirable content
<speak>
The longest word in English is <say-as interpret-as="expletive">pneumonoultramicroscopicsilicovolcanoconiosis</say-as>
</speak>
<say-as interpret-as="spell-out">
Read character by character
<speak>And here is how you spell handkerchief: <prosody rate="x-slow"><say-as interpret-as="spell-out">handkerchief</say-as></prosody>.</speak>
Doing impersonations
<speak>
<prosody rate='-30%'>
<prosody pitch="x-low">
Alfred, did you know there is a city called Batman in Turkey?
</prosody>
</prosody>
</speak>
Lexicons: <phoneme>
Assign custom pronunciation (IPA or X-SAMPA alphabets)
<lexeme><grapheme>gif</grapheme><phoneme>"dZIf</phoneme></lexeme>
<lexeme><grapheme>David</grapheme><phoneme>"dA.%vid</phoneme></lexeme>
<speak>I like this gif.</speak>
<speak>Here's my friend David.</speak>
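The `<lexeme>` entries on this slide are fragments. Amazon Polly lexicons follow the W3C Pronunciation Lexicon Specification (PLS), so a complete file also needs a `<lexicon>` root element that names the alphabet and the language. The Python sketch below builds such a document; the function `build_lexicon` and its default arguments are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_lexicon(entries: dict, lang: str = "en-US",
                  alphabet: str = "x-sampa") -> str:
    """Wrap word -> pronunciation pairs in a PLS <lexicon> document."""
    lexemes = "".join(
        f"<lexeme><grapheme>{escape(word)}</grapheme>"
        f"<phoneme>{escape(pronunciation)}</phoneme></lexeme>"
        for word, pronunciation in entries.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<lexicon version="1.0" xmlns="{PLS_NS}" '
        f'alphabet="{alphabet}" xml:lang="{lang}">{lexemes}</lexicon>'
    )


lexicon_xml = build_lexicon({"gif": '"dZIf', "David": '"dA.%vid'})

# Parse the bytes to confirm the document is well-formed.
root = ET.fromstring(lexicon_xml.encode("utf-8"))
print(len(root), "lexemes using", root.get("alphabet"))
```

A document like this can be uploaded with the PutLexicon operation and then referenced by name in a synthesis request through the `LexiconNames` parameter.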
Fun with SSML
'Can you make your voices sound like an auctioneer?'
<speak><prosody rate='+60%'>I’m at 500 and I want 550<prosody volume='x-loud'>550</prosody></prosody>
<prosody rate='+60%'>bid on 550 I’m at 500 would you go 550 550 for the gentleman in the corner</prosody>
<prosody rate="+90%">A big black bug bit a big black bear a big black bug bit a big black bear.</prosody> Do we get 600?
<prosody rate='+90%'>A big black bug bit a big black bear.</prosody>
<prosody rate='+60%'>We got 600 for the whole herd</prosody>
<prosody rate='default' volume='x-loud'>Sold<prosody rate='+60%'>for 600.</prosody></prosody></speak>
THE MAGIC DOOR
Andrew Huntwork – Founder and CEO, The Magic Door
BRING YOUR TEXT-TO-SPEECH VOICES TO LIFE
The Magic Door
• Interactive storytelling by voice
• 8 million minutes of listening
• 900,000 players
• TTS, human voices, and sound effects
• 20 hours of original stories
• 50 characters voiced by Amazon Polly
• Thousands of handwritten SSML tags
Talestreamer (https://talestreamer.ai)
• The Magic Door’s audio engine for your voice application
• Combine Amazon Polly voices
• Background sound
• RESTful API
• Interactive editor
Our process
1. Write story XML
2. Add synthesized speech with Amazon Polly
3. Add sound effects
4. Add human speech
5. Launch
6. Iterate
An example – Madame Faro Tells Your Fortune
Making TTS characters sound great
1. Get to know the actors
2. Cast them
3. Direct them (with SSML)
Our actors
Brian | Brian, x-high | Hans, slow, low
Amy | Brian, x-low | Russell, high
Raveena | Brian, high | Salli, x-low
Emma | Amy, low, slow | Kimberly, x-high
Russell | Emma, fast, high | Joey, x-low
Nicole | Emma, x-high | Ivy, high
Geraint | Emma, x-low | Raveena, high
Justin | Emma, low | Raveena, low
Joanna | Emma, high | Russell, low
Kimberly | Geraint, low | Justin, slow
Salli | Geraint, high | Geraint, high, slow
Joey | Geraint, low, slow |
Directing Raveena with SSML
<speak>Ah, yes, now you: You have a passion for adventure.
And look at this: You are a good problem solver. Now, let
me see. uh huh. You are clearly very brave. You know, my
profound intuition tells me that you, yes, you two, make a
great team. Together you can do wonderful things and
journey far. And I believe you two would be perfect for a
particular challenge that needs solving. Tell me: Are you
interested in a new challenge?
</speak>
Directing Raveena with SSML – Punctuation
<speak>Ah, yes. Now You. You have a passion for
adventure. And. look at this. You are a good problem solver.
Now. let me see. uh huh. You are clearly very
brave.</speak>
Directing Raveena with SSML – Detailed coaching
<speak>Ah, yes. Now <phoneme ph="%ju">you</phoneme>.
You have a passion for adventure. And. look at this. You are a
good problem solver. Now. let me see. uh huh. You are clearly
very brave.</speak>
Directing Raveena with SSML – Breaks
<speak>Ah, yes. Now <phoneme ph="%ju">you.</phoneme> You have
a passion for adventure. <break time=".5s"/> And. look at this. <break
time=".4s"/> You are a good problem solver. <break time=".3s"/>Now.
let me see. <break time=".4s"/> uh huh. You are clearly very
brave.</speak>
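With thousands of handwritten tags, pauses like these are easier to tune when they are kept as data next to each line of dialogue. The following Python sketch is one possible way to do that; the function `with_breaks` and its input format are assumptions for this example, not The Magic Door's tooling. Pauses are given in milliseconds, which the `<break>` tag accepts as `time="500ms"`.

```python
from xml.sax.saxutils import escape


def with_breaks(lines) -> str:
    """Join (text, pause_ms) pairs into SSML, adding a <break> after each
    line whose pause is not None."""
    parts = []
    for text, pause_ms in lines:
        parts.append(escape(text))
        if pause_ms is not None:
            parts.append(f'<break time="{pause_ms}ms"/>')
    return "<speak>" + " ".join(parts) + "</speak>"


script = [
    ("You have a passion for adventure.", 500),
    ("And. look at this.", 400),
    ("You are a good problem solver.", None),
]
print(with_breaks(script))
```

Changing a pause then means editing one number rather than hunting through markup.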
Directing Raveena with SSML – Prosody
<speak><prosody rate="x-slow">Ah, yes. </prosody><prosody rate="-50%">Now <phoneme ph="%ju">you.</phoneme></prosody> You
have <prosody rate="x-slow"> a passion </prosody> for adventure.
<break time=".5s"/><prosody rate="x-slow">And.</prosody> <prosody rate="slow">look at this.</prosody><break time=".4s"/> You are
<prosody rate="x-slow"> a good</prosody> <prosody rate="slow">problem solver. </prosody> <break time=".3s"/>Now.
<prosody rate="x-slow"> let me see. <break time=".4s"/> uh huh.
</prosody><prosody rate="slow"> You are clearly </prosody>
<prosody rate="x-slow"> very brave. </prosody></speak>
Putting it all together
Thank you!
AMAZON POLLY

Amazon Polly Tips and Tricks: How to Bring Your Text-to-Speech Voices to Life - MCL307 - re:Invent 2017

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS re:INVENT Amazon Polly Tips and Tricks: How to Bring Your Text-to-Speech Voices to Life Remus Mois – Amazon Text-to-Speech, Amazon Polly November 30, 2017
  • 2. What to expect from this session • What is Amazon Polly? • Introducing Seoyeon, Matthew, Takumi, Aditi, and Vicki • Controlling the output of text-to-speech • Bringing Amazon Polly voices to life: The Magic Door • Q&A
  • 3. What is Amazon Polly? • A service that converts text into lifelike speech • Offers 52 lifelike voices across 25 languages • Low-latency responses enable developers to build real-time systems • Developers can store, replay, and distribute generated speech
  • 4. How to use Amazon Polly The Amazon Polly service provides API operations for synthesizing high-quality speech from plain text and Speech Synthesis Markup Language (SSML), along with managing pronunciation lexicons that enable you to get the best results for your application domain.
  • 5. Text-to-speech pipeline: Text → Text normalization → Grapheme-to-phoneme conversion → Waveform generation → Speech. Input text: "100% Recycled – 8 ½ x 11 inch 20 lb Office Paper – 3,000 count". After text normalization: "one hundred percent recycled, eight and a half by eleven inch…". After grapheme-to-phoneme conversion: ˈwʌn ˈhʌndrəd pɚˈsɛnt riːˈsaɪkəld ˈeɪt ənd ə ˈhæf ˈɪntʃ…
  • 6. Main challenges of text-to-speech Amazon Polly TTS converts text into intelligible, accurate, and natural speech • G2P: rough, though, through. • Homographs: same spelling, different pronunciations. "I live in Poland." "This presentation is broadcast live from Las Vegas." Context helps disambiguate 'live'. But... "I read this book." • Text normalization: disambiguation of abbreviations, acronyms, and units. 'St.' is expanded as 'street' or 'saint'. Example: St. Patrick St. • Foreign words (déjà vu), proper names (Emmanuel Macron), social media lingo (ASAP, LOL), and so on.
  • 7. New Amazon Polly voices: Seoyeon, Matthew, Takumi, Vicki, Aditi
  • 8. Controlling the output of TTS: Speech Synthesis Markup Language (SSML) • A W3C recommendation: an XML-based markup language for speech synthesis applications. Amazon Polly tags comply with the SSML 1.1 specification. • A powerful tool that lets customers modify aspects of the TTS output, such as the pronunciation of words and the expansion of abbreviations and acronyms. It also enables changes to pitch, speech rate, volume, and so on.
  • 9. The <sub> tag In-line aliasing – In many cases we do not want to change all instances of a certain word. <speak> My favorite chemical element is <sub alias="aluminum">Al</sub>, but Al prefers <sub alias="magnesium">Mg</sub>. </speak>
  • 10. The <lang> tag Foreign words and phrases – Foreign phrases are rendered better if they are enclosed inside the <lang> tag, as in the following example of German in English: <speak> <lang xml:lang="de-DE">Sebastian Kurz,</lang> Austrian conservative set to become world's youngest leader. </speak>
  • 11. <say-as interpret-as="expletive"> Bleeping undesirable content <speak> The longest word in English is <say-as interpret-as="expletive">pneumonoultramicroscopicsilicovolcanoconiosis</say-as> </speak>
  • 12. <say-as interpret-as="spell-out"> Read character by character <speak>And here is how you spell handkerchief: <prosody rate="x-slow"><say-as interpret-as="spell-out">handkerchief</say-as></prosody>.</speak>
  • 13. Doing impersonations <speak> <prosody rate='-30%'> <prosody pitch="x-low"> Alfred, did you know there is a city called Batman in Turkey? </prosody> </prosody> </speak>
  • 14. Lexicons: <phoneme> Assign custom pronunciation (IPA or X-SAMPA alphabets) <lexeme><grapheme>gif</grapheme><phoneme>"dZIf</phoneme></lexeme> <lexeme><grapheme>David</grapheme><phoneme>"dA.%vid</phoneme></lexeme> <speak>I like this gif.</speak> <speak>Here's my friend David.</speak>
  • 15. Fun with SSML 'Can you make your voices sound like an auctioneer?' <speak><prosody rate='+60%'>I’m at 500 and I want 550<prosody volume='x-loud'>550</prosody></prosody> <prosody rate='+60%'>bid on 550 I’m at 500 would you go 550 550 for the gentleman in the corner</prosody><prosody rate="+90%">A big black bug bit a big black bear a big black bug bit a big black bear.</prosody> Do we get 600? <prosody rate='+90%'>A big black bug bit a big black bear.</prosody><prosody rate='+60%'>We got 600 for the whole herd</prosody><prosody rate='default' volume='x-loud'>Sold<prosody rate='+60%'>for 600.</prosody></prosody></speak>
  • 16. THE MAGIC DOOR Andrew Huntwork – Founder and CEO, The Magic Door BRING YOUR TEXT-TO-SPEECH VOICES TO LIFE
  • 17. The Magic Door • Interactive storytelling by voice • 8 million minutes of listening • 900,000 players • TTS, human voices, and sound effects • 20 hours of original stories • 50 characters voiced by Amazon Polly • Thousands of handwritten SSML tags
  • 18. Talestreamer (https://talestreamer.ai) • The Magic Door’s audio engine for your voice application • Combine Amazon Polly voices • Background sound • RESTful API • Interactive editor
  • 19. Talestreamer (https://talestreamer.ai) • The Magic Door’s audio engine for your voice application • Combine Amazon Polly voices • Background sound • RESTful API • Interactive editor
  • 20. Our process 1. Write story XML 2. Add synthesized speech with Amazon Polly 3. Add sound effects 4. Add human speech 5. Launch 6. Iterate
  • 21. An example – Madame Faro Tells Your Fortune
  • 22. Making TTS characters sound great 1. Get to know the actors 2. Cast them 3. Direct them (with SSML)
  • 23. Our actors: Brian; Brian, x-high; Hans, slow, low; Amy; Brian, x-low; Russell, high; Raveena; Brian, high; Salli, x-low; Emma; Amy, low, slow; Kimberly, x-high; Russell; Emma, fast, high; Joey, x-low; Nicole; Emma, x-high; Ivy, high; Geraint; Emma, x-low; Raveena, high; Justin; Emma, low; Raveena, low; Joanna; Emma, high; Russell, low; Kimberly; Geraint, low; Justin, slow; Salli; Geraint, high; Geraint, high, slow; Joey; Geraint, low, slow
  • 24. Directing Raveena with SSML <speak>Ah, yes, now you: You have a passion for adventure. And look at this: You are a good problem solver. Now, let me see. uh huh. You are clearly very brave. You know, my profound intuition tells me that you, yes, you two, make a great team. Together you can do wonderful things and journey far. And I believe you two would be perfect for a particular challenge that needs solving. Tell me: Are you interested in a new challenge? </speak>
  • 25. Directing Raveena with SSML – Punctuation <speak>Ah, yes. Now You. You have a passion for adventure. And. look at this. You are a good problem solver. Now. let me see. uh huh. You are clearly very brave.</speak>
  • 26. Directing Raveena with SSML – Detailed Coaching <speak>Ah, yes. Now <phoneme ph="%ju">you</phoneme>. You have a passion for adventure. And. look at this. You are a good problem solver. Now. let me see. uh huh. You are clearly very brave.</speak>
  • 27. Directing Raveena with SSML – Breaks <speak>Ah, yes. Now <phoneme ph="%ju">you.</phoneme> You have a passion for adventure. <break time=".5s"/> And. look at this. <break time=".4s"/> You are a good problem solver. <break time=".3s"/>Now. let me see. <break time=".4s"/> uh huh. You are clearly very brave.</speak>
  • 28. Directing Raveena with SSML – Prosody <speak><prosody rate="x-slow">Ah, yes. </prosody><prosody rate="-50%">Now <phoneme ph="%ju">you.</phoneme></prosody> You have <prosody rate="x-slow"> a passion </prosody> for adventure. <break time=".5s"/><prosody rate="x-slow">And.</prosody> <prosody rate="slow">look at this.</prosody><break time=".4s"/> You are <prosody rate="x-slow"> a good</prosody> <prosody rate="slow">problem solver. </prosody> <break time=".3s"/>Now. <prosody rate="x-slow"> let me see. <break time=".4s"/> uh ha. </prosody><prosody rate="slow"> You are clearly </prosody> <prosody rate="x-slow"> very brave. </prosody></speak>
  • 29. Putting it all together
  • 30. Thank you! AMAZON POLLY
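
The hand-written SSML on the <sub>, breaks, and prosody slides can also be assembled programmatically. The following is a minimal sketch, not the speakers' tooling: the helper names (sub, pause, prosody, speak) are hypothetical, and it assumes only the Python standard library. It checks that the result is well-formed XML, which catches stray curly quotes or unclosed tags before the markup is sent to a synthesizer. It does not check that a tag or attribute value is accepted by Amazon Polly.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def sub(text, alias):
    """In-line alias: speak `alias` wherever `text` is written."""
    return f"<sub alias={quoteattr(alias)}>{escape(text)}</sub>"


def pause(seconds):
    """Silent pause, expressed in milliseconds."""
    return f'<break time="{round(seconds * 1000)}ms"/>'


def prosody(inner, rate=None, pitch=None, volume=None):
    """Wrap `inner` in a <prosody> tag.

    `inner` is treated as SSML so calls can be nested; escape plain
    text yourself if it may contain '<' or '&'.
    """
    pairs = (("rate", rate), ("pitch", pitch), ("volume", volume))
    attrs = "".join(
        f" {name}={quoteattr(value)}" for name, value in pairs if value is not None
    )
    return f"<prosody{attrs}>{inner}</prosody>"


def speak(*parts):
    """Join fragments into a <speak> document and verify it parses."""
    ssml = "<speak>" + "".join(parts) + "</speak>"
    ET.fromstring(ssml)  # raises xml.etree.ElementTree.ParseError if malformed
    return ssml


chemistry = speak("My favorite chemical element is ", sub("Al", "aluminum"), ".")

fortune = speak(
    prosody("Ah, yes. ", rate="x-slow"),
    "You have a passion for adventure. ",
    pause(0.5),
    prosody("very brave.", rate="x-slow"),
)
```

Keeping each tag in its own small function makes it easier to adjust one pause or one rate value while iterating on a line of dialogue, which is the kind of step-by-step coaching the Raveena slides walk through.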
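
To hear SSML like the examples in this deck, the markup has to be passed to the Amazon Polly speech synthesis API operation described on the "How to use Amazon Polly" slide. The sketch below assumes the third-party boto3 SDK and configured AWS credentials, neither of which appears in the slides; build_request and synthesize_to_file are hypothetical helper names. The request is built in a separate function so it can be inspected without making a network call.

```python
def build_request(ssml, voice_id="Raveena", output_format="mp3"):
    """Keyword arguments for the Polly SynthesizeSpeech operation."""
    return {
        "Text": ssml,
        "TextType": "ssml",  # tell Polly the text is SSML, not plain text
        "VoiceId": voice_id,
        "OutputFormat": output_format,
    }


def synthesize_to_file(ssml, path, voice_id="Raveena"):
    """Send the SSML to Polly and write the returned audio to `path`.

    Assumption: boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported here so build_request works without the SDK

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_request(ssml, voice_id))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


request = build_request("<speak>Ah, yes. <break time=\"500ms\"/> Now you.</speak>")
```

Calling synthesize_to_file(request["Text"], "fortune.mp3") would then write the audio to a local file, provided credentials and network access are available.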