Easily add intelligence to your applications using pre-trained AI services for computer vision, speech, translation, transcription, natural language processing, and conversational chatbots. No machine learning skills required.
Our mission is to enable customers to transform their businesses by putting ML in the hands of every developer. [Note: adding a customer-focused angle to our mission will be most impactful with the analyst audience]
[Cover at high level + highlight our customer-focused approach – no need to address everything on this slide]
Our approach for ML is similar to how we approach other areas of the AWS business:
We focus on customers’ business needs and developer capabilities
Innovate rapidly on behalf of our customers
Offer a broad and deep set of services for our customers
We are also focused on providing customers with choices – which is why we support the most popular frameworks.
[Intel talking points]
Amazon SageMaker and C5 instances have been fully optimized so you can quickly and easily build and deploy machine learning (ML) models. Amazon SageMaker is pre-configured with the latest Intel-optimized versions of TensorFlow* and Apache MXNet* for maximum performance on C5 instances running on Intel Xeon Platinum processors, so developers can quickly train deep learning models.
Amazon SageMaker RL includes built-in, fully managed RL algorithms. SageMaker supports RL in multiple frameworks, including TensorFlow and MXNet, as well as frameworks designed from the ground up for reinforcement learning, such as Intel RL Coach and Ray RLlib.
For developers, the integration of Reinforcement Learning Coach with Amazon SageMaker is a recipe for success. Now they can take advantage of AWS C5 instances based on Intel® Xeon® Scalable Processors to run advanced compute-heavy workloads – like reinforcement learning models.
We see the Machine Learning stack having three key layers.
ML Frameworks:
The bottom layer is for expert machine learning practitioners—researchers and developers.
These are people who are comfortable building models, tuning them, training them, figuring out how to deploy them into production, and managing them themselves.
And the vast majority of machine learning in the cloud today at this layer is being done through Amazon SageMaker, which provides a managed experience for frameworks, or the AWS Deep Learning AMI, which effectively embeds all the major frameworks.
Infrastructure:
AWS offers a broad array of compute options for training and inference with powerful GPU-based instances, compute and memory optimized instances, and even FPGAs.
Our P3 instances provide up to 14 times better performance than previous-generation Amazon EC2 GPU compute instances.
C5 instances offer a higher memory-to-vCPU ratio, deliver a 25% improvement in price/performance compared to C4 instances, and are ideal for demanding inference applications.
We also have Amazon EC2 F1, a compute instance with field programmable gate arrays (FPGAs) that you can program to create custom hardware accelerations for your machine learning applications. F1 instances are easy to program and come with everything you need to develop, simulate, debug, and compile your hardware acceleration code. You can reuse your designs as many times, and across as many F1 instances, as you like.
The new Amazon EC2 P3dn instance has four times the networking bandwidth and twice the GPU memory of the largest existing P3 instance, making it ideal for large-scale distributed training. No one else has anything close.
P3dn.24xlarge instances offer 96 vCPUs on Intel Skylake processors to reduce the data preprocessing time required for machine learning training.
The enhanced networking of the P3dn instance allows GPUs to be used more efficiently in multi-node configurations, so training jobs complete faster.
Finally, the extra GPU memory allows developers to easily handle more advanced machine learning models, such as holding and processing multiple batches of 4K images for image classification and object detection systems.
ML Services:
But, if you want to enable most enterprises and companies to be able to scale machine learning, we’ve solved that problem for organizations by making ML accessible for everyday developers and scientists. Amazon SageMaker removes the heavy lifting, complexity, and guesswork from each step of the machine learning process.
SageMaker makes model building and training easier by providing pre-built development notebooks, popular machine learning algorithms optimized for petabyte-scale datasets, and automatic model tuning, enabling developers to build, train, and deploy models in a single click.
SageMaker is already helping thousands of developers easily get started with building, training, and deploying models.
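As a rough sketch of that build-train-deploy flow, here is what it might look like with the SageMaker Python SDK. The bucket, role ARN, and container URI are placeholders, not real resources, and the SDK calls are shown commented out (they require AWS credentials); only a small hyperparameter helper is live code:

```python
def training_hyperparameters(epochs=10, learning_rate=0.1):
    """SageMaker training jobs expect hyperparameter values as strings."""
    return {"epochs": str(epochs), "learning_rate": str(learning_rate)}

# A hypothetical build-train-deploy flow with the SageMaker Python SDK:
#
# import sagemaker
# from sagemaker.estimator import Estimator
#
# estimator = Estimator(
#     image_uri="<algorithm-container-uri>",           # e.g. a built-in algorithm image
#     role="arn:aws:iam::123456789012:role/SageMakerRole",
#     instance_count=1,
#     instance_type="ml.c5.xlarge",
#     output_path="s3://my-bucket/output/",
#     hyperparameters=training_hyperparameters(epochs=20),
# )
# estimator.fit({"train": "s3://my-bucket/train/"})    # train the model
# predictor = estimator.deploy(                        # deploy behind an endpoint
#     initial_instance_count=1, instance_type="ml.c5.xlarge")

print(training_hyperparameters(epochs=20))
```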
AI Services:
At the top layer are AI services which are ready-made for all developers—no ML skills.
For example, customers can say: here is an image, tell me what's in it; or here's a face, tell me if it's part of this facial group, using Amazon Rekognition.
Or convert text to speech using Amazon Polly.
Or let’s build conversational apps with Amazon Lex.
Convert speech to text with Amazon Transcribe
Translate text between languages using Amazon Translate
Understand relationships and find insights from unstructured text using Amazon Comprehend
What we have done to make it easier for developers to get started is provide tools such as AWS DeepLens
Learn the basics of machine learning through hands on examples and sample projects
7 sample projects of varying difficulty available for use, including object detection, artistic style transfer, face recognition, hot dog/not hot dog, cat vs. dog, and license plate detection
Use existing sample projects, extend a sample project with your own custom functionality (for example, detect when your dog is sitting on the couch and send an SMS), or create your own project
Go deeper through integrations with SageMaker, Greengrass, and other AWS services
[2 minutes]
AWS DeepRacer is a 1/18th scale robotic car which gives you an exciting and fun way to get started with reinforcement learning (RL) by applying it to autonomous racing. You can pre-order your AWS DeepRacer from Amazon today.
AWS DeepRacer has a virtual racing simulator that allows you to train, evaluate, and iterate on your RL models in a racing environment, quickly and easily.
And if you get really good and want to showcase your machine learning skills in a competitive environment, there is the AWS DeepRacer League. You can compete in a global championship, racing the car, for a chance to win several prizes and advance to the AWS DeepRacer Grand Final. Throughout 2019 there will be in-person events, announced at a later date, and the online simulator will also give developers the opportunity to compete virtually.
AI Services:
AI Services are intentionally easy to use. They can be accessed via a simple API call.
We’ve pulled the best and most targeted capabilities into ready-made services, for example image recognition or transcription.
The focus here is really on enabling any developer—no ML skills required—to be able to develop AI applications using one of our services.
These API services, used in conjunction, create compelling solutions that really target business problems and use cases.
Customers can build these capabilities into their new and existing applications to reduce costs, increase speed, improve customer satisfaction and insight, and build ‘modern’ intelligent applications
What is your use case? What are the capabilities you might need? There’s an AI Service, or a pairing of services that will address the need.
AI Services descriptions for color:
Amazon Rekognition:
Rekognition makes it easy to add image and video analysis to your applications. You just provide an image or video to the Rekognition API, and the service can identify the objects, people, text, scenes, and activities, as well as detect any inappropriate content.
Amazon Rekognition also provides highly accurate facial analysis and facial recognition on images and video that you provide. You can detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases. Rekognition is a simple and easy-to-use API that can quickly analyze any image or video file stored in Amazon S3. Amazon Rekognition is always learning from new data, and we are continually adding new labels and facial recognition features to the service.
More info: https://aws.amazon.com/rekognition/
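A minimal sketch of that flow with boto3; the bucket and file names are made up, the live call is commented out so the snippet runs without credentials, and the sample response is trimmed to just the fields the helper reads:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# rekognition = boto3.client("rekognition")
# response = rekognition.detect_labels(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "photo.jpg"}},
#     MaxLabels=10,
# )

def labels_above(response, threshold=90.0):
    """Return (name, confidence) pairs for labels at or above the threshold."""
    return [(label["Name"], label["Confidence"])
            for label in response.get("Labels", [])
            if label["Confidence"] >= threshold]

# Illustrative response shape, trimmed to the fields used above:
sample = {"Labels": [{"Name": "Dog", "Confidence": 98.2},
                     {"Name": "Couch", "Confidence": 75.0}]}
print(labels_above(sample))  # → [('Dog', 98.2)]
```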
Amazon Polly:
Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products.
Polly is a text to speech service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice.
With dozens of lifelike voices across a variety of languages, you can select the ideal voice and build speech-enabled applications that work in many different countries.
More info: https://aws.amazon.com/polly/
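A sketch of a Polly request with boto3; the live SynthesizeSpeech call is commented out, and the small helper that assembles the request parameters is an illustrative convenience, not part of the Polly API:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# polly = boto3.client("polly")
# response = polly.synthesize_speech(**synthesize_request("Hello from Amazon Polly"))
# with open("hello.mp3", "wb") as f:
#     f.write(response["AudioStream"].read())

def synthesize_request(text, voice_id="Joanna", output_format="mp3"):
    """Assemble SynthesizeSpeech parameters, checking the output format."""
    allowed = {"mp3", "ogg_vorbis", "pcm", "json"}
    if output_format not in allowed:
        raise ValueError(f"unsupported output format: {output_format}")
    return {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}

print(synthesize_request("Hello from Amazon Polly"))
```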
Amazon Transcribe:
Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capability to their applications.
Using the Amazon Transcribe API, you can analyze audio files stored in Amazon S3 and have the service return a text file of the transcribed speech.
Amazon Transcribe can be used for lots of common applications, including the transcription of customer service calls and generating subtitles on audio and video content.
The service can transcribe audio files stored in common formats, like WAV and MP3, with time stamps for every word so that you can easily locate the audio in the original source by searching for the text. Amazon Transcribe is continually learning and improving to keep pace with the evolution of language.
More info: https://aws.amazon.com/transcribe/
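A sketch of starting a transcription job with boto3; the S3 URI and job name are placeholders, the live call is commented out, and the parameter-building helper is an illustrative convenience, not part of the Transcribe API:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# transcribe = boto3.client("transcribe")
# transcribe.start_transcription_job(
#     **transcription_job("demo-job", "s3://my-bucket/call.mp3"))

def transcription_job(name, media_uri, media_format="mp3", language="en-US"):
    """Assemble StartTranscriptionJob parameters for an audio file in S3."""
    return {
        "TranscriptionJobName": name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
        "LanguageCode": language,
    }

print(transcription_job("demo-job", "s3://my-bucket/call.mp3"))
```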
Amazon Translate:
Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation.
Neural machine translation is a form of language translation automation that uses deep learning models to deliver more accurate and more natural sounding translation than traditional statistical and rule-based translation algorithms.
Amazon Translate allows you to localize content - such as websites and applications - for international users, and to easily translate large volumes of text efficiently.
More info: https://aws.amazon.com/translate/
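A sketch of a Translate request with boto3; the live call is commented out, and the request-building helper is an illustrative convenience, not part of the Translate API:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# translate = boto3.client("translate")
# response = translate.translate_text(**translate_request("Hello, world", "es"))
# print(response["TranslatedText"])

def translate_request(text, target_language, source_language="auto"):
    """Assemble TranslateText parameters; 'auto' lets the service detect the source."""
    return {
        "Text": text,
        "SourceLanguageCode": source_language,
        "TargetLanguageCode": target_language,
    }

print(translate_request("Hello, world", "es"))
```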
Amazon Comprehend:
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text.
The service identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; analyzes text using tokenization and parts of speech; and automatically organizes a collection of text files by topic.
Using these APIs, you can analyze text and apply the results in a wide range of applications including voice of customer analysis, intelligent document search, and content personalization for web applications.
More info: https://aws.amazon.com/comprehend
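A sketch of a Comprehend sentiment call with boto3; the live call is commented out, and the sample score dictionary mirrors the shape of the `SentimentScore` field in a DetectSentiment response:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# comprehend = boto3.client("comprehend")
# response = comprehend.detect_sentiment(
#     Text="The views were beautiful but the room was tiny.",
#     LanguageCode="en",
# )
# scores = response["SentimentScore"]

def dominant_sentiment(scores):
    """Pick the highest-scoring sentiment label from a SentimentScore dict."""
    return max(scores, key=scores.get)

# Illustrative scores, shaped like the response's SentimentScore field:
sample_scores = {"Positive": 0.91, "Negative": 0.02, "Neutral": 0.05, "Mixed": 0.02}
print(dominant_sentiment(sample_scores))  # → Positive
```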
Amazon Lex:
Amazon Lex is a service for building conversational interfaces into any application using voice and text.
Amazon Lex provides the advanced deep learning functionalities of automatic speech recognition (ASR) for converting speech to text, and natural language understanding (NLU) to recognize the intent of the text, to enable you to build applications with highly engaging user experiences and lifelike conversational interactions.
With Amazon Lex, the same deep learning technologies that power Amazon Alexa are now available to any developer, enabling you to quickly and easily build sophisticated, natural language, conversational bots
More info: https://aws.amazon.com/lex
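A sketch of a turn of conversation with the Lex runtime API via boto3; the bot name and alias are hypothetical, the live PostText call is commented out, and the sample response is trimmed to the fields the helper reads:

```python
# The live call (requires AWS credentials and a deployed bot) would look like:
# import boto3
# lex = boto3.client("lex-runtime")
# response = lex.post_text(
#     botName="OrderFlowers",      # hypothetical bot
#     botAlias="prod",
#     userId="user-1",
#     inputText="I want to order flowers",
# )

def bot_reply(response):
    """Extract the recognized intent and the bot's next prompt."""
    return response.get("intentName"), response.get("message")

# Illustrative response, trimmed to the fields used above:
sample = {"intentName": "OrderFlowers",
          "message": "What type of flowers?",
          "dialogState": "ElicitSlot"}
print(bot_reply(sample))
```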
Amazon Rekognition is a service that applies machine learning to extract information from images and video
You can use the ‘MinConfidence’ parameter in your API requests to balance detection of content (recall) vs the accuracy of detection (precision).
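The recall/precision trade-off can be sketched with a simple client-side filter that mimics what `MinConfidence` does server-side; the commented call shows where the parameter goes, and the detection list is made-up data:

```python
# In a real request, MinConfidence is passed to the API, e.g.:
# import boto3
# rekognition = boto3.client("rekognition")
# response = rekognition.detect_labels(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "photo.jpg"}},
#     MinConfidence=70.0,
# )

def filter_by_confidence(labels, min_confidence):
    """Keep only detections at or above the confidence threshold."""
    return [l for l in labels if l["Confidence"] >= min_confidence]

detections = [{"Name": "A", "Confidence": 55.0},
              {"Name": "B", "Confidence": 80.0},
              {"Name": "C", "Confidence": 95.0}]

# A low threshold favors recall (more detections, some uncertain)...
print(len(filter_by_confidence(detections, 50.0)))  # → 3
# ...a high threshold favors precision (fewer, more certain detections).
print(len(filter_by_confidence(detections, 90.0)))  # → 1
```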
1/ Excited to introduce Amazon Textract, an OCR++ service to easily extract text and data from virtually any document.
2/ No ML experience required
3/ TRANSITION: Let’s take a look at how this works…
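One way it might look in practice with boto3: the live DetectDocumentText call is commented out, and a small helper pulls the detected text lines out of a trimmed, illustrative response:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# textract = boto3.client("textract")
# response = textract.detect_document_text(
#     Document={"S3Object": {"Bucket": "my-bucket", "Name": "invoice.pdf"}})

def document_lines(response):
    """Collect the text of LINE blocks from a DetectDocumentText response."""
    return [block["Text"]
            for block in response.get("Blocks", [])
            if block.get("BlockType") == "LINE"]

# Illustrative response, trimmed to the fields used above:
sample = {"Blocks": [
    {"BlockType": "PAGE"},
    {"BlockType": "LINE", "Text": "Invoice #123"},
    {"BlockType": "WORD", "Text": "Invoice"},
    {"BlockType": "LINE", "Text": "Total: $42.00"},
]}
print(document_lines(sample))  # → ['Invoice #123', 'Total: $42.00']
```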
The language isn’t important, as long as it uses a Latin script
Amazon Translate
Easily extract text and data from virtually any document
We also announced Amazon Polly at last year’s re:Invent. Polly is our text-to-speech service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice.
Key challenges: (1) low latency rendering and delivery, and (2) content ownership (store it on S3 or anywhere, it's your content)
54 voices across 26 languages
Optionally, also sample rate, text type if SSML (Speech Synthesis Markup Language)
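The optional parameters can be sketched the same way: the helper below (an illustrative convenience, not part of the Polly API) assembles a SynthesizeSpeech request with `TextType="ssml"` and a sample rate, and the SSML snippet uses a standard `<break>` tag:

```python
def ssml_request(ssml, voice_id="Joanna", sample_rate="22050"):
    """Assemble SynthesizeSpeech parameters for SSML input."""
    return {
        "Text": ssml,
        "TextType": "ssml",          # tells Polly to parse the markup
        "VoiceId": voice_id,
        "OutputFormat": "mp3",
        "SampleRate": sample_rate,   # passed as a string, e.g. "22050"
    }

ssml = '<speak>Hello <break time="300ms"/> world.</speak>'
request = ssml_request(ssml)
print(request)

# The live call (requires AWS credentials) would then be:
# import boto3
# polly = boto3.client("polly")
# response = polly.synthesize_speech(**request)
```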
Spoken language crucial for language learning
Accurate pronunciation matters
Faster iteration thanks to TTS
As good as natural human speech
When teaching a foreign language, accurate pronunciation matters. If exposed to incorrect pronunciation, learners develop poor listening and speaking skills, which compromises their ability to communicate effectively. Duolingo uses text-to-speech (TTS) to provide high-quality language education. To some, this approach might seem counterintuitive: shouldn’t people learn by listening to a native speaker?
Find a company that records audio in the language: The company must find a voice actor who not only speaks the language, but also who speaks with good pronunciation and clarity.
Find someone to evaluate the quality of pronunciation: We need an independent party from the recording company to create a small sample of sentences, which this party uses to evaluate pronunciation quality of the recordings.
Record and evaluate the quality of the sample sentences.
Set up a contract with the recording company.
Record all sentences.
Evaluate recordings, providing a data quality assurance check. For example, we need to check if all files are in the proper format and correctly separated. This step is necessary because the industry standard is to record all sentences in a single session and separate them later.
English and Spanish are supported today, with more languages to come
Amazon Translate
A lot of unstructured text in the world -> no time to read it all
Amazon Comprehend helps you comprehend unstructured text by extracting structured information from it.
- extract positive, negative, neutral and mixed sentiment
- extract entities like people, organizations, numbers, dates
- Key phrases gives you the important phrases in the text like “beautiful views” in a hotel review
- English and Spanish for Entities, Sentiment and Keyphrases
- language detection with capability to detect over 100 languages
- The topic modeling API helps you detect topics in a corpus of text. This is an unsupervised algorithm and will work on text in any domain
- We use Deep Learning to power our APIs and this results in higher accuracy and continuous improvement over time with usage.
Syntax:
This API allows customers to tokenize (find word boundaries) words and gives each word its part of speech (noun, pronoun, verb, adjective)
For example, let’s say you’re using the Comprehend Sentiment API and you notice a set of sentences are negative. Using the new Comprehend Syntax API, you can break those sentences into words and their parts of speech. From there you can look at the nouns mentioned, and the verbs and adjectives used to describe those nouns, allowing you to find out exactly what customers were saying; maybe they didn’t like the color or the price.
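A sketch of that drill-down with boto3: the live DetectSyntax call is commented out, and the helper filters a trimmed, illustrative token list by part-of-speech tag, the way you might pull out the nouns and adjectives from a negative review:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# comprehend = boto3.client("comprehend")
# response = comprehend.detect_syntax(
#     Text="The color was awful", LanguageCode="en")
# tokens = response["SyntaxTokens"]

def words_with_tags(tokens, tags):
    """Return the words whose part-of-speech tag is in the given set."""
    return [t["Text"] for t in tokens if t["PartOfSpeech"]["Tag"] in tags]

# Illustrative tokens, shaped like SyntaxTokens entries:
sample_tokens = [
    {"Text": "The",   "PartOfSpeech": {"Tag": "DET"}},
    {"Text": "color", "PartOfSpeech": {"Tag": "NOUN"}},
    {"Text": "was",   "PartOfSpeech": {"Tag": "VERB"}},
    {"Text": "awful", "PartOfSpeech": {"Tag": "ADJ"}},
]
print(words_with_tags(sample_tokens, {"NOUN", "ADJ"}))  # → ['color', 'awful']
```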
1/ Comprehend also has the unique ability to not just look at a single document at a time but to look at millions in order to identify the topics within these docs—we call this TOPIC MODELING
2/ Publisher org articles by subject matter; healthcare by symptom or diagnosis
3/ Comprehend does this in an incredibly efficient manner…For ex, for 300 docs, each around 1MB in size, Comprehend can build a custom topic model in 45 mins for $1.80
4/ Makes it much easier and cost effective to build more intelligent models and actions out of all this data sitting in text
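A sketch of kicking off a topic modeling job with boto3; the S3 URIs and role ARN are placeholders, the live call is commented out, and the parameter-building helper is an illustrative convenience, not part of the Comprehend API:

```python
# The live call (requires AWS credentials and real S3/IAM resources) would be:
# import boto3
# comprehend = boto3.client("comprehend")
# comprehend.start_topics_detection_job(
#     **topics_job("s3://my-bucket/docs/", "s3://my-bucket/topics/",
#                  "arn:aws:iam::123456789012:role/ComprehendRole"))

def topics_job(input_s3, output_s3, role_arn, number_of_topics=10):
    """Assemble StartTopicsDetectionJob parameters for a corpus in S3."""
    return {
        "InputDataConfig": {"S3Uri": input_s3},
        "OutputDataConfig": {"S3Uri": output_s3},
        "DataAccessRoleArn": role_arn,
        "NumberOfTopics": number_of_topics,
    }

print(topics_job("s3://my-bucket/docs/", "s3://my-bucket/topics/",
                 "arn:aws:iam::123456789012:role/ComprehendRole"))
```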
Can identify complex medical information, such as conditions, medications, and dosages
Can identify PHI and allow you to remove it from data
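A sketch of that PHI-removal step with boto3: the live Comprehend Medical DetectPHI call is commented out, and the redaction helper (an illustrative convenience, not part of the service API) replaces each detected entity span with a placeholder, working from the end of the text so earlier offsets stay valid:

```python
# The live call (requires AWS credentials) would look something like:
# import boto3
# cm = boto3.client("comprehendmedical")
# response = cm.detect_phi(Text=text)
# entities = response["Entities"]

def redact_phi(text, entities):
    """Replace each entity's [BeginOffset, EndOffset) span with '[PHI]'."""
    for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:e["BeginOffset"]] + "[PHI]" + text[e["EndOffset"]:]
    return text

# Illustrative entities, trimmed to the offset fields used above:
text = "John Smith was seen on 2018-11-01"
entities = [{"BeginOffset": 0, "EndOffset": 10},    # "John Smith"
            {"BeginOffset": 23, "EndOffset": 33}]   # "2018-11-01"
print(redact_phi(text, entities))  # → [PHI] was seen on [PHI]
```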