© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker
AIM346
Garin Kessler
Data Scientist
AWS Machine Learning Solutions Lab
Mayank Thakkar
Life Sciences Specialist
AWS Solutions Architecture
Goal
Learn how to apply machine learning methods to predict adverse
events from reported patient data
… and much more
Background
• Pharmacovigilance and patient safety programs
• Adverse events and FDA regulations
• FAERS
• Workable data
• Call center recording / summaries
• Emails / faxes
Adverse event detection – The challenge
• Disparate data types
• Unstructured data
• Understanding semantic dispositions
• Synonyms, spelling mistakes
• Sentiment detection
• Categorizing interactions
• Various data sources
• Meeting compliance objectives
• True positives, “sleeping doctor”
• Scale, enormous scale
• Cost efficiency
Machine learning to the rescue
• Improve accuracy and reliability
• Doesn’t replace humans – aids humans
• Offload repetitive work – humans can handle edge cases
• Decrease costs
• Repurpose human workforce for ‘value-adding’ endeavors
• Keep up with ongoing research
• Incorporate published articles at scale
Machine learning – The process
Fetch data → Clean & format data → Prepare & transform data → Train model → Evaluate model → Integrate with prod → Monitor / debug / refresh
Data wrangling
• Set up and manage notebook environments
• Get data to notebooks securely
Experimentation
• Set up and manage clusters
• Scale/distribute ML algorithms
Deployment
• Set up and manage inference clusters
• Manage and auto scale inference APIs
• Testing, versioning, and monitoring
6-18 months
Amazon SageMaker
A managed service that provides one of the quickest and easiest ways for your data scientists and developers to get ML models from idea to production
Introducing Amazon SageMaker
• End-to-end machine learning platform
• Zero setup
• Flexible model training - bring your own deep learning script, or your custom algorithm Docker image
• Pay by the second
• One step deployment
• A/B testing
• Low latency, high throughput, high reliability
• Choice of several ML algorithms
• Train faster, in a single pass
Introducing Amazon SageMaker
Choice of several ML algorithms
• XGBoost, FM, and Linear for classification and regression
• K-means and PCA for clustering and dimensionality reduction
• LDA and NTM for topic modeling, seq2seq for translation
• Image classification with convolutional neural networks
Natural language processing methods
• Dataset preprocessing - feature generators
• Latent Dirichlet Allocation (LDA)
• Comprehend topic modeling
• BlazingText word embeddings
• Classification - algorithms utilized
• K-nearest neighbors
• Logistic regression
• XGBoost
• Amazon SageMaker BlazingText Classifier
• Deep convolutional neural network running on TensorFlow and Amazon SageMaker
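The two-stage design above (a feature generator turns each text into a vector, then a classifier labels that vector) can be sketched with the simplest classifier on the list, k-nearest neighbors. This is a minimal illustration, not the speakers' code: the three-dimensional "topic proportion" vectors, the label names, and the helper names `euclidean` and `knn_predict` are all assumptions made up for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train is a list of (vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical topic proportions per call summary, as a topic model such as
# LDA might produce. These numbers are invented for illustration only.
train = [
    ([0.80, 0.10, 0.10], "adverse_event"),
    ([0.70, 0.20, 0.10], "adverse_event"),
    ([0.75, 0.05, 0.20], "adverse_event"),
    ([0.10, 0.80, 0.10], "no_adverse_event"),
    ([0.20, 0.70, 0.10], "no_adverse_event"),
    ([0.15, 0.60, 0.25], "no_adverse_event"),
]

prediction = knn_predict(train, [0.65, 0.25, 0.10], k=3)
print(prediction)
```

Swapping the classifier (logistic regression, XGBoost) leaves the feature-generation stage unchanged, which is presumably why the results are reported per feature generator and per classifier.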
Preprocessing
• Data sources
• Call center summaries
• Stored in Amazon Simple Storage Service (Amazon S3)
• Preprocessing
• Lemmatization with Natural Language Toolkit (NLTK)
• BlazingText with Amazon SageMaker
Using BlazingText reduced the preprocessing time by 10x
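As a rough sketch of what preparing call center summaries for the BlazingText classifier could look like: in supervised mode BlazingText reads one example per line, with the label written as a `__label__` prefix followed by space-separated tokens. The tokenizer below is a deliberately simple stand-in, and the lemmatization step from the slide (for example NLTK's `WordNetLemmatizer`) is left out so the block runs with the standard library alone. The function names, label names, and sample sentences are assumptions, not material from the talk.

```python
import re


def tokenize(text):
    """Lowercase the text and keep only alphanumeric tokens (simplified stand-in)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def to_blazingtext_line(text, label):
    """Format one example as '__label__<label> token token ...'."""
    return "__label__{} {}".format(label, " ".join(tokenize(text)))


# Hypothetical summaries and labels, invented for illustration.
examples = [
    ("Patient reported severe headache after second dose.", "adverse_event"),
    ("Caller asked about refill schedule.", "no_adverse_event"),
]

lines = [to_blazingtext_line(text, label) for text, label in examples]
for line in lines:
    print(line)
```

The resulting lines would be written to a text file and uploaded to Amazon S3 as the training channel.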
Amazon SageMaker tooling
• TensorFlow and Keras
• “Bring your own model”
• Convolutional neural network
• Built-in algorithms
• Automatic model tuning
• Spinning out many jobs simultaneously
• Amazon CloudWatch and TensorBoard
• Monitoring instances and accuracy metrics
Architecture
[Architecture diagram. Labels: AWS Cloud; AWS Region; VPC; Availability zone 1 and Availability zone 2, each with a private subnet; Training and Deployment (in each zone); Auto Scaling group and Endpoint (one per zone); Raw data and model artifacts; Production data.]
Results by algorithm
Feature generator | Classifier | Accuracy | AUC | False positive rate | False negative rate | Precision | Recall | Sensitivity | Specificity
LDA (Latent Dirichlet Allocation) | kNN | 0.775 | 0.767 | 0.182 | 0.288 | 0.729 | 0.712 | 0.712 | 0.818
LDA (Latent Dirichlet Allocation) | Logistic regression | 0.728 | 0.787 | 0.277 | 0.257 | 0.485 | 0.743 | 0.743 | 0.723
LDA (Latent Dirichlet Allocation) | XGBoost | 0.812 | 0.905 | 0.152 | 0.240 | 0.774 | 0.759 | 0.759 | 0.848
Comprehend topic modeling | kNN | 0.759 | 0.718 | 0.254 | 0.189 | 0.516 | 0.811 | 0.811 | 0.742
Comprehend topic modeling | Logistic regression | 0.516 | 0.892 | 0.395 | 0.602 | 0.433 | 0.398 | 0.398 | 0.605
Comprehend topic modeling | XGBoost | 0.855 | 0.936 | 0.069 | 0.230 | 0.908 | 0.769 | 0.769 | 0.931
Amazon SageMaker BlazingText | BlazingText Classifier | 0.979 | 0.997 | 0.023 | 0.020 | 0.980 | 0.985 | 0.985 | 0.970
Amazon SageMaker Deep CNN | | 0.978 | 0.998 | 0.021 | 0.020 | 0.978 | 0.982 | 0.982 | 0.972
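For readers less familiar with the column names, the sketch below shows how the threshold-based metrics in the results table are conventionally derived from a binary confusion matrix. It is a generic illustration under the standard definitions, not the speakers' evaluation code; the counts are made up, and AUC is omitted because it needs the ranked scores rather than a single confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # recall and sensitivity are the same quantity
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts, invented for illustration.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions recall equals sensitivity, the false negative rate is 1 minus recall, and the false positive rate is 1 minus specificity, which is why the Recall and Sensitivity columns carry identical values.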
BlazingText embeddings overview
• Plot the top 5000 most common terms
• Terms overlap with semantically similar terms
• Models leverage these semantics for computation and performance
• Will look at terms in two sections of the word embedding space
BlazingText embeddings: Zoomed in – Part 1
• Model has learned important familial and patient relationships, including caregivers and reporters
• Robust to typos: Patient, Pateint, Pt
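One common way to check that terms such as "patient", "pateint", and "pt" sit close together in an embedding space is to rank neighbors by cosine similarity. The sketch below assumes tiny three-dimensional vectors invented for the example; real BlazingText vectors have many more dimensions and would be loaded from the trained model's output. The helper names `cosine` and `nearest_terms` are likewise hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest_terms(term, vectors, top_n=2):
    """Return the top_n (term, similarity) pairs most similar to the given term."""
    query = vectors[term]
    scored = [(other, cosine(query, vec))
              for other, vec in vectors.items() if other != term]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy vectors, invented for illustration only.
vectors = {
    "patient": [0.90, 0.10, 0.05],
    "pateint": [0.88, 0.12, 0.07],
    "pt": [0.85, 0.15, 0.02],
    "nausea": [0.05, 0.20, 0.95],
}

neighbors = nearest_terms("patient", vectors)
print([name for name, _ in neighbors])
```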
BlazingText embeddings: Zoomed in – Part 2
• Model has learned important side effects and adverse drug reactions
• Types of reactions are even clustered
Cost
What does it cost to run this model?
Service | Resources used | Pricing dimension | Cost
Amazon S3 | 50 GB for one month | $0.023 per GB-month | $1.15
Amazon EFS | Storage | | $3
| | | $1.3
| | $0.0714 per instance-minute | $8.55
| | $0.021 per instance-minute | $0.084
Total | | | $14.08 ($0.11 per 1000 predictions)*
Amazon SageMaker on-demand ML instances let you pay for machine learning compute capacity by the second, with a one-minute minimum and no long-term commitments.
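The cost figures can be checked with simple arithmetic. The sketch below uses only the numbers that appear on the slide; two rows carry no service name in the surviving text, so they are labeled as unlabeled rather than guessed. The implied instance-minutes and the implied prediction count are back-calculations from the slide's figures, not values stated by the speakers.

```python
# (description, cost in USD) as they appear on the cost slide.
line_items = [
    ("Amazon S3, 50 GB for one month", 1.15),
    ("Amazon EFS storage", 3.00),
    ("unlabeled row", 1.30),
    ("unlabeled row at $0.0714 per instance-minute", 8.55),
    ("unlabeled row at $0.021 per instance-minute", 0.084),
]

total = sum(cost for _, cost in line_items)

# Cross-check the one row whose inputs are fully stated.
s3_cost = 50 * 0.023

# Back-calculated quantities (assumptions derived from the slide's numbers).
implied_minutes_a = 8.55 / 0.0714
implied_minutes_b = 0.084 / 0.021
implied_predictions = round(total, 2) / 0.11 * 1000

print(f"total: ${total:.2f}")
print(f"S3 check: ${s3_cost:.2f}")
print(f"implied instance-minutes: {implied_minutes_a:.1f} and {implied_minutes_b:.1f}")
print(f"implied predictions: about {implied_predictions:,.0f}")
```

The line items sum to the slide's stated total of $14.08, and the quoted $0.11 per 1000 predictions corresponds to roughly 128,000 predictions.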
To learn more…
• Amazon SageMaker here
• Blogs:
• Enhanced text classification and word vectors using Amazon SageMaker BlazingText
• https://tinyurl.com/sagemaker-blazingtext
• Bring your own pre-trained MXNet or TensorFlow models into Amazon SageMaker
• https://tinyurl.com/sagemaker-byom
Questions?
Thank you!
NLP in Healthcare to Predict Adverse Events with Amazon SageMaker (AIM346) - AWS re:Invent 2018
