Transforming Insurance Analytics with Big Data and Automated Machine Learning
A formula for higher ROI
Agenda
• Unlocking the Value of Insurance Data
  Mihaela Risca, Sr. Solutions Marketing Manager, Financial Services, Cloudera
• Automated Machine Learning – A Formula for Higher ROI for Insurers
  Satadru Sengupta, Gen. Mgr. Insurance, DataRobot
Data is at the center of the Insurance market
There are two different alignments of these components in the market:
• When data and analytics capability is bundled with capital, we have an insurance company.
• When it is bundled with demand, we have an advisor or broker.
Explosion of Data
Why Machine Learning?
• Analytics return $13 for every $1 invested (Nucleus Research)
• Only 12% of data is leveraged for analytics (Forrester)
What is Machine Learning?
Why Big Data + Machine Learning?
• Machine learning thrives on growing data sets
• Bring disparate data sources together
• Real-time streaming
Machine Learning Use Cases in Insurance
• Customer Acquisition: Marketing, customer retention, prioritization.
• Pricing: Equating risk and price, driving lifetime value (LTV).
• Underwriting: Underwriting triage: select the top 10% of the available risk for further analysis.
• Prevent Claim Fraud: Identifying claims with the highest likelihood of being fraudulent.
Poll the Audience
Where in your organization do you see the most value in introducing machine learning?
1. Customer acquisition and retention
2. Underwriting/Actuarial
3. Quoting/Claims management
4. Fraud detection and prevention
5. Other
Key Data Management Challenges for Insurers
• Fragmented Systems and Data Silos
• Limited Access to Right Data at the Right Time
• Strategic Decisions Based on Subsets of Data
• Unable to Tap into New Data Sources or Correlate Data from Multiple Sources Simultaneously
• Disparate View of Customers, Markets and Risks
• Poor Data Quality and Lack of Governance
One Data Platform for Many Applications
• Handle real-time data ingest from diverse sources
• Governance and Security
• Deployment Flexibility
• Machine Learning Capabilities
• Diverse Analytical Options
• Combine Data from Different Sources
• Scale easily & cost-effectively
• Batch or Real-time Data Streams
[Diagram labels: Data Sources; Fitness; Car Telematics; Applications; Other; Data Streams; Data Ingest; Data Mgmt. Hub; Data Storage & Processing; Reporting, Analytics & Auditing; Data Governance (Data Lineage, Data Protection)]
"New technology is transforming the
way we work, and it is allowing the
competition to do better than what we
can. The strange thing is we know the
urgency, and yet there is inertia."
Inga Beale, CEO of Lloyd's of London
February 2017
Three Strategic Areas of Focus
1. Technology
2. Consumer & Market Economics
3. Data Science & Machine Learning
… and they are interconnected.
Machine Learning Applications in Insurance
1. Risk Selection & Pricing
2. Claims, Fraud and Litigation Management
3. Operations and Expenses Management
“machine learning is the secret sauce for the product of
tomorrow.” Google, 2015
Profitable Growth & Managing Expenses
Becoming a 21st Century Insurance Company
Life Insurance Example 1
Underwriting Triage
• Predicted low risk to fast track
process
• Predicted high risk to traditional
underwriting for manual review
Business Impact
• Cost reduction through automation of
reviews of applicants
• Increased likelihood of acquisition
due to fast track underwriting
• Higher underwriting profitability by
targeting the review process on
underwriting loss avoidance
Specific examples from clients
• Predict the likelihood of an insured being in a preferred class or not, as determined by risk factors such as smoking status, existing conditions, and terminal disease
• Predict the most likely class among several classes
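The triage rule on this slide (predicted low risk to the fast track, predicted high risk to manual review) can be sketched in a few lines. This is only an illustration: the `Applicant` class, the `predicted_risk` score, and the 0.2 threshold are assumptions made for the example, not values from the presentation or from any DataRobot API.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    applicant_id: str
    predicted_risk: float  # hypothetical model output in [0, 1]


def triage(applicants, fast_track_threshold=0.2):
    """Route each applicant by predicted risk.

    Scores below the (assumed) threshold go to the fast track;
    all others go to traditional underwriting for manual review.
    """
    fast_track, manual_review = [], []
    for applicant in applicants:
        if applicant.predicted_risk < fast_track_threshold:
            fast_track.append(applicant.applicant_id)
        else:
            manual_review.append(applicant.applicant_id)
    return fast_track, manual_review


# Made-up applicants purely for illustration.
applicants = [
    Applicant("A-001", 0.05),
    Applicant("A-002", 0.45),
    Applicant("A-003", 0.12),
    Applicant("A-004", 0.80),
]
fast_track, manual_review = triage(applicants)
print("Fast track:", fast_track)
print("Manual review:", manual_review)
```

In practice the threshold would likely be chosen from the business trade-off between review cost and underwriting loss, which is why it is left as a parameter here.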
Life Insurance Example 2
Predict mortality risks among patients in remission of cancer:
○ Simplify Underwriting Process: Patients with good health prospects don’t need to go through manual medical verification, and adverse selection is avoided
○ Reduce Cost of Claims by identifying high-risk patients and creating more accurate underwriting rules
ML model identifies patients with a very high risk of mortality
● 5 times more risky than average
● Around 10% of patients
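A statement such as "the top 10% of patients are 5 times more risky than average" is usually a lift figure: the event rate in the highest-scored group divided by the overall event rate. The sketch below shows one way to compute it, assuming that reading of the slide. The function name and the data are hypothetical; the synthetic data is built so that the top 10% has five times the average rate, and it is not the client data behind the slide.

```python
def top_fraction_lift(scores, outcomes, fraction=0.10):
    """Event rate in the highest-scored fraction divided by the overall event rate."""
    if not scores or len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must be non-empty and the same length")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    overall_rate = sum(outcomes) / len(outcomes)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no events")
    return top_rate / overall_rate


# Synthetic data: 20 patients, 4 events overall (20% average rate).
# The two highest-scored patients (the top 10%) both have the event.
scores = [0.95, 0.90] + [0.50 - 0.01 * i for i in range(18)]
outcomes = [1, 1] + [1] + [0] * 8 + [1] + [0] * 8

lift = top_fraction_lift(scores, outcomes)
print(f"Top 10% lift over average: {lift:.1f}x")
```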
… InsurTech and Future of Insurance
Machine Learning Strategy: Where Is It Failing?
• A lack of data vision
• Hiring and retaining good data scientists is impossible
• Lack of Inclusiveness: Targeted end-users are not included in
the machine learning problem solving process.
HBR Article : “Stop searching for that elusive Data Scientist”
New Technology Opens Up New Possibilities To Executives
Artificial Intelligence & Automation make Machine Learning Affordable, Pervasive and Inclusive
Poll the Audience
How do you primarily develop and deploy machine learning solutions
in your organization today?
1. Multiple, small data science teams
2. One, big enterprise data science team
3. Outsource to consulting
4. We use automated machine learning
5. We currently don’t use machine learning
Elements of Automated Machine Learning
Smart
● Accurate
● Appeal to experienced data scientists
● Control buttons are accessible to the users
Easy to Use
● Intuitive, fully automated workflow
● Needs minimum inputs but has guardrails
● Interpretable & transparent
● Deployment focused
Demo: A 10-minute journey to Automated Machine Learning (AML) using the DataRobot Platform
Can we predict which patients will come back to the hospital within the first 30 days?
What capabilities does DataRobot offer on Cloudera?
• HDFS ingest: DR can utilize data stored in HDFS directly
• Hadoop modeling: Train ML models on the Cloudera data nodes directly
• Hadoop scoring: Any model can then be deployed on Hadoop directly
  ○ Distributed (each node scores a data split)
  ○ Uses Spark
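The "each node scores a data split" idea can be illustrated without a cluster. The sketch below is not Spark and not DataRobot's implementation: local threads stand in for data nodes, and `score_row` is a hypothetical placeholder for a trained model. It only shows the split, score, and merge pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, n_splits):
    """Partition rows into n_splits contiguous chunks of near-equal size."""
    size, extra = divmod(len(rows), n_splits)
    chunks, start = [], 0
    for i in range(n_splits):
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def score_row(row):
    """Hypothetical stand-in for a trained model's prediction."""
    return round(0.1 * row["prior_claims"] + 0.01 * row["age"], 4)


def score_split(chunk):
    """Score one data split, as a single node would."""
    return [score_row(row) for row in chunk]


def distributed_score(rows, n_workers=3):
    """Score every split in parallel and merge results in the original order."""
    chunks = split_rows(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(score_split, chunks)  # map preserves chunk order
    return [score for part in results for score in part]


# Made-up rows purely for illustration.
rows = [{"age": 30 + i, "prior_claims": i % 3} for i in range(10)]
scores = distributed_score(rows)
print(scores)
```

Because each split is scored independently, the merged result matches what a single pass over all rows would produce, which is what lets this pattern scale out across nodes.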
Cloudera/DataRobot Integration Details
DataRobot has the highest level of integration with Cloudera
• Cloudera Parcels: A few clicks to install DR in Cloudera Manager
• Cloudera CSDs: Can use all the functionalities of Cloudera Manager (monitoring, resource mgmt…)
• Kerberos / Sentry: Secured authentication
• YARN: All the resources consumed by DataRobot are managed by YARN
• Spark: DataRobot uses Spark for Hadoop scoring
Apache Spark Ecosystem with Spark MLlib
The Spark MLlib API is available in the Scala, Java, and Python programming languages
Training from Cloudera and DataRobot
● Introduction to Machine Learning - Cloudera Training
https://www.cloudera.com/more/training/courses/intro-machine-learning.html
● Data Science for Executives - DataRobot Training
https://www.datarobot.com/education/for-executives/
● Machine Learning with DataRobot - DataRobot Training
https://www.datarobot.com/education/for-business-analysts/
Learn More & Contact Us
https://www.cloudera.com/solutions/insurance.html
Cloudera
Follow us: @Cloudera
mihaela@cloudera.com
Taneja Group Spark Market Adoption Report : LINK
DataRobot Overview: LINK
https://www.datarobot.com/go/insurance/
Follow us: @DataRobot
satadru@datarobot.com
DataRobot
Executive Briefing: LINK
The Machine Learning Renaissance: LINK
Register for Wrangle Conference: July 20, San Francisco
http://wrangleconf.com/
Thank you
Appendices
Some screenshots
Cloudera - DataRobot Integration
DataRobot - Ease of Deployment on Cloudera
● Deployment
● Mgmt/Monitoring
The DataRobot Service on Cloudera
DataRobot – HDFS Ingest
Copyright © DataRobot, Inc. - All Rights Reserved
DataRobot Modeling on Hadoop
[Diagram labels: DR Edge Node; Application; Storage; Worker 1, Worker 2, Worker 3, …; Hadoop Data Node 1, Hadoop Data Node 2; three 60GB YARN containers, one each for Worker 1, Worker 2, and Worker 3]
• YARN allocates memory on a data node when a worker wants to train a model
• Each model is trained in memory on an available data node
DataRobot – Cloudera “in-place” Scoring
DataRobot & Cloudera – Seamless LDAP Authentication
