Deploying Enterprise Scale Deep Learning in Actuarial Modeling
Krish Rajaram & Bryn Clarke, Nationwide Insurance
#UnifiedAnalytics #SparkAISummit
Agenda
• About Nationwide
• About Enterprise Data Office
• Nationwide’s journey with Databricks
• Use case deep dive
• #1 in 457 retirement plans, based on number of plans (PLANSPONSOR, 2017 Recordkeeping Survey)
• #1 total small business insurer (Conning, 2014; Conning Strategic Study: The Small Business Sector for Property-Casualty Insurance: Market Shift Coming)
• #1 writer of farms and ranches (A.M. Best, 2016 DWP)
• 8th largest auto insurer (A.M. Best, 2016 DWP)
• 2nd largest domestic specialty (Excess & Surplus) commercial lines insurer (A.M. Best, 2016 DWP)
• Nationwide is committing more than $100 million of venture capital to invent and reinvent customer-centric solutions.
• #9 provider of defined contribution retirement plans (PLANSPONSOR, 2017 Recordkeeping Survey)
• 7th largest homeowners insurer (A.M. Best, 2016 DWP)
• #1 pet insurer (North American Pet Health Insurance Assn., 2016)
• 7th largest writer of variable annuities (Morningstar, YE 2016, based on total flows)
• #1 writer of corporate life (IBIS Associates, Inc., February 2018)
• 7th largest commercial lines insurer (A.M. Best, 2016 DWP)
• 8th largest life insurer (LIMRA, YE 2016, based on total premiums)
Nationwide Ranks with the Best
FORTUNE 100 Best Companies to Work For
Black Enterprise 50 Best Companies for Diversity
2018 Catalyst Award honoree
Human Rights Campaign Best Place to Work for LGBTQ Equality
$49 billion in total sales/direct written premium
$26.9 billion in net operating revenue
$1.2 billion in net operating income
$225.5 billion in total assets
A+ A.M. Best (received 10/17/2002, affirmed 10/2/2017)
A+ Standard & Poor’s (received 12/22/2008, affirmed 5/24/2017)
A1 Moody’s (received 3/10/2009, affirmed 11/7/2017)
Fortune 100 Company
Lines of Business
FINANCIAL SERVICES
• Individual Life
• Annuities
• Retirement Plans
• Corporate Life
• Mutual Funds
• Banking
COMMERCIAL LINES
• Standard Commercial
• Farm and Ranch
• Commercial Agribusiness
• Excess and Surplus/Specialty
PERSONAL LINES
• Standard Auto
• Homeowners and Renters
• Pet
• Sport Vehicles
• Personal Liability
Enterprise Data Office
Purpose: Give data a voice
Mission: The EDO is dedicated to empowering the business of Nationwide by delivering trusted solutions through complete data & analytics services.
Chief Data Officer (CDO)
• Data Advisory Services: manages relationships with our IT and business partners
• Data Governance and Quality Assurance: oversees and optimizes data integrity, availability, usability, and trustworthiness
• Data Architecture and Strategy: owns the One Nationwide data strategy, enabling the Enterprise’s ability to leverage data as a competitive asset
• Data Management: manages Enterprise data, allowing for insights into business activities and enabling achievement of business goals
• Data Analytics and Decision Sciences: deploys data and analytics tools and processes to solve complex business problems
Analytical Lab
IT Architecture team identified the growing need for a minimally governed, self-provisioned, scalable environment for conducting analytical experiments.
[Diagram: Databricks Notebook workflow. Labels: SSO; Start/Create Clusters; Read Data; EDA/Data Prep; Feature Engineer; Model training; Validate/re-train; Visualize; Communicate/Export model; Terminate the cluster]
Databricks deployment at Nationwide
[Architecture diagram. Labels: Control Plane; Web Frontend w/ SSO; Support (Access Genie); Data Plane; AWS Account; Data in S3 buckets; AWS Data sources (RedShift, RDS, DynamoDB, etc.); On Prem Data sources (Hadoop, SQL databases); Users (admin, Data Scientist, Engineers); VGW; MPLS WAN; Business Partner Extranet; O/B Access (download packages, 3rd party datasets); Databricks CLI]
Databricks adoption at Nationwide
Personas: Information worker, Data Analyst, Data Engineer, Data Scientist
Tools in use: R & R Studio; Python & Jupyter; SAS / SAS Grid; IBM SPSS; H2O, DriverlessAI; TensorFlow; Hadoop/Hive/Spark/Zeppelin; SQL; Excel; Access; Tableau; Paxata
[Chart: four columns, each showing Databricks adoption (Experiment, Dev, Prod; "Not applicable" in one column), Data Sources (Well Known, Variable), Tools (Standard, Emerging, Specialized), and Number of tools (1-3, 4-5, 6+). Use Cases legend: Efficiency gain, De-risk, Revenue Generating. The values 51 and 2 appear without recoverable labels.]
Enterprise Analytics Office
We deliver wisdom in data by interacting with partners to translate problems into analytical solutions.
Utilizing methodologies to accelerate decision-making...
Areas: Statistical Modeling; AI & Machine Learning; Experimental Design; Modeling & Forecasting
• Ensembled Machine Learning
• Traditional Statistical Learning
• Deep Learning
• Time Series Forecasting
• Text & Speech Analytics
• GPU Acceleration
• Recommender
• Regression
• Bayesian Hierarchical Modeling
• Segmentation Modeling
• Survivor Modeling
• Model-as-Service
...by leveraging cutting edge data & technology
Data
• Nationwide internal data
• Social
• Demographic
• Geographic
• Financial
• Macro-economic
Technology
• R
• Python
• H2O
• Tableau
• Java
• SPSS Modeler
• TensorFlow
What is the tangible benefit of our data product solutions?
• Tailored support combining business knowledge & statistical expertise
• Easy understanding of the data for instant & actionable usability
• Automated & seamless access that integrates with your processes
• Scalable utility solving advanced analytical problems across domains
Focus Use Case
• Predict insurance claims frequency and
severity (average cost of claims)
• Large dataset (100s of millions of records)
• Volatile data
– Insurance claims are infrequent
– Most often arise due to chance
Traditional Approach
• Batch (1-5 years) of data aggregated across
linear predictors (state, vehicle model year,
driver age, etc.)
• Trained actuary fits a Generalized Linear Model
(GLM) to determine slope/intercept for each
linear predictor
• Result is a multiplicative “rating plan”
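A log-link GLM produces a multiplicative rating plan because exponentiating a sum of coefficients gives a product of per-variable factors. The minimal sketch below shows that equivalence. The variable names, levels, and coefficient values are invented for illustration and do not come from any Nationwide model.

```python
import math

# Hypothetical log-link GLM coefficients (illustrative values only).
INTERCEPT = math.log(0.05)  # base claim frequency of 5% per policy-year
COEFFICIENTS = {
    "state": {"OH": 0.0, "FL": 0.25},
    "driver_age_band": {"16-24": 0.60, "25-64": 0.0, "65+": 0.15},
    "vehicle_age_band": {"0-3": 0.10, "4-10": 0.0, "11+": -0.05},
}


def linear_predictor(policy):
    """Intercept plus the coefficient of each rating variable's level."""
    return INTERCEPT + sum(COEFFICIENTS[var][policy[var]] for var in COEFFICIENTS)


def rating_plan_prediction(policy):
    """Same prediction written as a base rate times one relativity per variable."""
    prediction = math.exp(INTERCEPT)
    for var in COEFFICIENTS:
        prediction *= math.exp(COEFFICIENTS[var][policy[var]])
    return prediction


policy = {"state": "FL", "driver_age_band": "16-24", "vehicle_age_band": "0-3"}
additive = math.exp(linear_predictor(policy))      # exp(sum of coefficients)
multiplicative = rating_plan_prediction(policy)    # product of relativities
```

Both expressions give the same number, which is why each fitted coefficient can be published as a standalone rating factor.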
Novel Approach
• Deep learning (hierarchical neural network)
• Adequately models non-linearity of latent
variables
• Multiple heads
– Frequency & Severity
– Coverage Type & Cause of Loss
• Compare to traditional GLM
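The multi-head idea can be sketched as a shared trunk feeding separate frequency and severity outputs. The deck does not give the actual architecture, so everything below is an assumption: the layer sizes, the single embedded feature, and the random weights are placeholders, and the Coverage Type and Cause of Loss heads are omitted. Plain Python is used here so the sketch runs without TensorFlow.

```python
import math
import random

random.seed(0)


def dense(n_in, n_out):
    """Randomly initialised weight matrix and zero bias (illustrative only)."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def apply_dense(x, layer, activation=None):
    """Compute x @ W + b, optionally followed by an element-wise activation."""
    weights, bias = layer
    out = [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
           for j in range(len(bias))]
    return [activation(v) for v in out] if activation else out


def relu(v):
    return max(0.0, v)


# Hypothetical categorical feature: one learned embedding vector per state.
EMBED_DIM = 3
state_embedding = {s: [random.gauss(0.0, 0.1) for _ in range(EMBED_DIM)]
                   for s in ("OH", "FL", "TX")}

trunk = dense(EMBED_DIM + 2, 8)   # shared hidden layer
frequency_head = dense(8, 1)      # expected claim count
severity_head = dense(8, 1)       # expected cost per claim


def predict(state, driver_age, vehicle_age):
    x = state_embedding[state] + [driver_age / 100.0, vehicle_age / 20.0]
    hidden = apply_dense(x, trunk, relu)
    # exp keeps both outputs strictly positive, as Poisson and Gamma losses require
    frequency = math.exp(apply_dense(hidden, frequency_head)[0])
    severity = math.exp(apply_dense(hidden, severity_head)[0])
    return frequency, severity


frequency, severity = predict("FL", 30, 5)
```

Because both heads read the same hidden representation, the nonlinear interactions learned in the trunk are shared across the two targets rather than fitted twice.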
Performance Evaluation
• Custom loss functions
– Poisson and Gamma negative log-likelihood
• Custom metric functions
– Normalized Gini index / AUC
• Online monitoring using TensorBoard
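The loss and metric functions named above can be written out in a few lines. This is a sketch under stated assumptions, not the authors' code: constant terms that do not depend on the prediction are dropped, the Gamma loss assumes a fixed shape parameter of 1, and the function names are hypothetical. For a binary target, normalized Gini equals 2 × AUC − 1, which is why the two metrics are often quoted together.

```python
import math


def poisson_nll(y_true, mu):
    """Mean Poisson negative log-likelihood, omitting the log(y!) constant."""
    return sum(m - y * math.log(m) for y, m in zip(y_true, mu)) / len(y_true)


def gamma_nll(y_true, mu):
    """Mean Gamma negative log-likelihood with shape fixed at 1, up to constants."""
    return sum(y / m + math.log(m) for y, m in zip(y_true, mu)) / len(y_true)


def gini(actual, predicted):
    """Gini index of `actual` when records are ranked by `predicted`, descending."""
    n = len(actual)
    order = sorted(range(n), key=lambda i: (-predicted[i], i))
    total = sum(actual)
    cumulative, gini_sum = 0.0, 0.0
    for rank, i in enumerate(order, start=1):
        cumulative += actual[i]
        gini_sum += cumulative / total - rank / n
    return gini_sum / n


def normalized_gini(actual, predicted):
    """Gini of the model divided by the Gini of a perfect ranking."""
    return gini(actual, predicted) / gini(actual, actual)


claims = [0, 0, 1, 0, 3]
good_scores = [0.10, 0.20, 0.50, 0.15, 0.90]   # ranks the claimants first
bad_scores = [0.90, 0.80, 0.20, 0.70, 0.10]    # ranks the claimants last
good_gini = normalized_gini(claims, good_scores)
bad_gini = normalized_gini(claims, bad_scores)
```

Normalized Gini depends only on the ordering of predictions, so it complements the likelihood-based losses, which are sensitive to the predicted magnitudes.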
Model Search Space is Vast
• Size and number of layers
• Embedding dimensionality
• Activation functions (ReLU, tanh, linear)
• Regularization (L1/L2, dropout)
• Many others (autoencoder, combining levels of
prediction, skip connections, etc.)
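To give a sense of how quickly the search space grows, the sketch below enumerates a small grid over a few of the choices listed above. The specific values are assumptions chosen for illustration; the deck does not state which values were searched.

```python
import itertools

# Hypothetical search space; each key is one of the design choices listed above.
search_space = {
    "hidden_layers": [(64,), (128, 64), (256, 128, 64)],
    "embedding_dim": [4, 8, 16],
    "activation": ["relu", "tanh", "linear"],
    "l2": [0.0, 1e-4],
    "dropout": [0.0, 0.2, 0.5],
}


def enumerate_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(space)
    for values in itertools.product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(enumerate_configs(search_space))
print(len(configs))  # 3 * 3 * 3 * 2 * 3 = 162 combinations
```

Even this modest grid yields 162 configurations, and each one requires a full training run to evaluate.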
Why Spark?
• Many aspects of preprocessing are
embarrassingly parallel
– Conversion between data formats (SAS, CSV,
Parquet, TFRecords)
– Encoding of category labels
• Scoring is also embarrassingly parallel
• Primary limitation is hyperparameter/model
configuration search
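Category label encoding is embarrassingly parallel because, once the vocabulary is known, each partition can be encoded without reference to any other. The sketch below illustrates this with a thread pool standing in for Spark executors. The column names and rows are made up, and a real pipeline would express the same steps with Spark DataFrames rather than Python lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up partitions of policy records; in Spark these would be DataFrame partitions.
partitions = [
    [{"state": "OH", "coverage": "collision"}, {"state": "FL", "coverage": "liability"}],
    [{"state": "TX", "coverage": "collision"}],
    [{"state": "FL", "coverage": "comprehensive"}, {"state": "OH", "coverage": "liability"}],
]


def build_vocabulary(parts, column):
    """Map each distinct label in a column to a stable integer index."""
    labels = sorted({row[column] for part in parts for row in part})
    return {label: index for index, label in enumerate(labels)}


vocab = {col: build_vocabulary(partitions, col) for col in ("state", "coverage")}


def encode_partition(partition):
    """Encode one partition; needs only the shared vocabulary, no other partition."""
    return [{col: vocab[col][row[col]] for col in vocab} for row in partition]


with ThreadPoolExecutor(max_workers=3) as pool:
    encoded = list(pool.map(encode_partition, partitions))
```

The only global step is building the vocabulary; after that, adding workers shortens the encoding time roughly in proportion to the number of partitions processed at once.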
Why Spark?
[Diagram: data format conversions across three environments (LOCAL, BATCHES, DISTRIBUTED). Labels: sas7bdat; csv; parquet; spark-sas7bdat; pandas; memmap (NumPy); TFRecords; featurized memmap (Python); featurized parquet; Spark SQL; PySpark]
Benchmark Timings
Stage                 Local Workstation    Spark
CSV Conversion        10-12 hrs            < 5 mins
Random Shuffling      ~ 8 hrs              < 5 mins
Featurization         ~ 5 hrs              20 mins
TFRecords Examples    ~ 5 hrs              < 5 mins
Model Training        ~ 6 hrs              ~ 3 hrs (single node*)
Model Scoring         ~ 3 hrs              < 5 mins
* Utilizing Spark we are able to test many model configurations concurrently. In a local workstation environment, each configuration needs to be tested consecutively.
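The end-to-end totals implied by these timings can be checked with simple arithmetic. The sketch below takes the lower end of each local estimate and treats every "< 5 mins" Spark entry as a full 5 minutes; both choices are assumptions about how to read the approximate figures.

```python
# (stage, local hours at the low end, Spark minutes at the upper bound)
timings = [
    ("CSV Conversion", 10, 5),
    ("Random Shuffling", 8, 5),
    ("Featurization", 5, 20),
    ("TFRecords Examples", 5, 5),
    ("Model Training", 6, 180),
    ("Model Scoring", 3, 5),
]

local_hours = sum(hours for _, hours, _ in timings)
spark_hours = sum(minutes for _, _, minutes in timings) / 60

# Per-stage speedup: local minutes divided by Spark minutes.
speedups = {stage: hours * 60 / minutes for stage, hours, minutes in timings}
```

Under these readings the Spark pipeline totals about 3.7 hours, which agrees with the "less than 4 hrs" figure in the conclusion. The local total comes to 37 hours, slightly above the ~34 hours quoted there; the deck does not say which stages or estimates that figure includes. Model training shows the smallest gain (about 2x on a single node), so it dominates the remaining runtime.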
Lessons Learned
• Loading/exporting data
• Conversion of the model from Keras to TensorFlow
• Initializing TensorFlow models on individual nodes
• Using goofys mounts
• Syncing with DBFS to store model checkpoints
• Utilizing Databricks Jobs/notebook parameters
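One common way to handle model initialization on individual nodes is to load the model lazily and cache it per worker process, so the cost is paid once rather than once per batch. The deck does not describe how the authors solved this, so the sketch below is an assumption. `load_model` is a stand-in for an expensive load, such as restoring a TensorFlow graph, and the path is hypothetical.

```python
import functools
import os


def load_model(path):
    """Stand-in for an expensive model load; counts how often it runs."""
    load_model.calls += 1
    return {"path": path, "pid": os.getpid()}


load_model.calls = 0


@functools.lru_cache(maxsize=1)
def get_model(path):
    """Load the model at most once per worker process, then reuse it."""
    return load_model(path)


def score_partition(rows, model_path="/dbfs/models/example"):  # hypothetical path
    model = get_model(model_path)  # cheap after the first call in this process
    return [(row, model["path"]) for row in rows]


results = [score_partition(batch) for batch in ([1, 2], [3], [4, 5, 6])]
```

Three batches are scored here but the model is loaded only once, which is the behavior wanted when a worker processes many partitions in sequence.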
Conclusion & Next Steps
• Utilizing Spark on Databricks with ML runtime,
we reduced the modeling pipeline timings from
~34 hours to less than 4 hrs
• Further opportunity exists in utilizing Horovod for
multi-GPU training
– Reduce time needed to evaluate individual model
configurations
General Observation
• Using Databricks ML Runtime, Notebooks, scalable
compute instances and scheduling features we were
able to rapidly prototype the methodology.
• Path to production and integration with the current
model deployment framework are still in progress.
• Challenging to predict DBU consumption by different
business units; hence difficult to forecast cost.
• No automatic integration with GitHub Enterprise.
Questions?

Mais conteúdo relacionado

Mais procurados

Retail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett
Retail Analytics and BI with Looker, BigQuery, GCP & Leigha JarettRetail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett
Retail Analytics and BI with Looker, BigQuery, GCP & Leigha JarettDaniel Zivkovic
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks DeltaDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Design Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data OrganizationsDesign Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data OrganizationsDenodo
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemJames Serra
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 
Scaling Data Quality @ Netflix
Scaling Data Quality @ NetflixScaling Data Quality @ Netflix
Scaling Data Quality @ NetflixMichelle Ufford
 
[AWS Migration Workshop] 데이터센터의 SAP를 AWS로 마이그레이션 하기
[AWS Migration Workshop]  데이터센터의 SAP를 AWS로 마이그레이션 하기[AWS Migration Workshop]  데이터센터의 SAP를 AWS로 마이그레이션 하기
[AWS Migration Workshop] 데이터센터의 SAP를 AWS로 마이그레이션 하기Amazon Web Services Korea
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshConfluentInc1
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
21- Self-Hosted Integration Runtime in Azure Data Factory.pptx
21- Self-Hosted Integration Runtime in Azure Data Factory.pptx21- Self-Hosted Integration Runtime in Azure Data Factory.pptx
21- Self-Hosted Integration Runtime in Azure Data Factory.pptxBRIJESH KUMAR
 
Data Services Marketplace
Data Services MarketplaceData Services Marketplace
Data Services MarketplaceDenodo
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 

Mais procurados (20)

Retail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett
Retail Analytics and BI with Looker, BigQuery, GCP & Leigha JarettRetail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett
Retail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Design Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data OrganizationsDesign Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data Organizations
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Apache Spark の紹介(前半:Sparkのキホン)
Apache Spark の紹介(前半:Sparkのキホン)Apache Spark の紹介(前半:Sparkのキホン)
Apache Spark の紹介(前半:Sparkのキホン)
 
Scaling Data Quality @ Netflix
Scaling Data Quality @ NetflixScaling Data Quality @ Netflix
Scaling Data Quality @ Netflix
 
Informatica Cloud Overview
Informatica Cloud OverviewInformatica Cloud Overview
Informatica Cloud Overview
 
Lakehouse in Azure
Lakehouse in AzureLakehouse in Azure
Lakehouse in Azure
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
[AWS Migration Workshop] 데이터센터의 SAP를 AWS로 마이그레이션 하기
[AWS Migration Workshop]  데이터센터의 SAP를 AWS로 마이그레이션 하기[AWS Migration Workshop]  데이터센터의 SAP를 AWS로 마이그레이션 하기
[AWS Migration Workshop] 데이터센터의 SAP를 AWS로 마이그레이션 하기
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data Mesh
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
21- Self-Hosted Integration Runtime in Azure Data Factory.pptx
21- Self-Hosted Integration Runtime in Azure Data Factory.pptx21- Self-Hosted Integration Runtime in Azure Data Factory.pptx
21- Self-Hosted Integration Runtime in Azure Data Factory.pptx
 
Data Services Marketplace
Data Services MarketplaceData Services Marketplace
Data Services Marketplace
 
Structured Streaming - The Internal -
Structured Streaming - The Internal -Structured Streaming - The Internal -
Structured Streaming - The Internal -
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 

Semelhante a Deploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide

ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityDATAVERSITY
 
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Denodo
 
Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...
Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...
Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...Databricks
 
Become More Data-driven by Leveraging Your SAP Data
Become More Data-driven by Leveraging Your SAP DataBecome More Data-driven by Leveraging Your SAP Data
Become More Data-driven by Leveraging Your SAP DataDenodo
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesDataWorks Summit
 
Webinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDBWebinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDBMongoDB
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...
엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...
엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...Amazon Web Services Korea
 
Virtual Sandbox for Data Scientists at Enterprise Scale
Virtual Sandbox for Data Scientists at Enterprise ScaleVirtual Sandbox for Data Scientists at Enterprise Scale
Virtual Sandbox for Data Scientists at Enterprise ScaleDenodo
 
Horses for Courses: Database Roundtable
Horses for Courses: Database RoundtableHorses for Courses: Database Roundtable
Horses for Courses: Database RoundtableEric Kavanagh
 
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Denodo
 
Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesDataWorks Summit
 
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkData-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkDatabricks
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)Vishal Bamba
 
Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
 Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
Big Data Fabric for At-Scale Real-Time Analysis by Edwin RobbinsData Con LA
 

Semelhante a Deploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide (20)

Vadlamudi saketh30 (ml)
Vadlamudi saketh30 (ml)Vadlamudi saketh30 (ml)
Vadlamudi saketh30 (ml)
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
 
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
Building a Single Logical Data Lake: For Advanced Analytics, Data Science, an...
 
Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...
Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...
Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...
 
Become More Data-driven by Leveraging Your SAP Data
Become More Data-driven by Leveraging Your SAP DataBecome More Data-driven by Leveraging Your SAP Data
Become More Data-driven by Leveraging Your SAP Data
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
 
Webinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDBWebinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDB
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...
엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...
엔터프라이즈의 AI/ML 활용을 돕는 Paxata 지능형 데이터 전처리 플랫폼 (최문규 이사, PAXATA) :: AWS Techforum...
 
Virtual Sandbox for Data Scientists at Enterprise Scale
Virtual Sandbox for Data Scientists at Enterprise ScaleVirtual Sandbox for Data Scientists at Enterprise Scale
Virtual Sandbox for Data Scientists at Enterprise Scale
 
Horses for Courses: Database Roundtable
Horses for Courses: Database RoundtableHorses for Courses: Database Roundtable
Horses for Courses: Database Roundtable
 
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
 
Insights into Real-world Data Management Challenges
Insights into Real-world Data Management ChallengesInsights into Real-world Data Management Challenges
Insights into Real-world Data Management Challenges
 
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkData-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Big Data Analyst at BankofAmerica
Big Data Analyst at BankofAmericaBig Data Analyst at BankofAmerica
Big Data Analyst at BankofAmerica
 
Meetup Spark UDF performance
Meetup Spark UDF performanceMeetup Spark UDF performance
Meetup Spark UDF performance
 
S&OP as a Service.pdf
S&OP as a Service.pdfS&OP as a Service.pdf
S&OP as a Service.pdf
 
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)Customer Intelligence_ Harnessing Elephants at Transamerica    Presentation (1)
Customer Intelligence_ Harnessing Elephants at Transamerica Presentation (1)
 
Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
 Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
 

Mais de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionDatabricks
 
Jeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityJeeves Grows Up: An AI Chatbot for Performance and Quality
Jeeves Grows Up: An AI Chatbot for Performance and QualityDatabricks
 

Mais de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Deploying Enterprise Scale Deep Learning in Actuarial Modeling at Nationwide

  • 2. Krish Rajaram & Bryn Clarke, Nationwide Insurance Deploying Enterprise Scale Deep Learning in Actuarial Modeling #UnifiedAnalytics #SparkAISummit
  • 3. Agenda
      • About Nationwide
      • About Enterprise Data Office
      • Nationwide’s journey with Databricks
      • Use case deep dive
  • 4. Nationwide Ranks with the Best
      • #1 in 457 retirement plans, based on number of plans (PLANSPONSOR, 2017 Recordkeeping Survey)
      • #1 total small business insurer (Conning, 2014; Conning Strategic Study: The Small Business Sector for Property-Casualty Insurance: Market Shift Coming)
      • #1 writer of farms and ranches (A.M. Best, 2016 DWP)
      • #1 pet insurer (North American Pet Health Insurance Assn., 2016)
      • #1 writer of corporate life (IBIS Associates, Inc., February 2018)
      • 2nd largest domestic specialty (Excess & Surplus) commercial lines insurer (A.M. Best, 2016 DWP)
      • 7th largest homeowners insurer (A.M. Best, 2016 DWP)
      • 7th largest writer of variable annuities (Morningstar, YE 2016, based on total flows)
      • 7th largest commercial lines insurer (A.M. Best, 2016 DWP)
      • 8th largest auto insurer (A.M. Best, 2016 DWP)
      • 8th largest life insurer (LIMRA, YE 2016, based on total premiums)
      • #9 provider of defined contribution retirement plans (PLANSPONSOR, 2017 Recordkeeping Survey)
      • Nationwide is committing more than $100 million of venture capital to invent and reinvent customer-centric solutions.
  • 5. Fortune 100 Company
      • Recognition: FORTUNE 100 Best Companies to Work For; Black Enterprise 50 Best Companies for Diversity; 2018 Catalyst Award honoree; Human Rights Campaign Best Place to Work for LGBTQ Equality
      • Financials: $49 billion in total sales/direct written premium; $26.9 billion in net operating revenue; $1.2 billion in net operating income; $225.5 billion in total assets
      • Ratings:
          – A+ (A.M. Best), received 10/17/2002, affirmed 10/2/2017
          – A+ (Standard & Poor’s), received 12/22/2008, affirmed 5/24/2017
          – A1 (Moody’s), received 3/10/2009, affirmed 11/7/2017
  • 6. Lines of Business
      • Financial Services: Individual Life, Annuities, Retirement Plans, Corporate Life, Mutual Funds, Banking
      • Commercial Lines: Standard Commercial, Farm and Ranch, Commercial Agribusiness, Excess and Surplus/Specialty
      • Personal Lines: Standard Auto, Homeowners and Renters, Pet, Sport Vehicles, Personal Liability
  • 7. Enterprise Data Office: Chief Data Officer (CDO)
      • Purpose: Give data a voice
      • Mission: The EDO is dedicated to empowering the business of Nationwide by delivering trusted solutions through complete data & analytics services.
      • Groups: Data Advisory Services; Data Governance and Quality Assurance; Data Architecture and Strategy; Data Management; Data Analytics and Decision Sciences
      • Responsibilities shown on the slide:
          – Manages relationships with our IT and business partners
          – Oversees and optimizes data integrity, availability, usability, and trustworthiness
          – Owns the One Nationwide data strategy, enabling the Enterprise’s ability to leverage data as a competitive asset
          – Manages Enterprise data allowing for insights into business activities, enabling achievement of business goals
          – Deploys data and analytics tools and processes to solve complex business problems
  • 8. Analytical Lab
      • IT Architecture team identified the growing need for a minimally governed, self-provisioned scalable environment for conducting analytical experiments.
      • [Workflow diagram labels: SSO; Start/Create Clusters; Databricks Notebook; Read Data; EDA/Data Prep; Feature Engineer; Model training; Validate/re-train; Visualize; Communicate/Export model; Terminate the cluster]
  • 9. Databricks deployment at Nationwide
      • [Architecture diagram labels: Users (admin, Data Scientist, Engineers); Web Frontend w/ SSO; Control Plane; Support (Access Genie); Data Plane; AWS Account; Data in S3 buckets; AWS Data sources (RedShift, RDS, DynamoDB etc.); On Prem Data sources (Hadoop, SQL databases); VGW; MPLS WAN; Business Partner Extranet; O/B Access (Download packages, 3rd party datasets); Databricks CLI]
  • 10. Databricks adoption at Nationwide
      • [Figure: comparison across four personas (Information worker, Data Analyst, Data Engineer, Data Scientist)]
      • Tools named on the slide: R & R Studio, Python & Jupyter, SAS/SAS Grid, IBM SPSS, H2O, DriverlessAI, Tensorflow, Hadoop/Hive/Spark/Zeppelin, SQL, Excel, Access, SAS, Tableau, Paxata, Python/R
      • Use cases: Efficiency gain, De-risk, Revenue Generating
      • Scales shown per persona: Databricks adoption (Not applicable, or Experiment / Dev / Prod); Data Sources (Well Known to Variable); Tools (Standard / Emerging / Specialized); Number of tools (1-3 / 4-5 / 6+)
  • 11. Enterprise Analytics Office
      • We deliver wisdom in data by interacting with partners to translate problems into analytical solutions
      • Utilizing methodologies to accelerate decision-making... (Statistical Modeling, AI & Machine Learning, Experimental Design, Modeling & Forecasting)
          – Ensembled Machine Learning; Traditional Statistical Learning; Deep Learning; Time Series Forecasting
          – Text & Speech Analytics; GPU Acceleration; Recommender; Regression
          – Bayesian Hierarchical Modeling; Segmentation Modeling; Survivor Modeling; Model-as-Service
      • ...by leveraging cutting edge data & technology
          – Data: Nationwide internal data, Social, Demographic, Geographic, Financial, Macro-economic
          – Technology: R, Python, H2O, Tableau, Java, SPSS Modeler, Tensorflow
      • What is the tangible benefit of our data product solutions?
          – Tailored support combining business knowledge & statistical expertise
          – Easy understanding of the data for instant & actionable usability
          – Automated & seamless access that integrates with your processes
          – Scalable utility solving advanced analytical problems across domains
  • 12. Focus Use Case
      • Predict insurance claims frequency and severity (average cost of claims)
      • Large dataset (100s of millions of records)
      • Volatile data
          – Insurance claims are infrequent
          – Most often arise due to chance
  • 13. Traditional Approach
      • Batch (1-5 years) of data aggregated across linear predictors (state, vehicle model year, driver age, etc.)
      • Trained actuary fits a Generalized Linear Model (GLM) to determine slope/intercept for each linear predictor
      • Result is a multiplicative “rating plan”
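To illustrate why the result of a GLM is a multiplicative rating plan: assuming a log link (a common choice for claim frequency, though the slide does not state it), coefficients add on the log scale, so exponentiating them gives factors that multiply a base rate. The sketch below uses made-up coefficients and level names; none of the numbers come from the talk.

```python
import math

# Hypothetical log-link GLM coefficients (illustrative values only).
INTERCEPT = math.log(0.05)  # base claim frequency of 5% per policy-year
COEFFICIENTS = {
    "state": {"OH": 0.0, "FL": 0.25},
    "driver_age_band": {"25-64": 0.0, "16-24": 0.60},
}


def rating_factors(coefficients):
    """Exponentiate each coefficient to get a multiplicative relativity."""
    return {
        var: {level: math.exp(beta) for level, beta in levels.items()}
        for var, levels in coefficients.items()
    }


def predicted_frequency(risk, intercept=INTERCEPT, coefficients=COEFFICIENTS):
    """Base rate times the product of the relativities that apply to this risk."""
    factors = rating_factors(coefficients)
    rate = math.exp(intercept)
    for var, level in risk.items():
        rate *= factors[var][level]
    return rate


if __name__ == "__main__":
    print(rating_factors(COEFFICIENTS))
    print(predicted_frequency({"state": "FL", "driver_age_band": "16-24"}))
```

The reference levels (coefficient 0.0) get a factor of exactly 1, which is how a rating table usually presents its base class.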
  • 14. Novel Approach
      • Deep learning (hierarchical neural network)
      • Adequately models non-linearity of latent variables
      • Multiple heads
          – Frequency & Severity
          – Coverage Type & Cause of Loss
      • Compare to traditional GLM
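As a rough picture of what "multiple heads" means, the sketch below runs a forward pass through a tiny network with one shared trunk and two output heads (frequency and severity). It is plain Python with random, untrained weights; the class name `TwoHeadNet`, the layer sizes, and the input features are all assumptions for illustration, and the speakers' actual model (built with Keras/TensorFlow, with further heads for coverage type and cause of loss) is not shown in the slides.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer; weights[j] holds the input weights of output unit j."""
    return [sum(xi * wi for xi, wi in zip(x, w)) + b for w, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


class TwoHeadNet:
    """Embedding + shared hidden layer feeding two positive-valued heads."""

    def __init__(self, n_states, embed_dim=2, n_numeric=2, hidden=4, seed=0):
        rng = random.Random(seed)

        def rand(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        # One learned vector per category level (here: a hypothetical state index).
        self.embedding = [rand(embed_dim) for _ in range(n_states)]
        self.w_trunk = [rand(embed_dim + n_numeric) for _ in range(hidden)]
        self.b_trunk = [0.0] * hidden
        self.w_freq, self.b_freq = [rand(hidden)], [0.0]
        self.w_sev, self.b_sev = [rand(hidden)], [0.0]

    def predict(self, state_idx, numeric):
        x = self.embedding[state_idx] + list(numeric)
        h = relu(dense(x, self.w_trunk, self.b_trunk))  # shared representation
        # exp() keeps both outputs positive, as a Poisson or Gamma mean must be.
        frequency = math.exp(dense(h, self.w_freq, self.b_freq)[0])
        severity = math.exp(dense(h, self.w_sev, self.b_sev)[0])
        return frequency, severity


if __name__ == "__main__":
    net = TwoHeadNet(n_states=3)
    print(net.predict(1, [0.3, 0.7]))
```

Both heads read the same hidden vector, so anything the trunk learns about a risk is available to every prediction target.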
  • 15. Performance Evaluation
      • Custom loss functions
          – Poisson, Gamma negative loglikelihood
      • Custom metric functions
          – Normalized Gini index / AUC
      • Online monitoring using TensorBoard
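The losses and metric named on this slide have standard textbook forms, sketched here in plain Python for reference. This is not the speakers' TensorFlow code: the function names are hypothetical, the Gamma loss assumes a fixed shape parameter `alpha`, and the Gini follows the common ranking-based definition (sort by prediction, accumulate actual losses, normalize by the score of a perfect ordering).

```python
import math


def poisson_nll(counts, means):
    """Mean Poisson negative log-likelihood of observed counts given predicted means."""
    terms = [mu - y * math.log(mu) + math.lgamma(y + 1) for y, mu in zip(counts, means)]
    return sum(terms) / len(terms)


def gamma_nll(amounts, means, alpha=1.0):
    """Mean Gamma negative log-likelihood with shape alpha and predicted mean mu."""
    terms = [
        -alpha * math.log(alpha / mu) + math.lgamma(alpha)
        - (alpha - 1.0) * math.log(y) + alpha * y / mu
        for y, mu in zip(amounts, means)
    ]
    return sum(terms) / len(terms)


def gini(actual, predicted):
    """Gini of actual losses when risks are ranked from highest to lowest prediction."""
    n = len(actual)
    order = sorted(range(n), key=lambda i: (-predicted[i], i))
    total = sum(actual)
    cumulative, area = 0.0, 0.0
    for i in order:
        cumulative += actual[i]
        area += cumulative / total
    return (area - (n + 1) / 2.0) / n


def normalized_gini(actual, predicted):
    """Gini of the model divided by the Gini of a perfect ranking."""
    return gini(actual, predicted) / gini(actual, actual)


if __name__ == "__main__":
    losses = [0.0, 0.0, 1.0, 3.0]
    print(normalized_gini(losses, [0.1, 0.2, 0.3, 0.4]))
    print(poisson_nll([0, 2], [1.0, 2.0]))
```

A normalized Gini of 1 means the model ranks risks exactly as their realized losses do, and values near 0 mean the ranking is no better than random.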
  • 16. Model Search Space is Vast
      • Size and number of layers
      • Embedding dimensionality
      • Activation functions (ReLU, tanh, linear)
      • Regularization (L1/L2, dropout)
      • Many others (autoencoder, combining levels of prediction, skip connections, etc.)
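A small sketch of why the search space grows so quickly: even a handful of options per hyperparameter multiplies into hundreds of configurations, which is why sampling a subset is a common tactic. The option values below are invented for illustration and the talk does not say which search strategy was used; random sampling is shown only as one simple possibility.

```python
import random

# Hypothetical options for the hyperparameters listed on the slide.
SEARCH_SPACE = {
    "num_layers": [2, 3, 4],
    "layer_size": [64, 128, 256],
    "embedding_dim": [4, 8, 16],
    "activation": ["relu", "tanh", "linear"],
    "l2": [0.0, 1e-5, 1e-4],
    "dropout": [0.0, 0.2, 0.5],
}


def grid_size(space):
    """Number of configurations in the full Cartesian grid."""
    size = 1
    for values in space.values():
        size *= len(values)
    return size


def sample_configs(space, n, seed=0):
    """Draw n random configurations; the fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n)
    ]


if __name__ == "__main__":
    print(grid_size(SEARCH_SPACE))
    for config in sample_configs(SEARCH_SPACE, 3):
        print(config)
```

Each sampled dictionary is an independent unit of work, so the configurations can be evaluated concurrently on separate workers.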
  • 17. Why Spark?
      • Many aspects of preprocessing are embarrassingly parallel
          – Conversion between data formats (SAS, CSV, Parquet, TFRecords)
          – Encoding of category labels
      • Scoring is also embarrassingly parallel
      • Primary limitation is hyperparameter/model configuration search
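"Embarrassingly parallel" means each chunk of data can be processed without talking to any other chunk. The sketch below shows the idea for category-label encoding, using Python's standard thread pool as a stand-in for Spark executors; it is not Spark code, and the function names and sample data are hypothetical. Once the label-to-integer vocabulary is fixed, every partition is encoded independently.

```python
from concurrent.futures import ThreadPoolExecutor


def build_vocabulary(partitions):
    """Assign a stable integer id to every distinct label across all partitions."""
    labels = sorted({label for part in partitions for label in part})
    return {label: idx for idx, label in enumerate(labels)}


def encode_partition(partition, vocabulary):
    """Encode one partition; needs only the shared, read-only vocabulary."""
    return [vocabulary[label] for label in partition]


def encode_all(partitions, max_workers=4):
    """Encode every partition concurrently and return results in input order."""
    vocabulary = build_vocabulary(partitions)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        encoded = list(pool.map(lambda p: encode_partition(p, vocabulary), partitions))
    return vocabulary, encoded


if __name__ == "__main__":
    vocab, encoded = encode_all([["OH", "FL"], ["TX", "OH"]])
    print(vocab)
    print(encoded)
```

The only shared step is building the vocabulary; after that, adding workers shortens the wall-clock time roughly in proportion, which is the property the slide relies on.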
  • 19. Benchmark Timings
      Step                 Local Workstation   Spark
      CSV Conversion       10-12 hrs           < 5 mins
      Random Shuffling     ~ 8 hrs             < 5 mins
      Featurization        ~ 5 hrs             20 mins
      TFRecords Examples   ~ 5 hrs             < 5 mins
      Model Training       ~ 6 hrs             ~ 3 hrs (single node*)
      Model Scoring        ~ 3 hrs             < 5 mins
      * Utilizing Spark we are able to test many model configurations concurrently. In a local workstation environment, each configuration needs to be tested consecutively.
  • 20. Lessons Learned
      • Loading/exporting data
      • Conversion of the model from Keras to TensorFlow
      • Initializing TensorFlow models on individual nodes
      • Using goofys mounts
      • Syncing with DBFS to store model checkpoints
      • Utilizing Databricks Jobs/notebook parameters
  • 21. Conclusion & Next Steps
      • Utilizing Spark on Databricks with ML runtime, we reduced the modeling pipeline timings from ~ 34 hours to less than 4 hrs
      • Further opportunity exists in utilizing Horovod for multi-GPU training
          – Reduce time needed to evaluate individual model configurations
  • 22. General Observation
      • Using Databricks ML Runtime, Notebooks, scalable compute instances, and scheduling features, we were able to rapidly prototype the methodology.
      • Work is in progress on the path to production and integration with the current model deployment framework.
      • DBU consumption by different business units is challenging to predict, so cost is difficult to forecast.
      • No automatic integration with GitHub Enterprise.