Analytics, Big Data and The Cloud II Conference - Kiribatu Labs
1. Learn how insurers predict risk and
how you can apply it to your
predictive analytics project
Pawel Brzeminski, Founder & CEO
pawel@kirbatulabs.com
May 15, 2013
Analytics, Big Data, and The Cloud II
Edmonton
2. The Company
KIRIBATULABS
Discovering Knowledge Assets
Kiribatu is a predictive analytics company, founded in
2009 / 6 employees
We serve the Canadian financial sector, predominantly
Property & Casualty insurance
3. Predictive analytics, huh?
Goal-driven ANALYSIS of a large data set to
PREDICT human behavior
4. If speed was important to you…
YOUR insurance premium is calculated by methods
designed 40-50 years ago
VS.
5. Risk assessment in Insurance
As of May 2013, the vast majority of Canadian insurers still use
outdated premium rating formulas created in the 1960s and 1970s
Only a handful of Canadian insurance companies are
sophisticated predictive analytics users
Leaders are decimating their competition
6. Where to start?
Source: By Phil McElhinney from London (Jeremy Wariner) (http://creativecommons.org/licenses/by-sa/2.0)
How to identify an opportunity for a predictive
analytics project?
7. Questions to ask while starting
Data is already collected (or can be easily acquired)
Transactional data, customer data, sensor-generated data, usage data, etc.
There is a clear objective to predict something
Future price, failure rate, customer risk, customer profitability, customer retention, etc.
Well-defined functional settings are a great place to start
We focused on optimizing Risk Sharing Pool (RSP) decisions
Typically the SMEs (Subject Matter Experts) are making
decisions based on their experience and “gut feeling”
Senior underwriters in our case
Significant ROI is expected
Investment in analytics can be small but usually it is not trivial
8. Example
Risk Sharing Pool is a construct used by Canadian
insurers to optimize their risk assessment
Insurers put their highest risks (primary driver and a
vehicle) in the pool to avoid paying for the claims
But they forfeit the premium
Insurers retain the risks they deem profitable on their
book of business
They can collect the premium and make a profit
9. Challenge
Can we effectively predict future claims on policies?
The model would need to predict claims that will occur up to 12 months in advance
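A minimal sketch of how the prediction target could be labelled for this challenge: a policy transaction is marked positive if any claim occurs within the following 12 months. The function name and the 365-day approximation of 12 months are assumptions for illustration, not the deck's actual definition.

```python
from datetime import date, timedelta


def has_claim_within_horizon(observation_date, claim_dates, horizon_days=365):
    """Return True if any claim falls after the observation date
    and no later than horizon_days after it (365 days approximates 12 months)."""
    end = observation_date + timedelta(days=horizon_days)
    return any(observation_date < d <= end for d in claim_dates)


# Illustrative data only
label = has_claim_within_horizon(date(2012, 1, 1), [date(2012, 6, 1)])
```

Claims dated on or before the observation date are excluded, so the label only reflects future events.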
10. Introducing Underwriting Score
The predictive model generates an Underwriting (UW)
Score
The UW Score is a number between 1 and 1000
High UW Score = high profitability = low risk
Low UW Score = low profitability = high risk
Highly accurate predictor of future claims on a policy
UW Score will be used to assess which risks are placed
in the pool and which risks are not placed in the pool
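A minimal sketch of how a UW Score could drive the pool decision, using the relation UW Score = 1000 – Risk Score given on the Results slide. The cutoff of 200, the clamping, and the function names are hypothetical; the deck does not say how the threshold is chosen.

```python
def uw_score(risk_score: float) -> int:
    """Convert a risk score into a UW Score using UW Score = 1000 - Risk Score."""
    score = round(1000 - risk_score)
    return max(1, min(1000, score))  # clamp to the 1..1000 range stated on the slide


def place_in_pool(score: int, cutoff: int = 200) -> bool:
    """Hypothetical rule: cede risks whose UW Score falls below the cutoff."""
    return score < cutoff


risk_scores = {"policy_a": 950.0, "policy_b": 120.0}  # illustrative values
decisions = {pid: place_in_pool(uw_score(r)) for pid, r in risk_scores.items()}
```

Here policy_a (UW Score 50) would be placed in the pool and policy_b (UW Score 880) would be retained.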
11. 4 Key Modeling Steps
Data Preparation
Rating Factor Analysis
Model Development
Gain Assessment
12. Data Preparation
• Policy & claims data profiling, understanding and verification
• Data cleansing (filling missing values, outlier removal)
• Data transformation
• Data normalization (inflation & claim development factors)
• Data enrichment with 3rd party data (demographic, econometric – Census Canada, VICC, CLEAR, etc.)
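The cleansing and normalization bullets above can be sketched as follows. This is an illustration under stated assumptions: the 100,000 km plausibility limit, the median fill, and the multiplicative factors are hypothetical choices, not the rules used in the project.

```python
from statistics import median

MAX_PLAUSIBLE_KM = 100_000  # hypothetical plausibility limit for annual distance


def is_plausible(value):
    """A value is usable if it is present and within the assumed range."""
    return value is not None and 0 <= value <= MAX_PLAUSIBLE_KM


def cleanse_annual_km(values):
    """Treat implausible values as missing, then fill gaps with the median."""
    fill = median(v for v in values if is_plausible(v))
    return [v if is_plausible(v) else fill for v in values]


def normalize_claim(amount, inflation_factor, development_factor):
    """Bring a historical claim to a common basis (factors are assumed inputs)."""
    return amount * inflation_factor * development_factor


raw_km = [12_000, None, 200_000, 18_000, 15_000]  # illustrative values
clean_km = cleanse_annual_km(raw_km)
```

In this example both the missing value and the 200,000 km outlier are replaced by the median of the plausible values.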
13. Rating Factor Analysis
• Statistical analysis of each data element for its propensity to claim
• Rating factors with high correlations are included in the final predictive model(s)
• Often, new powerful rating factors are discovered in this step (very useful for Underwriting)
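One simple way to look at a data element's propensity to claim is to compare claim rates across its levels. The sketch below assumes hypothetical field names (vehicle_use, has_claim) and made-up records; it is a basic frequency comparison, not necessarily the statistical analysis used in the project.

```python
from collections import defaultdict


def claim_rate_by_level(records, factor):
    """Return {level: share of records with a claim} for one rating factor."""
    counts = defaultdict(lambda: [0, 0])  # level -> [claims, total]
    for rec in records:
        level = rec[factor]
        counts[level][0] += 1 if rec["has_claim"] else 0
        counts[level][1] += 1
    return {level: claims / total for level, (claims, total) in counts.items()}


records = [  # illustrative data only
    {"vehicle_use": "commute", "has_claim": False},
    {"vehicle_use": "commute", "has_claim": True},
    {"vehicle_use": "pleasure", "has_claim": False},
    {"vehicle_use": "pleasure", "has_claim": False},
]
rates = claim_rate_by_level(records, "vehicle_use")
```

A factor whose levels show clearly different claim rates on enough data is a candidate for the model.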
14. Model Development
• Algorithm selection (genetic algorithms, neural networks, logistic regression, SVM)
• Time-wise training and testing data set split
• Model parameterization, generation and evaluation
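A time-wise split trains on older records and tests on newer ones, so the evaluation mimics predicting the future. A minimal sketch, assuming each row carries a hypothetical effective_date field and an arbitrary cutoff date:

```python
from datetime import date


def time_wise_split(rows, cutoff):
    """Train on rows effective before the cutoff, test on the rest."""
    train = [r for r in rows if r["effective_date"] < cutoff]
    test = [r for r in rows if r["effective_date"] >= cutoff]
    return train, test


rows = [  # illustrative data only
    {"policy": "A", "effective_date": date(2010, 3, 1)},
    {"policy": "B", "effective_date": date(2011, 7, 15)},
    {"policy": "C", "effective_date": date(2012, 1, 10)},
]
train, test = time_wise_split(rows, cutoff=date(2011, 12, 31))
```

Unlike a random split, no test row predates a training row, which helps expose features that do not generalize over time.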
15. RSP Gain Assessment
• Calculation of UW Scores on test data set
• Retrospective underwriting gain assessment on historical data sets
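A simplified sketch of a retrospective gain assessment, following the pool mechanics described earlier in the deck: a ceded risk avoids the claims but forfeits the premium. The cutoff, the field names, and the omission of any pool fees or shared pool costs are assumptions made for illustration.

```python
def underwriting_result(policies, cutoff):
    """Premium minus claims over policies retained (UW Score >= cutoff)."""
    return sum(p["premium"] - p["claims"]
               for p in policies if p["uw_score"] >= cutoff)


def retrospective_gain(policies, cutoff):
    """Gain versus retaining every policy (a cutoff of 0 retains all)."""
    return underwriting_result(policies, cutoff) - underwriting_result(policies, 0)


history = [  # illustrative data only
    {"uw_score": 900, "premium": 1200.0, "claims": 0.0},
    {"uw_score": 650, "premium": 1500.0, "claims": 400.0},
    {"uw_score": 80, "premium": 2000.0, "claims": 9000.0},
]
gain = retrospective_gain(history, cutoff=200)
```

In this example, ceding the one low-score policy gives up 2,000 in premium and avoids 9,000 in claims, for a gain of 7,000.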
16. Results
Source: “Improving P&C Insurance Risk Management and Policy Pricing with Predictive Analytics”, Pawel Brzeminski,
September 2011, http://www.kiribatulabs.com/resources.php.
UW Score = 1000 – Risk Score
17. 4 Key Challenges
Extremely low correlations / Data set imbalance
98% of policy transactions do not have any claims, 2% have claims
Bad, bad data
Drivers driving 200,000 km per year (that's driving over 500 km per day for 365 days a year)
Over-fitting
Certain features do not generalize very well in a time-wise data split
Data sparsity
Motor Vehicle Abstract (MVA) data, which contains convictions, suspensions and reinstatements,
is not always available
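To make the imbalance concrete, the sketch below computes instance weights that are inversely proportional to class frequency for the 98% / 2% split mentioned above. This "balanced" heuristic (n_samples / (n_classes * class_count)) is a common general-purpose choice and is an assumption here; the deck does not state which weighting scheme was used.

```python
from collections import Counter


def class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


labels = [0] * 98 + [1] * 2  # 98% without claims, 2% with claims
weights = class_weights(labels)
```

With these weights the rare claim class carries the same total weight as the no-claim class, so a model cannot score well by predicting "no claim" everywhere.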
18. 5 Key Breakthroughs
Policy transactions collapsed into single vectors
Individual risk assessment for each vehicle on policy
Instance sampling and weighting
Dealing with dataset imbalance and bad data
Custom model quality metric
Aggregation of the highest claims in the top 5% of all transactions really moved the needle
Risk Assessment per insurance coverage
Different data elements are important for each coverage; for instance, liability coverage and
comprehensive coverage are completely different products that behave very differently
Prediction of Profitability
Include written premiums in 2nd level model
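One possible reading of the custom model quality metric is the share of total claim dollars that falls in the 5% of transactions the model ranks as riskiest. The sketch below implements that reading; the function name, field names, and data are hypothetical, and the actual metric used in the project may differ.

```python
def top_share_of_claims(transactions, fraction=0.05):
    """Share of total claim amount found in the riskiest `fraction` of rows."""
    ranked = sorted(transactions, key=lambda t: t["risk_score"], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    total = sum(t["claims"] for t in ranked)
    if total == 0:
        return 0.0
    return sum(t["claims"] for t in ranked[:top_n]) / total


# Illustrative data: 20 transactions with risk scores 1..20
transactions = [{"risk_score": s, "claims": 0.0} for s in range(1, 21)]
transactions[19]["claims"] = 8000.0  # riskiest row (score 20) has a large claim
transactions[4]["claims"] = 2000.0   # a low-risk row (score 5) has a smaller claim
share = top_share_of_claims(transactions)
```

A higher share means the model concentrates the costliest claims among the risks it flags first, which is what matters when only a small slice of the book can be placed in the pool.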
19. Homework
Where can I apply predictive analytics in my
business?
Questions? Always happy to have a coffee
Pawel Brzeminski, Founder & CEO
pawel@kirbatulabs.com
780-232-2634
http://ca.linkedin.com/pub/pawel-brzeminski/0/523/555
@pawelwb