Fraud Detection in Insurance
with Machine Learning
for WARTA (TALANX Group)
Artur Suchwalko, Ph.D., QuantUp, CEO
artur@quantup.pl
Introduction
What do you think about…
…when you think about fraud detection with ML?
• Deep Learning, xgboost, Autoencoder, severe class imbalance?
• How to apply the model to real decision making, sample
representativeness, having historical frauds identified and
marked, potential features, goal function?
Case Study: Warta (high level)
About Warta
• 2nd largest insurer in Poland
• Full offering: life and non-life insurance
• Member of Talanx Group
• Award-winning innovator, e.g.
• InsurTech Congress award for implementing an anti-fraud
solution (http://media.warta.pl/pr/356330/warta-doceniona-za-wdrozenie-platformy-datawalk)
• First comprehensive mobile app for claim handling process.
Project scope
Warta’s reasons to start:
• Looking for a comprehensive anti-fraud solution for non-life insurance
• Convinced of the need to improve anti-fraud KPIs
• Ready to replace existing technologies: IBM, Statistica.
Chosen solution:
• DataWalk – data gathering, data linking, expert scoring and fraud investigations.
• QuantUp – Machine Learning algorithms to improve suspicious claim selection.
Integration with DataWalk
DataWalk is a Big Data software platform for connecting numerous
large data sets, both external and internal, into a single repository
for fast visual analysis.
DataWalk can be used for:
• Fast data modelling
• Fraud hypothesis prototyping
• Fraud scoring
• Fraud investigations
• Analyze your data 10x faster
• Increasing the effectiveness of anti-fraud rules up to 80%
• Pre-configured rules and scores
• 30-90 days return on investment
Demo movie: https://youtu.be/h45mheDH4uU
Integration with DataWalk
• DataWalk provides an easy-to-use graphical
interface (Universe Viewer) to interpret and
link data, as well as to create and maintain
ABTs (analytical base tables).
• A well-prepared, easy-to-update data model
is fundamental to ABT creation and to the
credibility of the predictive model.
Case Study: Warta (auto insurance & ML only)
Goal & results
• Goal:
• Improve detection of probable frauds for further investigation
• Probable / doubtful / suspicious claim: suspected to be a fraud but not proven to be one
• Finding and proving are two different things
• Result: an improvement on the order of 30% (compared to the earlier simple models)
Important business questions
• How to choose claims for investigation to:
• detect the highest number of fraud attempts?
• detect the highest amount of fraudulent claims?
• detect the highest amount with limited resources and time for detection?
• be able to prove the highest number / amount of fraud attempts?
• The chosen requirement is translated into a suitable goal function for the model
and should drive the optimization criterion.
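To make the difference between these questions concrete, here is a minimal sketch with made-up claims (the ids, probabilities and amounts are hypothetical, not Warta data). It assumes the team can investigate only two claims and shows that ranking by fraud probability and ranking by expected fraudulent amount select different claims.

```python
# Hypothetical claims: (claim_id, estimated fraud probability, claimed amount).
claims = [
    ("A", 0.90, 1_000),
    ("B", 0.60, 20_000),
    ("C", 0.80, 2_000),
    ("D", 0.30, 50_000),
]


def top_k(claims, key, k):
    """Return the ids of the k claims with the highest value of `key`."""
    ranked = sorted(claims, key=key, reverse=True)
    return [claim_id for claim_id, _, _ in ranked[:k]]


# Goal: detect the highest number of fraud attempts -> rank by probability.
by_count = top_k(claims, key=lambda c: c[1], k=2)

# Goal: detect the highest fraudulent amount -> rank by probability * amount.
by_amount = top_k(claims, key=lambda c: c[1] * c[2], k=2)

print(by_count)   # ['A', 'C']
print(by_amount)  # ['D', 'B']
```

The two selections do not overlap at all, which is why the business question has to be settled before the model is optimized.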
Claim case 1: Rules / human
• Description: A driver hit the rear of the victim's car. The car was pushed into the crossroads
area and collided with a third car (a Mercedes). The police were called.
• Rules:
• airbags deployed
• similar age of both drivers
• difference in the cars' ages >= 11 years
• historical loss coefficient >= 5
• Result: Payment was not refused on grounds of attempted fraud: the description was consistent with
the damage
Claim case 2: Model
• Description: I (the victim) was driving in the left lane. The second driver (the culprit) was driving in the right
lane (in the same direction). He wanted to change lanes, did not see my car and hit it.
The rear left side of his car damaged the right front side of mine.
• Analysis: no clear evidence
• only one year of difference in the cars' ages
• no age information for the second driver
• the insurance policy was not new
• no claim history for the drivers or the cars
• Result: Payment was refused on grounds of attempted fraud: the description did not match the
damage, so this could not have been a genuine claim (verified)
How to build a model?
• Prepare the predictors in the form of an ABT (can be complex because data from many
sources must be aggregated)
• Have the target variable in the historical data
• Build a predictive model
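As an illustration of the first two steps, the sketch below builds a tiny ABT (one row per claim) from a hypothetical claim history. All field names and values are invented for the example. It assumes that historical aggregates should use only claims dated strictly before the claim being described, so that no later information enters the predictors.

```python
from datetime import date

# Hypothetical claim history; field names are illustrative only.
claims = [
    {"claim_id": 1, "driver_id": "d1", "date": date(2017, 1, 10), "amount": 3000, "fraud": 0},
    {"claim_id": 2, "driver_id": "d1", "date": date(2017, 6, 5), "amount": 8000, "fraud": 1},
    {"claim_id": 3, "driver_id": "d2", "date": date(2017, 7, 1), "amount": 1500, "fraud": 0},
    {"claim_id": 4, "driver_id": "d1", "date": date(2018, 2, 20), "amount": 4000, "fraud": 0},
]


def build_abt(claims):
    """Return one row per claim with aggregates over the driver's earlier claims."""
    rows = []
    for claim in claims:
        # Only claims of the same driver that happened before this one.
        earlier = [
            c for c in claims
            if c["driver_id"] == claim["driver_id"] and c["date"] < claim["date"]
        ]
        rows.append({
            "claim_id": claim["claim_id"],
            "amount": claim["amount"],
            "prev_claims": len(earlier),
            "prev_amount_sum": sum(c["amount"] for c in earlier),
            "prev_frauds": sum(c["fraud"] for c in earlier),
            "target": claim["fraud"],  # the historical fraud flag
        })
    return rows


abt = build_abt(claims)
for row in abt:
    print(row)
```

In a real project the same idea would be applied per driver, vehicle and other parties, across many source tables.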
Important
• Checking if modeling is possible (the process of claim handling influences the historical
data): 0% vs. 100% checked
• Definition of new predictors
• Detection of false predictors
• Data enhancement: historical aggregates, textual, external
Inside
• Historical information about all collision parties
• Extraction of information from text notes
• Avoiding false predictors
• Boosted trees
• with a non-standard goal function
• and careful hyperparameter optimization
• Reduction of the number of predictors to make the model simpler and more robust
• Handling new values, e.g. car model
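The last point can be handled in several ways; the slides do not say which one was used. One common option, sketched here under that assumption, is to map rare and previously unseen category levels to a shared placeholder. The placeholder name, the `min_count` threshold and the car models are hypothetical.

```python
from collections import Counter

OTHER = "__other__"  # hypothetical placeholder level for rare or unseen values


def frequent_levels(train_values, min_count=2):
    """Return the levels seen at least `min_count` times in the training data."""
    counts = Counter(train_values)
    return {level for level, n in counts.items() if n >= min_count}


def encode(value, known_levels):
    """Keep known levels; map anything rare or new to the OTHER bucket."""
    return value if value in known_levels else OTHER


train_car_models = ["Golf", "Golf", "Astra", "Astra", "Astra", "Phaeton"]
known = frequent_levels(train_car_models)

# "Phaeton" was too rare in training; "Model Y" was never seen at all.
encoded = [encode(m, known) for m in ["Golf", "Phaeton", "Model Y"]]
print(encoded)  # ['Golf', '__other__', '__other__']
```

Because rare levels are already bucketed during training, the model has seen the placeholder before it meets a genuinely new value in production.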
Pure analytics vs. business
ROC for less and more complex models
These results don’t reflect the real values and are used for illustrative purposes.
Numbers, amounts & preprocessing
ROC for simple and complex models
Numbers, amounts & preprocessing
ROC for amounts for simple and complex models
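The slides do not define how the ROC "for amounts" was computed. One possible reading, assumed here, is to weight every claim by its amount instead of counting each claim once. The sketch below uses invented scores and amounts and shows that the same ranking can look good on counts and poor on amounts.

```python
def roc_points(scores, labels, weights=None):
    """Return ROC points (fpr, tpr); each claim counts with its weight."""
    if weights is None:
        weights = [1.0] * len(scores)
    pos_total = sum(w for w, y in zip(weights, labels) if y == 1)
    neg_total = sum(w for w, y in zip(weights, labels) if y == 0)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0.0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += weights[i]
        else:
            fp += weights[i]
        # Emit a point only when the score changes, so ties share one point.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            points.append((fp / neg_total, tp / pos_total))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical model scores, fraud flags and claim amounts.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
amounts = [1000, 5000, 20000, 2000, 2000]

auc_counts = auc(roc_points(scores, labels))
auc_amounts = auc(roc_points(scores, labels, weights=amounts))
print(round(auc_counts, 3), round(auc_amounts, 3))  # 0.833 0.471
```

Here the largest fraudulent claim is ranked below a large legitimate one, which barely affects the count-based curve but dominates the amount-based one.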
Non-standard goal functions
• Ranking model: claims are checked in order of potential profit
• Simple classification models are based on counts
• Replication of results
ROC curve
Non-standard goal functions
Profit according to the model ordering
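A curve of this kind can be sketched as follows. The profit definition is an assumption made for the example: investigating a claim saves its amount if it is a fraud and always costs a fixed `cost_per_check`. The data and the cost value are hypothetical.

```python
from itertools import accumulate


def cumulative_profit(scores, is_fraud, amounts, cost_per_check):
    """Profit after investigating the top 1, 2, ... claims in model order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Assumed gain per check: saved amount if fraud, minus the check's cost.
    gains = [is_fraud[i] * amounts[i] - cost_per_check for i in order]
    return list(accumulate(gains))


# Hypothetical model scores, fraud flags and claim amounts.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
is_fraud = [1, 0, 1, 0, 0]
amounts = [1000, 5000, 20000, 2000, 2000]

profit = cumulative_profit(scores, is_fraud, amounts, cost_per_check=500)
print(profit)  # [500, 0, 19500, 19000, 18500]

# Number of claims to investigate that maximizes profit under this ordering.
best_n = max(range(len(profit)), key=profit.__getitem__) + 1
print(best_n)  # 3
```

Plotting `profit` against the number of investigated claims gives the shape of the curve and a natural cut-off point for the investigation team.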
Non-standard goal functions
• Claim amount turned out to be a strong predictor
• The amount alone could decide what gets verified: high claims first
• Even independently of predictors / model!
Amount according to the model ordering
False predictors
Ranking (VIP-like): iteratively remove the best feature and rebuild the model:
1. Active features: all
2. Build a model using the active features
3. Calculate AUC and a feature ranking
4. Deactivate the best feature according to the ranking
5. Go to 2 until all features are inactive.
6. Plot and conclude
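The loop above can be sketched as follows. The learner here is a deliberately trivial stand-in, not the boosted-tree model from the project: it ranks features by their univariate AUC and uses the best one as the "model". The feature names and data are hypothetical; `leak` mimics a field that is only filled in after a claim has been judged.

```python
def auc(scores, labels):
    """AUC by pair counting: chance that a fraud outscores a non-fraud."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def removal_curve(feature_names, fit_and_rank):
    """Refit, record AUC, deactivate the top-ranked feature, repeat."""
    active = list(feature_names)                   # step 1
    curve = []
    while active:                                  # step 5
        model_auc, ranking = fit_and_rank(active)  # steps 2 and 3
        best = ranking[0]
        curve.append((best, model_auc))
        active.remove(best)                        # step 4
    return curve


# Hypothetical toy data.
labels = [1, 1, 1, 0, 0, 0]
data = {
    "leak": [1, 1, 1, 0, 0, 0],
    "amount": [9, 7, 3, 8, 2, 1],
    "noise": [2, 6, 3, 1, 5, 4],
}


def toy_fit_and_rank(active):
    """Stand-in learner: rank by univariate AUC, 'model' = best feature."""
    ranking = sorted(active, key=lambda f: auc(data[f], labels), reverse=True)
    return auc(data[ranking[0]], labels), ranking


curve = removal_curve(data, toy_fit_and_rank)
for name, model_auc in curve:
    print(f"{name:8s} AUC before removal: {model_auc:.3f}")
```

With a real learner, `fit_and_rank` would train the model on the active features and return its validation AUC together with its variable-importance ranking; the list in `curve` is what step 6 would plot.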
False predictors
False predictors (red) can be anywhere!
Summary
Project results
First quarter of using the full-scope solution
• Detection Rate Improvement in 1st quarter: +60%
• True Positives > 80%
• ROI = less than 2 months (!)
• Predictive models responsible for 30-40% of the
final fiscal results.
https://m.bankier.pl/wiadomosc/Polowanie-na-dawcow-polis-czyli-na-nas-7599390.html
BENEFICIARIES
OF DATAWALK & R IMPLEMENTATION
Vice President Claims
• Extremely positive project ROI.
• Reduction of technology providers
• Results accomplished 6x faster and ~20x cheaper
than similar project at key competitor.
• Warta strengthens position of market innovator
in claim handling area.
Head of Anti-Fraud Department
• Impressively improved business results.
• Higher satisfaction and trust in analytics among
team members.
• Knowledge accumulation and knowledge
sharing within the team.
Head Analysts
• Full control over analytical environment.
• Access to all data without engaging IT.
• Expert scoring, machine learning and
investigations in one place.
• Possibility to test new fraud schemes.
Summary
• Predictive models alone gave a fraction of the total ROI
• The business goal is not always just directly minimizing losses, maximizing income etc.
• It’s pretty common for DS/ML projects to yield additional profit as a side effect
• ROI for such projects should be measurable and high (but not necessarily fast) for carefully
chosen business cases
• Predictive models can be significantly improved without spending much (hyperparameter
tuning, goal function, methods etc.)
• There are pitfalls to avoid!
• Usually you don’t need fancy hardware / software (PCs + R!)
What’s important in fraud detection with ML?
Business & Analytics
• Find a good business case (volume big enough)
• State the business goal and carefully translate it into analytics: use the right goal function
• Follow a correct model-building process
• Control the implementation
• Measure model effectiveness against no model / the previous situation, using the right
KPIs (not always simple, not always possible)
Process & Data
• Check if modeling is possible with supervised models (fraud flags stored; correct and
representative sample; good data coverage)
• Data preparation is the most important factor
• Use many data sources
• Data enhancement: aggregates from historical data, textual, external
• Cost of data preparation!
• Detection of false predictors: if they go undetected, the model degrades in production (detection is
arduous for wide data)
Data sources
• "Plain" data: basic
• Complete data related to the loss, claim, and parties involved
• Flags of historical frauds
• "Plain" data: enhanced
• Using ZIP codes and additional statistics, e.g. fraction of forest area, unemployment rate
• Weather data
• Analysis of connections (SNA)
• Text (words from a list, n-grams, others)
• Analysis of neighbourhood using maps
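For the text source, a minimal sketch of the two feature types named above (words from a list and n-grams) might look like this. The word list, the tokenizer and the example note are hypothetical; real claim notes would need language-specific preprocessing.

```python
import re
from collections import Counter

WORD_LIST = {"hit", "see", "lane"}  # hypothetical list of words of interest


def tokenize(text):
    """Lowercase the note and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def text_features(note):
    """Simple predictors derived from one free-text claim note."""
    tokens = tokenize(note)
    return {
        "n_tokens": len(tokens),
        "n_listed_words": sum(token in WORD_LIST for token in tokens),
        "bigram_counts": Counter(ngrams(tokens, 2)),
    }


features = text_features("He did not see my car and hit my car")
print(features["n_tokens"], features["n_listed_words"])  # 10 2
print(features["bigram_counts"]["my car"])               # 2
```

Counts like these can be added to the ABT as ordinary numeric columns.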
What influences model quality?
• Solving the right business problem
• Sample representativeness
• Goal function in line with the business goal
• Right model complexity and the correct model building process
• Costs of misclassifications, e.g. false alarm rate
• Explanations of black-box predictions → proving fraud attempts → improved actionability
• It’s pretty hard to get everything in a single model
• Validating the model and carefully testing its implementation
Methods
• Committees / ensembles of trees / boosted trees – good results, possible to use different
goal functions, variable importance, handling of NAs → use this!
• Deep Neural Networks – for data that is complex enough but still has the same structure
• Manual feature extraction not necessary
• Any (almost) goal function
• Recurrent Neural Networks – working directly on events not on aggregates from ABT
• Using black box model’s prediction explanations (LIME and its friends) – to improve
actionability
How to improve a model?
• Average model
• vs. human / rules: +10-30%
• Good model
• vs. average model: +10-50% (depending on measurement)
• predictive power driven by data
• Incorrect model
• vs. human / rules: +0% (or losses)
• works in a computer only
Assuming that the goal function and actionability remain unchanged
About me
• Commercial experience in DS / ML: > 20 years, ~ 100 projects, ~ 3,000 hours of workshops
• Translating a business problem into an analytics problem + choosing adequate means to
solve the latter
• Founder & owner of QuantUp DS / ML firm
• Contact me if you need:
• During the conference
• After the conference: artur@quantup.pl

Prediction of good patterns for future sales using image recognition
 
Using data to fight corruption: full budget transparency in local government
Using data to fight corruption: full budget transparency in local governmentUsing data to fight corruption: full budget transparency in local government
Using data to fight corruption: full budget transparency in local government
 
Geospatial Analysis and Open Data - Forest and Climate
Geospatial Analysis and Open Data - Forest and ClimateGeospatial Analysis and Open Data - Forest and Climate
Geospatial Analysis and Open Data - Forest and Climate
 


Fraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko

  • 1. Fraud Detection in Insurance with Machine Learning for WARTA (TALANX Group) Artur Suchwalko, Ph.D., QuantUp, CEO artur@quantup.pl
  • 3. What do you think about when you think about fraud detection with ML? • Deep Learning, xgboost, autoencoders, severe class imbalance? • Or how to apply the model to real decision making, sample representativeness, having historical frauds identified and flagged, potential features, the goal function?
  • 4. Case Study: Warta (high level)
  • 5. About Warta • 2nd biggest insurer in Poland • Full offering: life and non-life insurance • Member of Talanx Group • Award-winning innovator, e.g. • InsurTech Congress award for implementing an anti-fraud solution (http://media.warta.pl/pr/356330/warta-doceniona-za-wdrozenie-platformy-datawalk) • First comprehensive mobile app for the claim handling process.
  • 6. Project scope. Warta’s reasons to start: • Looking for a comprehensive anti-fraud solution for non-life insurance • Convinced of the need to improve anti-fraud KPIs • Readiness to replace existing technologies: IBM, Statistica. Chosen solution: • DataWalk – data gathering, data linking, expert scoring and fraud investigations. • QuantUp – Machine Learning algorithms to improve suspicious claim selection.
  • 7. Integration with DataWalk. DataWalk is a Big Data software platform for connecting numerous large data sets, both external and internal, into a single repository for fast visual analysis. DataWalk can be used for: • Fast data modelling • Fraud hypothesis prototyping • Fraud scoring • Fraud investigations. Vendor claims: • Analyze your data 10x faster • Increasing the effectiveness of anti-fraud rules up to 80% • Pre-configured rules and scores • 30-90 days return on investment. Demo movie: https://youtu.be/h45mheDH4uU
  • 8. Integration with DataWalk • DataWalk provides an easy-to-use graphical interface (Universe Viewer) to interpret and link data, as well as to create and maintain ABTs (analytical base tables). • A well-prepared and easy-to-update data model is fundamental to ABT creation and predictive model credibility.
  • 9. Case Study: Warta (auto insurance & ML only)
  • 10. Goal & results • Goal: • Improving the detection of probable frauds for further investigation • Probable / doubtful / suspicious claim: suspected to be a fraud but not proven to be one • Finding and proving are two different things • Result: improvement of the order of 30% (compared to past simple models)
  • 11. Important business questions • How to choose claims for investigation to: • detect the highest number of fraud attempts? • detect the highest amount of fraudulent claims? • detect the highest amount with limited resources and time for detection? • be able to prove the highest number / amount of fraud attempts? • This requirement is translated into a suitable goal function for a model • and should affect the optimization criterion.
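To illustrate how the business question on slide 11 changes which claims get picked, here is a minimal sketch. It assumes a model that outputs a fraud probability per claim and a fixed investigation capacity; the `Claim` class, the field names and the numbers are hypothetical and are not taken from the Warta project.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    claim_id: str
    fraud_prob: float  # hypothetical model score in [0, 1]
    amount: float      # claimed amount


def select_for_investigation(claims, capacity, objective="amount"):
    """Return the `capacity` claims that maximise the chosen objective.

    objective="count":  expected number of frauds  -> rank by probability
    objective="amount": expected fraudulent amount -> rank by prob * amount
    """
    if objective == "count":
        key = lambda c: c.fraud_prob
    elif objective == "amount":
        key = lambda c: c.fraud_prob * c.amount
    else:
        raise ValueError(f"unknown objective: {objective}")
    return sorted(claims, key=key, reverse=True)[:capacity]


# Illustrative data only.
claims = [
    Claim("A", 0.90, 1_000),
    Claim("B", 0.30, 20_000),
    Claim("C", 0.60, 4_000),
    Claim("D", 0.05, 50_000),
]

by_count = [c.claim_id for c in select_for_investigation(claims, 2, "count")]
by_amount = [c.claim_id for c in select_for_investigation(claims, 2, "amount")]
print(by_count, by_amount)
```

With the same scores, the two objectives select disjoint sets of claims, which is why the goal has to be fixed before the model is optimized.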
  • 12. Claim case 1: Rules / human • Description: A driver hit the rear of a victim's car. The car was pushed into the crossroads, where it collided with a third car (Mercedes). The police were called. • Rules: • airbags inflated • similar age of both drivers • difference of cars' age >=11 years • historical loss coefficient >=5 • Result: Payment was not refused on grounds of a fraud attempt: the description was consistent with the damage
  • 13. Claim case 2: Model • Description: I (the victim) was driving in the left lane. The second driver (the culprit) was driving in the right lane (same direction). He wanted to change lanes, didn't see my car and hit it. Its rear left side damaged my car's right front side. • Analysis: no clear evidence • only one year of cars' age difference • no age information for the second driver • insurance policy was not new • no claim history for drivers and cars • Result: Payment refused because of a fraud attempt: no correlation between the description and the damage, so it could not be a real claim (verified)
  • 14. How to build a model? • Preparation of the predictors (can be complex because of aggregation of data from many sources) in the form of an ABT • Having the target variable in the historical data • Building a predictive model
  • 15. Important • Checking if modeling is possible (the process of claim handling influences the historical data): 0% vs. 100% checked • Definition of new predictors • Detection of false predictors • Data enhancement: historical aggregates, textual, external
  • 16. Inside • Historical information about all collision parties • Extraction of information from text notes • Avoiding false predictors • Boosted trees • with a non-standard goal function • and careful hyperparameter optimization • Reduction of the number of predictors to make the model simpler and more robust • Handling new values, e.g. car model
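The last point on slide 16, handling values that were never seen in training (such as a new car model), can be approached in several ways; the slides do not say which one was used. One common option, sketched below under that assumption, is to map rare and unseen levels to a shared fallback bucket. The function names, the `OTHER` label and the `min_count` threshold are hypothetical.

```python
from collections import Counter

OTHER = "__other__"  # hypothetical fallback level for rare / unseen values


def fit_levels(values, min_count=2):
    """Keep only the levels seen at least `min_count` times in training."""
    counts = Counter(values)
    return {value for value, n in counts.items() if n >= min_count}


def encode(value, known_levels):
    """Map a raw value to itself if known, otherwise to the fallback level."""
    return value if value in known_levels else OTHER


# Illustrative training data: "Tipo" is too rare, "Model Y" never appears.
train_models = ["Golf", "Golf", "Astra", "Astra", "Astra", "Tipo"]
levels = fit_levels(train_models)

encoded = [encode(m, levels) for m in ["Golf", "Tipo", "Model Y"]]
print(encoded)
```

Because rare training levels already land in the fallback bucket, the model learns something about "unknown" values before it meets a truly new one in production.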
  • 17. Pure analytics vs. business. [Figure: ROC for less and more complex models.] These results don’t reflect the real values and are used for illustrative purposes.
  • 18. Numbers, amounts & preprocessing. [Figure: ROC for simple and complex models.]
  • 19. Numbers, amounts & preprocessing. [Figure: ROC for amounts for simple and complex models.]
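Slides 18 and 19 contrast ROC curves computed on claim counts with ROC curves computed "for amounts". The slides do not define the latter; one plausible reading, assumed here, is an AUC in which every claim is weighted by its amount. The sketch below uses a simple pairwise definition (quadratic in the number of claims, so only suitable for small samples); `weighted_auc` and the data are hypothetical.

```python
def weighted_auc(scores, labels, weights=None):
    """Pairwise AUC; each (fraud, non-fraud) pair counts with weight w_pos * w_neg.

    With weights=None this is the usual count-based AUC.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one fraud and one non-fraud claim")
    num = 0.0
    den = 0.0
    for s_pos, w_pos in pos:
        for s_neg, w_neg in neg:
            pair_weight = w_pos * w_neg
            den += pair_weight
            if s_pos > s_neg:
                num += pair_weight        # fraud ranked above non-fraud
            elif s_pos == s_neg:
                num += 0.5 * pair_weight  # ties count as half
    return num / den


# Illustrative data: the largest fraud (20,000) gets a fairly low score.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
amounts = [1_000, 500, 20_000, 500]

count_auc = weighted_auc(scores, labels)
amount_auc = weighted_auc(scores, labels, amounts)
print(round(count_auc, 3), round(amount_auc, 3))
```

On this toy data the count-based AUC looks acceptable while the amount-weighted one is close to random, because the model ranks the most expensive fraud low.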
  • 20. Non-standard goal functions • Ranking model: checking claims based on potential profit • Simple classification models are based on counts • Replication of results. [Figure: ROC curve.]
  • 21. Non-standard goal functions. [Figure: Profit according to the model ordering.]
  • 22. Non-standard goal functions • Claim amount turned out to be a strong predictor • The amount alone could decide about verification: high claims first • Even independently of predictors / model! [Figure: Amount according to the model ordering.]
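The "amount according to the model ordering" plots on slides 21 and 22 can be reproduced in spirit with a cumulative curve: sort the claims by some key, then accumulate the fraudulent amount found after checking the first k claims. The sketch below is an assumption about how such a curve is built, with hypothetical field names and data; it also includes the "high claims first" baseline mentioned on slide 22.

```python
from itertools import accumulate


def cumulative_fraud_amount(claims, order_key):
    """Fraudulent amount recovered after checking the first k claims, for each k."""
    ordered = sorted(claims, key=order_key, reverse=True)
    return list(accumulate(c["amount"] if c["is_fraud"] else 0.0 for c in ordered))


# Illustrative data only.
claims = [
    {"score": 0.9, "amount": 2_000.0, "is_fraud": True},
    {"score": 0.7, "amount": 15_000.0, "is_fraud": False},
    {"score": 0.6, "amount": 9_000.0, "is_fraud": True},
    {"score": 0.2, "amount": 500.0, "is_fraud": False},
]

by_model = cumulative_fraud_amount(claims, lambda c: c["score"])
by_amount = cumulative_fraud_amount(claims, lambda c: c["amount"])
print(by_model)
print(by_amount)
```

Comparing the two curves at the number of claims the team can actually investigate shows whether the model adds anything over the amount-only ordering.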
  • 23. False predictors. Ranking (VIP-alike): iteratively removing the best feature and rebuilding the model:
      1. Active features: all
      2. Build a model using the active features
      3. Calculate AUC and a feature ranking
      4. Deactivate the best feature according to the ranking
      5. Go to 2 until all features are inactive
      6. Plot and conclude
  • 24. False predictors. False predictors (red) can be anywhere!
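The iterative procedure from slide 23 can be sketched as a loop around any model-fitting routine. The project used boosted trees; to stay self-contained, the stand-in model below simply sums the active features and ranks them by single-feature AUC. `removal_curve`, `toy_fit_and_rank`, the feature names and the data are all hypothetical.

```python
def auc(scores, labels):
    """Pairwise AUC: share of (fraud, non-fraud) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def removal_curve(data, labels, fit_and_rank):
    """Repeatedly fit, record AUC, and deactivate the top-ranked feature."""
    active = list(data)                                   # step 1: all active
    curve = []
    while active:                                         # step 5: until empty
        subset = {name: data[name] for name in active}
        model_auc, best = fit_and_rank(subset, labels)    # steps 2 and 3
        curve.append((len(active), best, model_auc))
        active.remove(best)                               # step 4
    return curve                                          # step 6: plot this


def toy_fit_and_rank(features, labels):
    """Stand-in for a real model: score = sum of features,
    importance = single-feature AUC."""
    n = len(labels)
    scores = [sum(col[i] for col in features.values()) for i in range(n)]
    best = max(features, key=lambda name: auc(features[name], labels))
    return auc(scores, labels), best


labels = [1, 1, 0, 0, 1, 0]
data = {
    # "leak" mimics a false predictor, e.g. a field filled in only
    # after a claim has been investigated.
    "leak": [1, 1, 0, 0, 1, 0],
    "amount_z": [0.8, 0.2, 0.5, 0.1, 0.9, 0.3],
    "noise": [0.1, 0.9, 0.8, 0.2, 0.3, 0.7],
}

curve = removal_curve(data, labels, toy_fit_and_rank)
for n_active, removed, model_auc in curve:
    print(n_active, removed, round(model_auc, 3))
```

A near-perfect AUC that collapses as soon as one feature is removed is the kind of pattern that would prompt a closer look at that feature. As slide 24 warns, false predictors are not always at the top of the ranking, so the whole curve is worth inspecting.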
  • 26. Project results: first quarter of using the full-scope solution • Detection rate improvement in the 1st quarter: +60% • True positives > 80% • ROI = less than 2 months (!) • Predictive models responsible for 30-40% of the final fiscal results. https://m.bankier.pl/wiadomosc/Polowanie-na-dawcow-polis-czyli-na-nas-7599390.html
    Beneficiaries of the DataWalk & R implementation:
    Vice President Claims • Extremely positive project ROI. • Reduction of technology providers. • Results accomplished 6x faster and ~20x cheaper than a similar project at a key competitor. • Warta strengthens its position as a market innovator in the claim handling area.
    Head of Anti-Fraud Department • Impressively improved business results. • Higher satisfaction and trust in analytics among team members. • Knowledge accumulation and knowledge sharing within the team.
    Head Analysts • Full control over the analytical environment. • Access to all data without engaging IT. • Expert scoring, machine learning and investigations in one place. • Possibility to test new fraud schemes.
  • 27. Summary • Predictive models alone gave a fraction of the total ROI • The business goal is not always simply to minimize losses or maximize income directly • It’s pretty common for DS/ML projects to bring additional profit as a side effect • ROI for such projects should be measurable and high (but not necessarily fast) for carefully chosen business cases • Predictive models can be significantly improved without spending much (hyperparameter tuning, goal function, methods etc.) • There are pitfalls to avoid! • Usually you don’t need fancy hardware / software (PCs + R!)
  • 28. What’s important in fraud detection with ML?
  • 29. Business & Analytics • Find a good business case (volume big enough) • State the business goal and carefully translate it into analytics: use the right goal function • Correct process of model building • Controlled implementation • Measuring model effectiveness compared to no model / the previous situation, using the right KPIs (not always simple, not always possible)
  • 30. Process & Data • Check if modeling is possible with supervised models (fraud flags stored; correct and representative sample; good data coverage) • Data preparation is the most important factor • Use many data sources • Data enhancement: aggregates from historical data, textual, external • Cost of data preparation! • Detection of false predictors: if they are not detected, the model degrades in production (detection is arduous for wide data)
  • 31. Data sources • "Plain" data: basic • Complete data related to the loss, claim, and parties involved • Flags of historical frauds • "Plain" data: enhanced • Using ZIP codes and additional statistics, e.g. fraction of forest area, unemployment rate • Weather data • Analysis of connections (SNA) • Text (words from a list, n-grams, others) • Analysis of the neighbourhood using maps
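The text features named on slide 31 (words from a list, n-grams) can be derived from claim notes with very little code. The sketch below is a minimal illustration, assuming English notes and a naive regular-expression tokenizer; the keyword list and function names are hypothetical, and real claim notes (Polish, free-form) would need proper language-specific preprocessing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_flags(tokens, keywords):
    """One binary feature per keyword: does it occur in the note?"""
    present = set(tokens)
    return {f"has_{k}": int(k in present) for k in keywords}


def ngram_counts(tokens, n=2):
    """Counts of consecutive word n-grams (sentence boundaries are ignored)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Example note paraphrased from the claim descriptions in this deck.
note = "He wanted to change the lane and hit my car. The police was called."
tokens = tokenize(note)

flags = keyword_flags(tokens, ["police", "witness"])
bigrams = ngram_counts(tokens, 2)
print(flags)
print(bigrams.most_common(3))
```

The resulting flags and counts can be added to the ABT as ordinary numeric columns.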
  • 32. What influences model quality? • Solving the right business problem • Sample representativeness • Goal function in line with the business goal • Right model complexity and the correct model building process • Costs of misclassifications, e.g. false alarm rate • Black-box prediction explanations → proving fraud attempts → improving actionability • It’s pretty hard to get everything in a single model • Validation of the model and careful testing of its implementation
  • 33. Methods • Committees / ensembles of trees / boosted trees: good results, possible to use different goal functions, variable importance, handling NAs → use this! • Deep Neural Networks: for data that is complex enough but still has the same structure • Manual feature extraction not necessary • Any (almost) goal function • Recurrent Neural Networks: working directly on events, not on aggregates from the ABT • Using explanations of black-box model predictions (LIME and its friends) to improve actionability
  • 34. How to improve a model? • Average model • vs. human / rules: +10-30% • Good model • vs. average model: +10-50% (depending on measurement) • predictive power driven by data • Incorrect model • vs. human / rules: +0% (or losses) • works on a computer only. (Assuming that the goal function and actionability remain unchanged.)
  • 35. About me • Commercial experience in DS / ML: > 20 years, ~ 100 projects, ~ 3,000 hours of workshops • Translating a business problem into an analytics problem + choosing adequate means to solve the latter • Founder & owner of QuantUp DS / ML firm • Contact me if you need: • During the conference • After the conference: artur@quantup.pl