E.D.A
By
Adithi – E19002
Bhaswani – E19009
Neha – E19018
BRIEF OVERVIEW:
• Objective: identify the attributes that most influence the decision to accept or reject a loan application.
• Context of the data set: the original dataset contains 1000 entries with 20 categorical/symbolic attributes. Each entry represents a person who takes credit from a bank, and each person is classified as a good or bad credit risk according to the set of attributes.
ATTRIBUTES:
S.No  Variable                        Description                                                 Data type
1     Credibility                     1: credit-worthy (good risk)                                Categorical
                                      0: not credit-worthy (bad risk)
2     Balance of current account      1: no running account                                       Categorical
                                      2: no balance or debit
                                      3: 0 <= ... < 200 DM
                                      4: ... >= 200 DM or checking account for at least 1 year
3     Duration in months (metric)     <= 12: up to 1 year                                         Numerical
                                      12 < ... <= 24: 1-2 years
                                      24 < ... <= 36: 2-3 years
                                      36 < ... <= 48: 3-4 years
                                      48 < ... <= 60: 4-5 years
                                      60 < ... <= 72: 5-6 years
4     Payment of previous credits     0: hesitant payment of previous credits                     Categorical
                                      1: problematic running account / there are further
                                         credits running but at other banks
                                      2: no previous credits / paid back all previous credits
                                      3: no problems with current credits at this bank
                                      4: paid back previous credits at this bank
5     Purpose of credit               0: other                                                    Categorical
                                      1: new car
                                      2: used car
                                      3: items of furniture
                                      4: radio / television
                                      5: household appliances
                                      6: repair
                                      7: education
                                      8: vacation
                                      9: retraining
                                      10: business
S.No  Variable                        Description                                                 Data type
6     Amount of credit in DM          1: <= 1500                                                  Numerical
                                      2: 1500 < ... <= 4500
                                      3: 4500 < ... <= 7500
                                      4: 7500 < ... <= 10500
                                      5: 10500 < ... <= 13500
                                      6: 13500 < ... <= 16500
                                      7: > 16500
7     Value of savings or stocks      1: not available / no savings                               Categorical
                                      2: < 100
                                      3: 100 <= ... < 500
                                      4: 500 <= ... < 1000
                                      5: >= 1000
8     Has been employed by current    1: unemployed                                               Categorical
      employer                        2: < 1
                                      3: 1 <= ... < 4
                                      4: 4 <= ... < 7
                                      5: >= 7
9     Instalment rate in % of         1: >= 35                                                    Categorical
      available income                2: 25 <= ... < 35
                                      3: 20 <= ... < 25
                                      4: < 20
10    Marital Status / Sex            1: male, divorced / living apart                            Categorical
                                      2: male, single
                                      3: male, married / widowed
                                      4: female
11    Further debtors / Guarantors    1: none                                                     Categorical
                                      2: co-applicant
                                      3: guarantor
12    Living in current household for 1: < 1 year                                                 Categorical
                                      2: 1 <= ... < 4 years
                                      3: 4 <= ... < 7 years
                                      4: >= 7 years
S.No  Variable                        Description                                                 Data type
13    Most valuable available assets  1: not available / no assets                                Categorical
                                      2: car / other
                                      3: savings contract with a building society / life insurance
                                      4: ownership of house or land
14    Age in years (categorized)      1: 0 <= ... <= 25                                           Numerical
                                      2: 26 <= ... <= 39
                                      3: 40 <= ... <= 59
                                      4: 60 <= ... <= 64
                                      5: >= 65
15    Further running credits         1: at other banks                                           Categorical
                                      2: at department store or mail order house
                                      3: no further running credits
16    Type of apartment               1: rented                                                   Categorical
                                      2: owned
                                      3: free
17    Number of previous credits at   1: one                                                      Categorical
      this bank (including the        2: two or three
      running one)                    3: four or five
                                      4: six and above
18    Occupation                      1: unemployed / unskilled with no permanent residence       Categorical
                                      2: unskilled with permanent residence
                                      3: skilled worker / skilled employee / minor civil servant
                                      4: executive / self-employed / higher civil servant
19    Number of persons entitled to   1: 3 and more                                               Numerical
      maintenance                     2: 0 to 2
20    Telephone                       1: no                                                       Categorical
                                      2: yes
21    Foreign worker                  1: yes                                                      Categorical
                                      2: no
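The coded ranges in the attribute tables can be reproduced with a small binning helper. The following is a minimal sketch, not the authors' code: the function names `amount_code` and `age_code` and the edge lists are illustrative assumptions built from the bin boundaries listed for "Amount of credit in DM" and "Age in years (categorized)", with ages assumed to be whole years.

```python
from bisect import bisect_left

# Inclusive upper edges of the credit-amount bins (codes 1..6); anything above is code 7.
AMOUNT_EDGES = [1500, 4500, 7500, 10500, 13500, 16500]

# Inclusive upper edges of the age bins (codes 1..4); 65 and above is code 5.
AGE_EDGES = [25, 39, 59, 64]


def amount_code(amount_dm):
    """Return the category code (1..7) for a credit amount in DM."""
    # bisect_left counts the edges strictly below the value, so a value
    # equal to an edge stays in the lower bin (the bins are "<= edge").
    return bisect_left(AMOUNT_EDGES, amount_dm) + 1


def age_code(age_years):
    """Return the category code (1..5) for an age in whole years."""
    return bisect_left(AGE_EDGES, age_years) + 1


print(amount_code(1500), amount_code(1501), amount_code(20000))
print(age_code(25), age_code(26), age_code(65))
```

Using `bisect_left` keeps the "upper edge inclusive" convention of the tables without a chain of if/else branches.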
From the data:
• The population is split in a 70:30 proportion of good to bad risk.
• We have 4 numeric and 16 categorical features.
• A few non-influencing variables may not contribute to decision making.
• To find the most influential variables, we adopted the WOE-IV technique.
WEIGHT OF EVIDENCE - INFORMATION VALUE (WOE - IV)
WOE and IV are simple yet powerful techniques for variable transformation and selection. They are widely used in credit scoring to measure the separation of good vs bad customers.
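As an illustration of the "variable transformation" use of WOE, the sketch below replaces each category with its weight of evidence, taken here as ln(share of goods in the category / share of bads in the category). The function name `woe_table` and the toy records are hypothetical and are not part of the deck, and the sketch assumes every category contains at least one good and one bad record.

```python
import math
from collections import Counter


def woe_table(categories, is_good):
    """Map each category to its weight of evidence, ln(DG / DB)."""
    good, bad = Counter(), Counter()
    for cat, flag in zip(categories, is_good):
        (good if flag else bad)[cat] += 1
    total_good = sum(good.values())
    total_bad = sum(bad.values())
    table = {}
    for cat in set(good) | set(bad):
        dg = good[cat] / total_good   # share of all goods in this category
        db = bad[cat] / total_bad     # share of all bads in this category
        table[cat] = math.log(dg / db)  # fails if a category has no goods or no bads
    return table


# Toy data (hypothetical): category "A" is mostly good, "B" is mostly bad.
categories = ["A", "A", "A", "A", "B", "B", "B", "B"]
is_good = [1, 1, 1, 0, 1, 0, 0, 0]

woe = woe_table(categories, is_good)
encoded = [woe[c] for c in categories]  # the WOE-transformed variable
print(woe)
```

A positive WOE marks a category with relatively more good customers, a negative one a category with relatively more bad customers.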
COMPUTATION & INTERPRETATION
(DB = distribution of bad loans, DG = distribution of good loans)
Age Group  Total Loans  Bad Loans  Good Loans  % Bad Loans  Group  DB     DG     WOE     DG - DB  (DG - DB) * WOE
21 - 30    4821         206        4615        4.3%         G1     0.135  0.078  -0.553  -0.057   0.0318
30 - 36    10266        357        9909        3.5%         G2     0.235  0.167  -0.339  -0.067   0.0228
36 - 48    32926        776        32150       2.4%         G3     0.510  0.542  0.062   0.032    0.0020
48 - 60    12788        183        12605       1.4%         G4     0.120  0.213  0.570   0.092    0.0527
Total      60801        1522       59279       Information Value --> 0.1093
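The columns of this worked example follow from the good and bad counts alone: DG and DB are each group's share of all good and all bad loans, WOE = ln(DG / DB), and the Information Value is the sum of (DG - DB) * WOE over the groups. A minimal sketch that recomputes the table is given below; the function name `woe_iv` is an illustrative assumption, and the counts are copied from the age-group example.

```python
import math

# (good loans, bad loans) per age group, copied from the worked example.
age_groups = {
    "21 - 30": (4615, 206),
    "30 - 36": (9909, 357),
    "36 - 48": (32150, 776),
    "48 - 60": (12605, 183),
}


def woe_iv(groups):
    """Return ({group: (DG, DB, WOE, contribution)}, IV) from good/bad counts."""
    total_good = sum(good for good, _ in groups.values())
    total_bad = sum(bad for _, bad in groups.values())
    rows = {}
    for name, (good, bad) in groups.items():
        dg = good / total_good    # distribution of good loans
        db = bad / total_bad      # distribution of bad loans
        woe = math.log(dg / db)   # natural logarithm
        rows[name] = (dg, db, woe, (dg - db) * woe)
    iv = sum(row[3] for row in rows.values())
    return rows, iv


rows, iv = woe_iv(age_groups)
for name, (dg, db, woe, contribution) in rows.items():
    print(f"{name}: DG={dg:.3f} DB={db:.3f} WOE={woe:.3f} contribution={contribution:.4f}")
print(f"Information Value: {iv:.4f}")
```

Run on these counts, the sketch should land on an Information Value of about 0.1093, matching the total in the table.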
Age: the higher the age, the higher the credibility. Above sixty years, i.e. after retirement, credibility drops.
IV: 0.093
Weak predictive power
Sex & Marital Status: females show good credibility. Among males, married applicants show high credibility.
IV: 0.045
Weak predictive power
Account balance: the higher the balance in the account, the higher the probability of falling into the good-risk class.
IV:
Savings account (SB): 0.196, medium predictive power
Current account (CA): 0.666, suspicious predictive power / too good to rely on
Predictive power: CA > SB
Duration In Months
The shorter the duration, the lower the bad risk.
IV: 0.166
Medium predictive power
Amount of credit
The lower the amount, the lower the bad risk. However, amounts <= 1500 also show a slight increase in bad risk.
IV: 0.165
Medium predictive power
PURPOSE OF CREDIT
If the purpose of the loan is to create an asset, good risk should be high, whereas if the purpose is an expenditure, bad risk should be high.
Vacation, however, shows high good risk. On further observation, only 9 loans (0.9%, not even 1%) were given for vacation, so this category is ignored.
IV: 0.166
Medium predictive power
PURPOSE       0    1    2    3    4    5    6    8    9    10
NOT CREDIBLE  89   17   58   62   4    8    22   1    34   5
CREDIBLE      145  86   123  218  8    14   28   8    63   7
Grand Total   234  103  181  280  12   22   50   9    97   12
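The vacation observation can be checked directly from the purpose counts. The sketch below is only an illustration with assumed variable names (`purpose_counts`, `share`, `good_rate`); it computes each purpose's share of all loans and its proportion of credible customers, using the counts from the table.

```python
# purpose code -> (not credible, credible), copied from the purpose table.
purpose_counts = {
    0: (89, 145), 1: (17, 86), 2: (58, 123), 3: (62, 218), 4: (4, 8),
    5: (8, 14), 6: (22, 28), 8: (1, 8), 9: (34, 63), 10: (5, 7),
}

total = sum(bad + good for bad, good in purpose_counts.values())

share = {}      # fraction of all loans taken for each purpose
good_rate = {}  # fraction of credible customers within each purpose
for code, (bad, good) in purpose_counts.items():
    share[code] = (bad + good) / total
    good_rate[code] = good / (bad + good)

for code in sorted(purpose_counts):
    print(f"purpose {code}: share={share[code]:.1%} good rate={good_rate[code]:.1%}")
```

For purpose 8 (vacation) this gives a share of 0.9% of the 1000 loans and a good rate of 8 out of 9, which is why the high good rate rests on too few cases to be relied on.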
Length of current employment: the more years of employment, the higher the credibility.
IV: 0.086
Weak predictive power
Most valuable available asset: people with no assets have a high probability of falling into the credible category.
IV: 0.113
Medium predictive power
Payment Of Previous Credits
Bad risk is observed in people who were hesitant to pay previous credits.
IV: 0.293
Medium predictive power
Instalment rate: bad risk is observed in people whose instalment is a lower % of their income, which is counter-intuitive, though the pattern almost mirrors the population distribution.
IV: 0.026
Weak predictive power
Number of credits: the higher the number of credits availed, the higher the credibility, but only up to 6 credit facilities.
IV: 0.013
Not useful for prediction
Further running credits: people with no current credits show high credibility.
IV: 0.085
Weak predictive power
Guarantors: if the loan is secured by a guarantor, it shows high credibility.
IV: 0.032
Weak predictive power
Foreign worker: people who work abroad show high credibility.
IV: 0.087
Weak predictive power
Housing: people living in rented housing show high credibility.
IV: 0.085
Weak predictive power
[Chart labels: 17.9% / 71.4% / 10.7%; 96.3% / 3.7%]
Non-influencing variables: these simply mirror the 70:30 population distribution.
IV VALUES:
Further analysis:
ATTRIBUTE                           IV       INTERPRETATION
Current Account Balance             0.666    Suspicious predictive power
Payment Status Of Previous Credit   0.293    Medium predictive power
Value Savings/Stocks                0.196    Medium predictive power
Purpose                             0.166    Medium predictive power
Duration Of Credit (Month)          0.165    Medium predictive power
Credit Amount                       0.119    Medium predictive power
Most Valuable Available Asset       0.113    Medium predictive power
Age                                 0.093    Weak predictive power
Foreign Worker                      0.087    Weak predictive power
Length Of Current Employment        0.086    Weak predictive power
Housing                             0.085    Weak predictive power
Concurrent Credit                   0.058    Weak predictive power
Sex & Marital Status                0.045    Weak predictive power
Guarantor / Debtor                  0.032    Weak predictive power
Instalment Per Cent                 0.026    Weak predictive power
No Of Credits                       0.013    Not useful for prediction
Telephone                           0.01     Not useful for prediction
Occupation                          0.009    Not useful for prediction
Duration In Current House           0.004    Not useful for prediction
Dependents                          0.00004  Not useful for prediction
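The interpretation column is consistent with the commonly quoted rule-of-thumb IV bands. The sketch below encodes those bands as an assumption, since the deck does not state its cut-offs: below 0.02 not useful, 0.02 to 0.1 weak, 0.1 to 0.3 medium, 0.3 to 0.5 strong, above 0.5 suspicious. The name `interpret_iv` and the "medium band" selection rule are illustrative; the deck does not list which variables were actually carried forward.

```python
def interpret_iv(iv):
    """Label an Information Value using rule-of-thumb bands (assumed cut-offs)."""
    if iv < 0.02:
        return "Not useful for prediction"
    if iv < 0.1:
        return "Weak predictive power"
    if iv < 0.3:
        return "Medium predictive power"
    if iv <= 0.5:
        return "Strong predictive power"
    return "Suspicious predictive power"


# IV values copied from the table above.
iv_values = {
    "Current Account Balance": 0.666,
    "Payment Status Of Previous Credit": 0.293,
    "Value Savings/Stocks": 0.196,
    "Purpose": 0.166,
    "Duration Of Credit (Month)": 0.165,
    "Credit Amount": 0.119,
    "Most Valuable Available Asset": 0.113,
    "Age": 0.093,
    "Foreign Worker": 0.087,
    "Length Of Current Employment": 0.086,
    "Housing": 0.085,
    "Concurrent Credit": 0.058,
    "Sex & Marital Status": 0.045,
    "Guarantor / Debtor": 0.032,
    "Instalment Per Cent": 0.026,
    "No Of Credits": 0.013,
    "Telephone": 0.01,
    "Occupation": 0.009,
    "Duration In Current House": 0.004,
    "Dependents": 0.00004,
}

labels = {name: interpret_iv(iv) for name, iv in iv_values.items()}

# Hypothetical selection rule: keep the variables in the medium band.
selected = [name for name, iv in iv_values.items() if 0.1 <= iv < 0.3]
print(selected)
```

Whether a "suspicious" variable such as the current account balance should be kept is a judgment call that this sketch leaves open.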
CHOOSING MODEL
• When a customer applies for a loan, the bank accepts or rejects the application based on its predicted risk (probability of default).
• Since this is an objective segmentation problem, we need a target (dependent) variable. Here it is whether a customer is a good or bad risk for the loan.
• In an objective segmentation problem, the aim is to find conditions that define segments whose members have very similar values of the target variable.
• A decision tree is one of the commonly used objective segmentation techniques.
• Based on WOE-IV, we chose the variables with good predictive power to build a decision tree.
DECISION TREE:
Interpretation (depth 3):
• Train-test split: 70:30
• Class 1: credible
• Class 0: not credible
• Depth: 3
• Accuracy: 0.76
• Precision: 0.77
• Sensitivity: 0.92
• Specificity: 0.35
• F1 score: 0.84
Interpretation (depth 4):
• Train-test split: 70:30
• Class 1: credible
• Class 0: not credible
• Depth: 4
• Accuracy: 0.74
• Precision: 0.77
• Sensitivity: 0.89
• Specificity: 0.37
• F1 score: 0.83
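For reference, the reported measures all derive from the confusion matrix of the tree, with class 1 (credible) as the positive class. The sketch below shows those definitions; the function names and the toy confusion counts are hypothetical, since the deck does not give the actual confusion matrices. The final two lines only check that the reported F1 scores agree with the reported precision and sensitivity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the measures listed on the slide from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall on the credible class
    specificity = tn / (tn + fp)   # recall on the not-credible class
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1_score(precision, sensitivity),
    }


def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Toy counts (hypothetical, not the deck's results).
toy = classification_metrics(tp=90, fp=30, tn=20, fn=10)
print({name: round(value, 3) for name, value in toy.items()})

# Consistency check on the reported figures.
print(round(f1_score(0.77, 0.92), 2))  # depth 3
print(round(f1_score(0.77, 0.89), 2))  # depth 4
```

The last two lines give 0.84 and 0.83, in agreement with the F1 scores reported for depth 3 and depth 4. The low specificity relative to sensitivity indicates that both trees identify credible customers far better than non-credible ones.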
FURTHER ANALYSIS TO BE CONTD..
THANK YOU…!
Queries..?

exploratory data analysis on german credit data

  • 1. E.D.A By Adithi – E19002 Bhaswani – E19009 Neha – E19018
  • 2. BRIEF OVERVIEW:  To identify the attributes having influential power in decision making to either reject or accept loan application.  Context of the data set: The original dataset contains 1000 entries with 20 categorical/symbolic attributes. In this dataset, each entry represents a person who takes a credit by a bank. Each person is classified as good or bad credit risks according to the set of attributes.
  • 3. S.No Variable Description Data type 1 Credibility 1 : credit-worthy; [good risk] 0 : not credit-worthy [ bad risk ] Categorical 2 Balance of current account no running account - 1 No balance or debit -2; 0 <= ... < 200 DM – 3; ... >= 200 DM or checking account for at least 1 year-4; Categorical 3 Duration in months (metric) [<=12] – up to 1 year [12< ... <= 24] – 1-2 years [24 < ... <= 36] – 2-3 years [36 < ... <= 48] – 3- 4 years [48< ... <= 60] – 4-5 years [60 < ... <= 72] – 5-6 years NUMERICAL 4 Payment of previous credits no previous credits / paid back all previous credits - 2 paid back previous credits at this bank - 4 no problems with current credits at this bank - 3 problematic running account / there are further credits running but at other banks – 1 hesitant payment of previous credits - 0 CATEGORICAL 5 Purpose of credit new car - 1 used car - 2 items of furniture - 3 radio / television - 4 household appliances- 5 Repair -6 Education - 7 Vacation- 8 Retraining -9 Business- 10 Other -0 CATEGORICAL ATTRIBUTES:
  • 4. S.No Variable Description Data type 6 Amount of credit in DM [<=1500 ] - 1; [1500 < ... <= 4500] - 2; [4500 < ... <= 7500] - 3; [7500 < ... <= 10500] - 4; [10500 < ... <=13500] - 5; [13500 < ... <= 16500] - 6; [> 16500] - 7 Numerical 7 Value of savings or stocks not available / no savings - 1 [< 100], - 2 [100,- <= ... < 500], - 3 [500,- <= ... < 1000], - 4 [>= 1000], - 5 Categorical 8 Has been employed by current employer For Unemployed - 1 [<= 1] - 2 [1 <= ... < 4 ] - 3 [4 <= ... < 7]- 4 [>= 7] - 5 Categorical 9 rate Instalment in % of available income [>= 35] - 1 [25 <= ... < 35] - 2 [20 <= ... < 25] - 3 [< 20] - 4 Categorical 10 Marital Status / Sex male: divorced / living apart – 1; male: single- 2 male: married / widowed – 3; female: 4 Categorical 11 Further debtors / Guarantors None – 1; Co-Applicant – 2; Guarantor - 3 Categorical 12 Living in current household for [< 1 year] - 1 [1 <= ... < 4 ] years - 2 [4 <= ... < 7] years - 3 [ >= 7 ] years - 4 Categorical
  • 5. S.No Variable Description Data type 13 Most valuable available assets Ownership of house or land - 4 Savings contract with a building society / life insurance - 3 Car / other - 2 Not available / no assets -1 Categorical 14 Age in years (categorized) [0 <= ... <= 25] - 1 [ 26 <= ... <= 39 ] - 2 [ 40 <= ... <= 59] - 3 [ 60 <= ... <= 64 ] - 4 [ >= 65 ] - 5 Numerical 15 Further running credits At other banks – 1 At department store or mail order house - 2 No further running credits – 3 Categorical 16 Type of apartment Rented-1; owned – 2 ; free - 3 Categorical 17 Number of previous credits at this bank (including the running one) One- 1; two or three – 2; four or five – 3; six and above - 4 Categorical 18 Occupation Unemployed / unskilled with no permanent residence - 1 Unskilled with permanent residence - 2 Skilled worker / skilled employee / minor civil servant - 3 Executive / self-employed / higher civil servant - 4 Categorical 19 Number of persons entitled to maintenance 0 to 2 – 2 ; 3 and more - 1 Numerical 20 Telephone No- 1 ; yes - 2 Categorical 21 Foreign worker Yes- 1; no - 2 Categorical
  • 6. • We have the population distribution in proposition of 70:30 risk wise • We have 4 numeric and 16 categorical features. • Few non influencing variables which may not contribute for decision making • To find, which is the most influencing variable, we adapted a techniques – WOE-IV From the data:
  • 8. WOE & IV are simple, yet powerful techniques to perform variable transformation and selection. It is widely used in credit scoring to measure the separation of good vs bad customers.
  • 10. Age Group Total Number of Loans Number of Bad Loans Numbef of Good Loans % Bad Loans Name of Group Distibution Bad (DB) Distibution Good (DG) WOE DG - DB (DG - DB)* WOE 21 - 30 4821 206 4615 4.3% G1 0.135 0.078 -0.553 -0.057 0.0318 30 - 36 10266 357 9909 3.5% G2 0.235 0.167 -0.339 -0.067 0.0228 36 - 48 32926 776 32150 2.4% G3 0.510 0.542 0.062 0.032 0.0020 48 - 60 12788 183 12605 1.4% G4 0.120 0.213 0.570 0.092 0.0527 Total 60801 1522 59279 Information Value --> 0.1093
  • 11. Age: The higher the age, the higher the credibility, but above sixty years (i.e., after retirement) credibility falls. IV: 0.093 (weak predictive power)
    Sex & marital status: Females have good credibility; among males, married men have high credibility. IV: 0.045 (weak predictive power)
  • 12. Account balance: The higher the balance in the account, the higher the probability of falling into good risk.
    IV, savings account: 0.196 (medium predictive power)
    IV, current account: 0.666 (suspicious predictive power / too good to rely on)
    Predictive power: current account (CA) > savings account (SB)
  • 13. Duration in months: The shorter the duration, the lower the bad risk. IV: 0.166 (medium predictive power)
    Amount of credit: The lower the amount, the lower the bad risk, although amounts <= 1500 also show a slight increase in bad risk. IV: 0.165 (medium predictive power)
  • 14. PURPOSE OF CREDIT
    If the purpose of the loan is to create an asset, good risk should be high, whereas if the purpose is an expenditure, bad risk should be high. Vacation, however, shows high good risk. On further observation, only 9 loans were given for vacation, under 1% (0.9%) of the total, so this category is ignored.
    IV: 0.166 (medium predictive power)
    PURPOSE | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 9 | 10
    NOT CREDIBLE | 89 | 17 | 58 | 62 | 4 | 8 | 22 | 1 | 34 | 5
    CREDIBLE | 145 | 86 | 123 | 218 | 8 | 14 | 28 | 8 | 63 | 7
    Grand Total | 234 | 103 | 181 | 280 | 12 | 22 | 50 | 9 | 97 | 12
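The "only 9 loans, 0.9%" observation for vacation can be checked directly from the counts on this slide. The sketch below is illustrative only: the function name `purpose_summary` and the dictionary layout are assumptions, and the counts are copied from the slide's table (purpose code 7 does not appear in it).

```python
# Counts per purpose code, copied from the slide's table.
purpose_codes = [0, 1, 2, 3, 4, 5, 6, 8, 9, 10]
not_credible = [89, 17, 58, 62, 4, 8, 22, 1, 34, 5]
credible = [145, 86, 123, 218, 8, 14, 28, 8, 63, 7]

def purpose_summary(codes, good, bad):
    """Hypothetical helper: loan count, share of all loans, and good-risk
    rate for each purpose code."""
    total_all = sum(good) + sum(bad)
    summary = {}
    for code, g, b in zip(codes, good, bad):
        n = g + b
        summary[code] = {
            "loans": n,
            "share": n / total_all,  # fraction of all loans with this purpose
            "good_rate": g / n,      # fraction of this purpose rated credible
        }
    return summary

summary = purpose_summary(purpose_codes, credible, not_credible)
for code, row in summary.items():
    print(code, row["loans"], f"{row['share']:.1%}", f"{row['good_rate']:.1%}")
```

Reading the share next to the good-risk rate makes it easier to spot categories, such as code 8, whose rate looks high but rests on very few loans.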
  • 15. Length of current employment: The more years of employment, the higher the credibility. IV: 0.086 (weak predictive power)
    Most valuable available asset: People with no assets have a high probability of falling into the credible category. IV: 0.113 (medium predictive power)
  • 16. Payment of previous credits: Bad risk is observed in people who are hesitant to pay previous credits. IV: 0.293 (medium predictive power)
    Instalment per cent: Bad risk is observed in people whose instalment is a lower percentage of their income, which is counter-intuitive, though the pattern closely resembles the population split. IV: 0.026 (weak predictive power)
  • 17. Number of credits: The more credits availed, the higher the credibility, but only up to 6 credit facilities. IV: 0.013 (not useful for prediction)
    Further running credits: People with no current credits have high credibility. IV: 0.085 (weak predictive power)
  • 18. Guarantor / debtor: If the loan is secured by a guarantor, it shows high credibility. IV: 0.032 (weak predictive power)
    Foreign worker: People who work abroad show high credibility. IV: 0.087 (weak predictive power)
    Housing: People in rented housing show high credibility. IV: 0.085 (weak predictive power)
    Chart labels on the slide: 17.9%, 71.4%, 10.7%; 96.3%, 3.7%
  • 19. Non-influencing variables: their good/bad split simply mirrors the 70:30 population proportion.
    IV VALUES:
  • 20. Further analysis…!
    ATTRIBUTE | IV | INTERPRETATION
    Current Account Balance | 0.666 | Suspicious predictive power
    Payment Status Of Previous Credit | 0.293 | Medium predictive power
    Value Savings/Stocks | 0.196 | Medium predictive power
    Purpose | 0.166 | Medium predictive power
    Duration Of Credit (Month) | 0.165 | Medium predictive power
    Credit Amount | 0.119 | Medium predictive power
    Most Valuable Available Asset | 0.113 | Medium predictive power
    Age | 0.093 | Weak predictive power
    Foreign Worker | 0.087 | Weak predictive power
    Length Of Current Employment | 0.086 | Weak predictive power
    Housing | 0.085 | Weak predictive power
    Concurrent Credit | 0.058 | Weak predictive power
    Sex & Marital Status | 0.045 | Weak predictive power
    Guarantor / Debtor | 0.032 | Weak predictive power
    Instalment Per Cent | 0.026 | Weak predictive power
    No Of Credits | 0.013 | Not useful for prediction
    Telephone | 0.01 | Not useful for prediction
    Occupation | 0.009 | Not useful for prediction
    Duration In Current House | 0.004 | Not useful for prediction
    Dependents | 0.00004 | Not useful for prediction
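The interpretation column follows a banding of IV values. The slides do not state the cut-offs they used, so the sketch below assumes the commonly cited rule of thumb (below 0.02 not useful, 0.02 to 0.1 weak, 0.1 to 0.3 medium, 0.3 to 0.5 strong, above 0.5 suspicious); the function name `iv_band` is hypothetical.

```python
def iv_band(iv):
    """Map an Information Value to an interpretation label.

    The thresholds are an assumed rule of thumb, not taken from the slides.
    """
    if iv < 0.02:
        return "Not useful for prediction"
    if iv < 0.1:
        return "Weak predictive power"
    if iv < 0.3:
        return "Medium predictive power"
    if iv <= 0.5:
        return "Strong predictive power"
    return "Suspicious predictive power"

# A few attribute/IV pairs copied from the slide's table.
iv_values = {
    "Current Account Balance": 0.666,
    "Payment Status Of Previous Credit": 0.293,
    "Most Valuable Available Asset": 0.113,
    "Age": 0.093,
    "Instalment Per Cent": 0.026,
    "No Of Credits": 0.013,
}

for attribute, iv in iv_values.items():
    print(f"{attribute}: {iv} -> {iv_band(iv)}")
```

Under these assumed thresholds the labels agree with the ones on the slide for the rows shown; none of the attributes in the table falls in the "strong" band.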
  • 21. CHOOSING MODEL
    • When a customer applies for a loan, the bank accepts or rejects the application based on the predicted risk (probability of default) for that application.
    • This is an objective segmentation problem, so we need a target/dependent variable. Here it is whether a customer is a bad or good risk on the loan.
    • In objective segmentation, the aim is to find conditions that define segments whose members are very similar in target variable value.
    • The decision tree is one of the most commonly used objective segmentation techniques.
    • Based on WOE-IV, we chose the variables with good predictive power for building a decision tree.
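To make "segments that are very similar on the target" concrete, the sketch below scores one candidate split by Gini impurity, a criterion decision trees commonly use. The slides do not say which criterion was used, so Gini is an assumption here, as are the helper names `gini` and `weighted_gini`. The counts come from the purpose table on slide 14: 700 credible and 300 not credible overall, with 86 and 17 respectively for purpose code 1.

```python
def gini(good, bad):
    """Gini impurity of a segment with the given good/bad counts."""
    n = good + bad
    p_good = good / n
    return 1.0 - p_good ** 2 - (1.0 - p_good) ** 2

def weighted_gini(segments):
    """Size-weighted average impurity over (good, bad) segments."""
    total = sum(g + b for g, b in segments)
    return sum((g + b) / total * gini(g, b) for g, b in segments)

parent = (700, 300)            # whole population: credible, not credible
left = (86, 17)                # candidate segment: purpose code 1
right = (700 - 86, 300 - 17)   # everything else

parent_impurity = gini(*parent)
split_impurity = weighted_gini([left, right])
print(round(parent_impurity, 4))
print(round(split_impurity, 4))
print(round(parent_impurity - split_impurity, 4))  # impurity reduction
```

A tree builder would evaluate many such candidate conditions and keep the one with the largest impurity reduction, then repeat inside each resulting segment until the depth limit is reached.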
  • 22. DECISION TREE:
    Interpretation:
    • Train-test split: 70:30
    • Class 1: credible
    • Class 0: not credible
    • Depth: 3
    • Accuracy: 0.76
    • Precision: 0.77
    • Sensitivity: 0.92
    • Specificity: 0.35
    • F1 score: 0.84
  • 23. Interpretation:
    • Train-test split: 70:30
    • Class 1: credible
    • Class 0: not credible
    • Depth: 4
    • Accuracy: 0.74
    • Precision: 0.77
    • Sensitivity: 0.89
    • Specificity: 0.37
    • F1 score: 0.83
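The F1 scores reported for the two trees can be cross-checked from the reported precision and sensitivity, since F1 is the harmonic mean of precision and recall (sensitivity). This is a small sanity-check sketch, not the presenters' evaluation code; the function name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and sensitivity as reported on the two decision tree slides.
f1_depth3 = f1_from(0.77, 0.92)
f1_depth4 = f1_from(0.77, 0.89)

print(round(f1_depth3, 2))
print(round(f1_depth4, 2))
```

Both values round to the F1 scores shown on the slides, so the reported precision, sensitivity, and F1 figures are mutually consistent.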
  • 24. FURTHER ANALYSIS TO BE CONTD.. THANK YOU…! Queries..?