2. RELEVANT INFORMATION
The data relate to direct marketing campaigns of a Portuguese banking institution.
The marketing campaigns were based on phone calls. Often, more than one contact with
the same client was required in order to assess whether the product (a bank term deposit)
would be subscribed or not.
3. DATA ATTRIBUTES
Number of Instances: 45211
Number of Attributes: 16 + output attribute.
Attribute information:
Input variables:
# bank client data:
1 - age (numeric)
2 - job : type of job (categorical: "admin.", "unknown", "unemployed", "management",
"housemaid", "entrepreneur", "student", "blue-collar", "self-employed", "retired", "technician",
"services")
3 - marital : marital status (categorical: "married", "divorced", "single"; note: "divorced" means
divorced or widowed)
4 - education (categorical: "unknown", "secondary", "primary", "tertiary")
5 - default: has credit in default? (binary: "yes", "no")
6 - balance: average yearly balance, in euros (numeric)
7 - housing: has housing loan? (binary: "yes", "no")
8 - loan: has personal loan? (binary: "yes","no")
4. DATA ATTRIBUTES…
# related to the last contact of the current campaign:
9 - contact: contact communication type (categorical: "unknown","telephone","cellular")
10 - day: last contact day of the month (numeric)
11 - month: last contact month of year (categorical: "jan", "feb", "mar", ..., "nov", "dec")
12 - duration: last contact duration, in seconds (numeric)
# other attributes:
13 - campaign: number of contacts performed during this campaign and for this client
(numeric, includes last contact)
14 - pdays: number of days since the client was last contacted in a previous
campaign (numeric, -1 means client was not previously contacted)
15 - previous: number of contacts performed before this campaign and for this client
(numeric)
16 - poutcome: outcome of the previous marketing campaign (categorical:
"unknown","other","failure","success")
5. EXPECTED OUTPUT
Output variable (desired target):
17 - y - has the client subscribed to a term deposit? (binary: "yes", "no")
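The attribute list above can be turned into a simple record check. The following is a minimal sketch, not part of the original analysis: the names CATEGORICAL, NUMERIC, validate_record and decode_pdays are hypothetical, the months hidden behind "..." in the list are assumed to be the usual three-letter abbreviations, and the example record is made up for illustration.

```python
from typing import Optional

# Allowed categories, copied from the attribute list.
CATEGORICAL = {
    "job": {"admin.", "unknown", "unemployed", "management", "housemaid",
            "entrepreneur", "student", "blue-collar", "self-employed",
            "retired", "technician", "services"},
    "marital": {"married", "divorced", "single"},
    "education": {"unknown", "secondary", "primary", "tertiary"},
    "default": {"yes", "no"},
    "housing": {"yes", "no"},
    "loan": {"yes", "no"},
    "contact": {"unknown", "telephone", "cellular"},
    # Assumption: the elided months follow the same three-letter pattern.
    "month": {"jan", "feb", "mar", "apr", "may", "jun",
              "jul", "aug", "sep", "oct", "nov", "dec"},
    "poutcome": {"unknown", "other", "failure", "success"},
    "y": {"yes", "no"},
}
NUMERIC = ("age", "balance", "day", "duration", "campaign", "pdays", "previous")


def validate_record(record: dict) -> list:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name in NUMERIC:
        value = record.get(name)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number, got {value!r}")
    for name, allowed in CATEGORICAL.items():
        value = record.get(name)
        if value not in allowed:
            problems.append(f"{name}: unexpected category {value!r}")
    return problems


def decode_pdays(pdays: int) -> Optional[int]:
    """Map the documented sentinel -1 (never contacted before) to None."""
    return None if pdays == -1 else pdays


# Hypothetical record, for illustration only.
example = {
    "age": 41, "job": "technician", "marital": "married",
    "education": "secondary", "default": "no", "balance": 1200,
    "housing": "yes", "loan": "no", "contact": "cellular", "day": 15,
    "month": "may", "duration": 180, "campaign": 2, "pdays": -1,
    "previous": 0, "poutcome": "unknown", "y": "no",
}
```

Calling validate_record(example) returns an empty list, while a record with marital set to "widowed" is flagged, since widowed clients are coded as "divorced" in this data.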
16. FEATURE SELECTION – BEST PREDICTORS
Using Feature Selection and Variable Screening (Statistica)
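The screening method used inside Statistica is not described on this slide. As an illustration only, one common way to screen a categorical predictor against the yes/no target is the Pearson chi-square statistic. The sketch below assumes that choice; the function name chi_square and the toy data are hypothetical and are not taken from the bank data.

```python
from collections import Counter


def chi_square(feature_values, target_values):
    """Pearson chi-square statistic for a categorical feature vs. a target."""
    if len(feature_values) != len(target_values):
        raise ValueError("feature and target must have the same length")
    n = len(feature_values)
    observed = Counter(zip(feature_values, target_values))
    row_totals = Counter(feature_values)
    col_totals = Counter(target_values)
    statistic = 0.0
    for f, row_total in row_totals.items():
        for t, col_total in col_totals.items():
            # Expected count if feature and target were independent.
            expected = row_total * col_total / n
            statistic += (observed[(f, t)] - expected) ** 2 / expected
    return statistic


# Toy data (made up): 30 cellular and 30 telephone contacts.
contact = ["cellular"] * 30 + ["telephone"] * 30
y = ["yes"] * 20 + ["no"] * 10 + ["yes"] * 10 + ["no"] * 20
score = chi_square(contact, y)
```

A larger statistic suggests a stronger association with y. Statistics from features with different numbers of categories are not directly comparable without converting them to p-values using the degrees of freedom.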
17. Tree 1 graph for y
Num. of non-terminal nodes: 9, Num. of terminal nodes: 10
[Figure: classification tree for y. Node labels recovered from the graph:]
ID    N       Predicted class
1     45211   no
2     40238   no
4     37936   no
7     36800   no
9     12160   no
13    10417   no
14    7410    no
16    6073    no
3     4973    no
6     1136    no
8     24640   no
12    1743    no
18    6071    no
19    2       yes
17    1337    no
15    3007    no
5     2302    no
24    3191    no
25    1782    yes
Split conditions shown in the graph:
duration <= 521.5 / > 521.5
Year <= 2009.5 / > 2009.5
month = oct, ... / Other(s)
Year <= 2008.5 / > 2008.5
month = jun, ... / Other(s)
housing = yes / Other(s)
duration <= 306.5 / > 306.5
age <= 68.5 / > 68.5
duration <= 827.5 / > 827.5
Legend: no, yes
18. ACCURACY
Observed                 Predicted No   Predicted Yes   Row Total
no          Number       39176          746             39922
            Column %     90.21%         41.82%
            Row %        98.13%         1.87%
            Total %      86.65%         1.65%           88.30%
yes         Number       4251           1038            5289
            Column %     9.79%          58.18%
            Row %        80.37%         19.63%
            Total %      9.40%          2.30%           11.70%
All Groups  Count        43427          1784            45211
            Total %      96.05%         3.95%
Accuracy = (39176 + 1038) / 45211 = 0.8895
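The accuracy figure can be recomputed from the four cell counts of the table, together with the precision and recall for the "yes" class (the Column % and Row % entries of the yes/yes cell). This is a minimal sketch; the function name classification_metrics and the "baseline" entry (accuracy of always predicting "no") are additions for illustration.

```python
def classification_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Summary metrics from a 2x2 confusion matrix ("yes" is the positive class)."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "precision": tp / (tp + fp),   # share of predicted "yes" that are "yes"
        "recall": tp / (tp + fn),      # share of observed "yes" that are found
        "baseline": (tn + fp) / total, # accuracy of always predicting "no"
    }


# Cell counts taken from the table: observed no/yes by predicted no/yes.
metrics = classification_metrics(tn=39176, fp=746, fn=4251, tp=1038)
```

Because only about 11.7% of clients subscribed, the accuracy should be read next to the baseline and the recall, not on its own.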
19. RECOMMENDATION
The business needs to optimize the marketing campaign.
According to the classification model, the conditions that predict a client subscribing to a
term deposit are:
• Last contact duration should be more than 521.5 sec, and Year
should be 2009 or 2010
• Contact type should be cellular
• Contact month should be Oct, Dec, Mar, or Sep
• Age should be less than 60.5 years
• Conversion rate by marital status: Single (17.5%), Divorced (13.5%), and
Married (11.2%)
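The conditions listed above can be written as a single check on one contact record. This is only one possible reading: combining the bullets with AND is an assumption, the marital-status bullet reports rates rather than a condition and is left out, and the names TARGET_MONTHS, TARGET_YEARS and matches_recommendation are hypothetical.

```python
TARGET_MONTHS = {"oct", "dec", "mar", "sep"}
TARGET_YEARS = {2009, 2010}


def matches_recommendation(duration_sec: float, year: int, contact: str,
                           month: str, age: float) -> bool:
    """True if a contact satisfies every listed condition (assumed AND)."""
    return (
        duration_sec > 521.5
        and year in TARGET_YEARS
        and contact == "cellular"
        and month in TARGET_MONTHS
        and age < 60.5
    )


# Hypothetical contacts, for illustration only.
long_call = matches_recommendation(600, 2009, "cellular", "oct", 35)
short_call = matches_recommendation(300, 2009, "cellular", "oct", 35)
```

Here long_call is True and short_call is False, because the second contact lasted less than 521.5 seconds.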
26. CLASSIFICATION DONE TO CHECK THE DEPENDENCY AND IMPORTANCE OF VARIABLES RELATED TO THE LAST
CONTACT OF THE CURRENT CAMPAIGN (DURATION IS THE SIGNIFICANT VARIABLE)
27. CLASSIFICATION DONE TO CHECK THE DEPENDENCY AND IMPORTANCE OF VARIABLES RELATED
TO THE OTHER ATTRIBUTES
28. CLASSIFICATION DONE TO CHECK THE DEPENDENCY AND IMPORTANCE OF VARIABLES RELATED
TO THE OTHER ATTRIBUTES (EXCLUDING POUTCOME)
30. Response effectiveness based on the previous campaign