SlideShare uma empresa Scribd logo
1 de 27
Master the Art of Analytics
A Simplistic Explainer Series For Citizen Data Scientists
J o u r n e y To w a r d s A u g m e n t e d A n a l y t i c s
Hierarchical Clustering
Parameter Tuning & Use cases
Introduction with example
Standard tuning parameters
Sample UI for input variables & parameters
selection & output
Other sample output formats
Interpretation of Output
Limitations
Business use cases
What Are
All Covered
What is it used for…
It’s a process by which objects are classified
into number of groups so that they are as
much dissimilar as possible from one group
to another group and as much similar as
possible within each group
Thus it’s simply a grouping of similar things
/data points
For example ,objects within group
1(clusterc1) shown in image above should be
as similar as possible in terms of attributes of
the objects.
Objects in group 1 & group 2 should be as
dissimilar as possible
Some Examples
Let’s take
a few
examples
for more
clarity :
Movie tickets booking website users can be
grouped into movie freaks/moderate
watchers/ rare watchers based on their past
movie tickets purchase behavior, such as
#days from last movie seen , average number
of tickets booked each time , frequency of
tickets booking per month , etc.
Retail customers can be clubbed into loyal /
infrequent / rare customer groups based on
their retail outlet/website visits per month ,
purchase amount per month , purchase
frequency per month etc.
 It is used to find natural groups which
have not been explicitly labeled in the
data. This can be used to confirm
business assumptions about what types
of groups exist or to identify unknown
groups in complex data sets.
 Once the algorithm has been run and the
groups are defined, any new data can be
easily assigned to the relevant group
We divide the cluster data into two clusters such
that the data points in same cluster are as much
similar/closer as possible but data points in
different clusters are as much dissimilar/far as
possible
All observations start in one cluster
For each cluster we repeat the process until
specified number of clusters is reached
How it Works
Standard Tuning
Parameters
Standard Tuning Parameters
oNumber of clusters:
Suggested range : 3 to 5,
though this depends on the
business demand
The number of clusters must
be less than the number of
data points in the data set
Iterations:
The number of iterations
generate clusters
Default value should be set
to 20
Sample UI For Input
Variables & Parameters
Selection & Output
Sample UI For Selecting Predictors And Applying Tuning
Parameters: For Two Predictors
The silhouette score is another useful criterion for assessing the natural and optimal number of clusters as well as for
checking overall quality of partition
The largest silhouette score, over different K, indicates the best number of clusters
Select the variables you
would like to use as
predictors to build
clusters
Height
Weight
BMI
21
Tuning parameters
Number of clusters
Maximum iterations
Height(cm) Weight(Kg) BMI
Cluster
Number
158 60 23 1
160 65 25 2
170 70 26 2
149 50 21 1
180 80 27 3
165 80 28 3
200 90 23 1
Each customer is assigned a cluster membership as shown in the table in left
Height
Weight
silhouette = 0.7
Indicating good quality clusters
Output UI: For two
predictors
As clusters are built using only 2 predictors
here , scatter plot axis will reflect actual
predictors i.e. Height and Weight instead of
principle components .
Again, lesser the overlap in cluster outlines ,
better the clusters’ assignment
Alternatively silhouette score can be checked
to evaluate how clusters are partitioned -
Closer this value to 1 , better the partition
quality.
Sample UI for selecting predictors and applying
tuning parameters: For four predictors•
Select the variables you
would like to use as
predictors to build
clusters
Purchase amount
Purchase frequency
Total purchase quantity
Annual income
Website visits
21
Tuning parameters
Number of clusters
Maximum iterations
 The silhouette score is another useful criterion for assessing the natural and optimal
number of clusters as well as for checking overall quality of partition
 The largest silhouette score, over different K, indicates the best number of clusters
Customer ID
Purchase
amount(per
month)
Purchase
frequency
(per month)
Total
purchase
quantity
Annual
Income slab
Website
visits (per
month)
Cluster
Number
1 5k to 10k 2 6 2 lac to 4 lac 3 to 6 2
2 1k to 5k 1 4 1 lac to 2 lac <3 1
3 10 to 15k 3 8 4 lac to 6 lac 6 to 10 3
4 5k to 10k 2 6 2 lac to 4 lac 3 to 6 2
5 10 to 15k 3 8 4 lac to 6 lac 6 to 10 3
6 1k to 5k 1 2 1 lac to 2 lac <3 1
7 5k to 10k 2 6 2 lac to 4 lac 3 to 6 2
8 1k to 5k 1 2 1 lac to 2 lac <3 1
9 1k to 5k 1 2 1 lac to 2 lac <3 1
10 >20k 4 32
10 lac to 15
lac
>10 3
First principal component
Secondprincipalcomponent
Output UI: For four
predictors
• In the 2D cluster plot shown in right , clusters
distribution is plotted.
• In this case, axis will reflect first two principle
components (check definition below ) instead of
actual predictors as number of predictors is >2. In
case of 3D plot , first three principal components
will be shown. Lesser the overlap between clusters
, better the clusters’ assignment.
• Also silhouette score can be checked to evaluate
how clusters are partitioned - Closer this value to 1
, better the partition quality.
Whenever there are more than 2 predictors as input ,
axis will reflect principle components instead of actual
predictors
Principle components are linear combination of original
predictors which captures the maximum variance in
data set.
Most of the variance in data is explained by first three
principle components so we can ignore remaining
components.
Each customer is assigned a cluster membership as shown in the table in left
Other Sample Output
Formats
Other Sample Output
Formats
Clusters with silhouette
score closer to 1 are
more desirable. So this
index can be used to
measure the goodness
of clusters.
silhouette = -0.5
Indicating poor quality clusters
Limitations
Limitations
Relatively time consuming
compared to other clustering
methods
The distance measure
used to separate the
data points in a cluster
is affected by the scale
of variables, hence
scaling/normalization of
variables is necessary
prior to clustering
Based on the distance
calculation method
chosen, the algorithm
sometimes may get
influenced by outliers
Applications & Business
Use Cases
General applications
Some examples of use cases are:
Behavioral segmentation:
Segment customers by purchase
history
Segment users by activities on
application, website, or platform
Define personas based on interests
Create consumer profiles based on
activity monitoring
Some other general applications :
Pattern recognitions, Market segmentation, Classification analysis, Artificial intelligence, agriculture and many others.
Use Case 1
Business benefit:
• Once segments are identified ,
bank will have a loan applicants’
dataset with each applicant labeled
as high/medium/low risk.
• Based on this labels , bank can
easily make a decision on whether
to give loan to an applicant or not
and if yes then how much credit
limit and interest rate each
applicant is eligible for based on
the amount of risk involved.
Business problem :
• Grouping loan applicants into
high/medium/low risk applicants
based on attributes such as Loan
amount , Monthly installment,
Employment tenure , Times
delinquent, Annual income, Debt to
income ratio etc.
Use case 1 : Input dataset
Customer ID Loan amount
Monthly
installment
Annual income
Debt to
income ratio
Times delinquent Employment tenure
1039153 21000 701.73 105000 9 5 4
1069697 15000 483.38 92000 11 5 2
1068120 25600 824.96 110000 10 9 2
563175 23000 534.94 80000 9 2 12
562842 19750 483.65 57228 11 3 21
562681 25000 571.78 113000 10 0 9
562404 21250 471.2 31008 12 1 12
700159 14400 448.99 82000 20 6 6
696484 10000 241.33 45000 18 8 2
702598 11700 381.61 45192 20 7 3
702470 10000 243.29 38000 17 9 7
702373 4800 144.77 54000 19 8 2
701975 12500 455.81 43560 15 8 4
Use case 1 : Output : Each record will have the
cluster (segment) assignment as shown below
Customer ID Loan amount
Monthly
installment
Annual
income
Debt to
income ratio
Times delinquent Employment tenure Cluster
1039153 21000 701.73 105000 9 5 4 Medium
1069697 15000 483.38 92000 11 5 2 Medium
1068120 25600 824.96 110000 10 9 2 Medium
563175 23000 534.94 80000 9 2 12 Low
562842 19750 483.65 57228 11 3 21 Low
562681 25000 571.78 113000 10 0 9 Low
562404 21250 471.2 31008 12 1 12 Low
700159 14400 448.99 82000 20 6 6 High
696484 10000 241.33 45000 18 8 2 High
702598 11700 381.61 45192 20 7 3 High
702470 10000 243.29 38000 17 9 7 High
702373 4800 144.77 54000 19 8 2 High
701975 12500 455.81 43560 15 8 4 High
Use case 1: Output : Cluster profiles
• As can be seen in the table above, there are distinctive characteristics of high /medium and low risk segments
• High risk segment has high likelihood to be delinquent, highest debt to income ratio and lowest employment tenure as compared to other
two segments
• Whereas low risk segment exhibits exactly the opposite pattern i.e. lowest debt to income ratio, lowest delinquency and highest employment
tenure as compared to other two segments
• Hence , delinquency , employment tenure and debt to income ratio are the determinant factors when it comes to segmenting loan applicants
Cluster
Loan
amount
Monthly
installment
Annual
income
Debt to
income
ratio
Times
delinquent
Employment
tenure
Risk
Segment
1 10447.30 304.87 66467.74 9.58 1.69 16.82 Low
2 21391.58 598.54 94912.59 12.37 5.98 4.58 Medium
3 7521.32 227.43 60935.28 16.55 6.91 4.01 High
Use case 1: Output : Cluster distribution
• In the cluster distribution plot ,
there is negligible overlap in
cluster outlines so we can say that
cluster assignments is good in our
case
• Clusters with silhouette width
average closer to 1 are more
desirable. So this index can be
used to test the quality of clusters’
distribution.
Use case 2
Business benefit:
• Once segments are identified
, marketing messages and
even products can be
customized for each segment.
• The better the segment(s)
chosen for targeting by a
particular organization , the
more successful it is assumed
to be in the market place.
Business problem :
• Organizing customers into
groups/segments based on
similar traits, product
preferences and expectations
• Segments are constructed on
basis of the customers’
demographic characteristics,
psychographics, past behavior
and product use behaviors
Use case 3
Business benefit:
• Business marketing team can focus
on risky customer segments in
efficient way in order to avert them
from churning/leaving
• Sales team segments which are
facing challenges based on current
discounting strategy can be
identified and deal negotiation
strategy can be improved
/optimized for them.
Business problem :
• Discount Analysis and Customer
Retention – Visualize ‘segments of
sales group based on discount
behavior’ and ‘customer churn -
segments of customers on verge of
leaving’
Want to Learn
More?
Get in touch with us @
support@Smarten.com
And Do Checkout the Learning section
on
Smarten.com
June 2018

Mais conteúdo relacionado

Mais procurados

What is Binary Logistic Regression Classification and How is it Used in Analy...
What is Binary Logistic Regression Classification and How is it Used in Analy...What is Binary Logistic Regression Classification and How is it Used in Analy...
What is Binary Logistic Regression Classification and How is it Used in Analy...Smarten Augmented Analytics
 
Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values  Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values Smarten Augmented Analytics
 
What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...
What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...
What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...Smarten Augmented Analytics
 
What is the Multinomial-Logistic Regression Classification Algorithm and How ...
What is the Multinomial-Logistic Regression Classification Algorithm and How ...What is the Multinomial-Logistic Regression Classification Algorithm and How ...
What is the Multinomial-Logistic Regression Classification Algorithm and How ...Smarten Augmented Analytics
 
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...Smarten Augmented Analytics
 
What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?
What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?
What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?Smarten Augmented Analytics
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...Smarten Augmented Analytics
 
Marketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenMarketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenSmarten Augmented Analytics
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...Smarten Augmented Analytics
 
What is KNN Classification and How Can This Analysis Help an Enterprise?
What is KNN Classification and How Can This Analysis Help an Enterprise?What is KNN Classification and How Can This Analysis Help an Enterprise?
What is KNN Classification and How Can This Analysis Help an Enterprise?Smarten Augmented Analytics
 
Machine Learning Project
Machine Learning ProjectMachine Learning Project
Machine Learning ProjectAbhishek Singh
 
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...Smarten Augmented Analytics
 
Predictive Analytics Using External Data Augmented Analytics Use Case - Smarten
Predictive Analytics Using External Data Augmented Analytics Use Case - SmartenPredictive Analytics Using External Data Augmented Analytics Use Case - Smarten
Predictive Analytics Using External Data Augmented Analytics Use Case - SmartenSmarten Augmented Analytics
 
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...PAPIs.io
 
Introduction To Data Science Using R
Introduction To Data Science Using RIntroduction To Data Science Using R
Introduction To Data Science Using RANURAG SINGH
 
Step by Step guide to executing an analytics project
Step by Step guide to executing an analytics projectStep by Step guide to executing an analytics project
Step by Step guide to executing an analytics projectRamkumar Ravichandran
 
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...Smarten Augmented Analytics
 
Prediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPrediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPranov Mishra
 
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROBOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROAnthony Kilili
 

Mais procurados (20)

What is Binary Logistic Regression Classification and How is it Used in Analy...
What is Binary Logistic Regression Classification and How is it Used in Analy...What is Binary Logistic Regression Classification and How is it Used in Analy...
What is Binary Logistic Regression Classification and How is it Used in Analy...
 
Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values  Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values
 
What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...
What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...
What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Ent...
 
What is the Multinomial-Logistic Regression Classification Algorithm and How ...
What is the Multinomial-Logistic Regression Classification Algorithm and How ...What is the Multinomial-Logistic Regression Classification Algorithm and How ...
What is the Multinomial-Logistic Regression Classification Algorithm and How ...
 
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
 
What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?
What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?
What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
 
Marketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenMarketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - Smarten
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
 
What is KNN Classification and How Can This Analysis Help an Enterprise?
What is KNN Classification and How Can This Analysis Help an Enterprise?What is KNN Classification and How Can This Analysis Help an Enterprise?
What is KNN Classification and How Can This Analysis Help an Enterprise?
 
Machine Learning Project
Machine Learning ProjectMachine Learning Project
Machine Learning Project
 
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
 
Predictive Analytics Using External Data Augmented Analytics Use Case - Smarten
Predictive Analytics Using External Data Augmented Analytics Use Case - SmartenPredictive Analytics Using External Data Augmented Analytics Use Case - Smarten
Predictive Analytics Using External Data Augmented Analytics Use Case - Smarten
 
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
Machine Learning Performance Evaluation: Tips and Pitfalls - Jose Hernandez O...
 
Risk Based Loan Approval Framework
Risk Based Loan Approval FrameworkRisk Based Loan Approval Framework
Risk Based Loan Approval Framework
 
Introduction To Data Science Using R
Introduction To Data Science Using RIntroduction To Data Science Using R
Introduction To Data Science Using R
 
Step by Step guide to executing an analytics project
Step by Step guide to executing an analytics projectStep by Step guide to executing an analytics project
Step by Step guide to executing an analytics project
 
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
 
Prediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom IndustryPrediction of customer propensity to churn - Telecom Industry
Prediction of customer propensity to churn - Telecom Industry
 
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROBOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
 

Semelhante a What is Hierarchical Clustering and How Can an Organization Use it to Analyze Data?

customer_profiling_based_on_fuzzy_principals_linkedin
customer_profiling_based_on_fuzzy_principals_linkedincustomer_profiling_based_on_fuzzy_principals_linkedin
customer_profiling_based_on_fuzzy_principals_linkedinAsoka Korale
 
Cross selling credit card to existing debit card customers
Cross selling credit card to existing debit card customersCross selling credit card to existing debit card customers
Cross selling credit card to existing debit card customersSaurabh Singh
 
Principal Component Analysis and Clustering
Principal Component Analysis and ClusteringPrincipal Component Analysis and Clustering
Principal Component Analysis and ClusteringUsha Vijay
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Md. Main Uddin Rony
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptxAniket Patil
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptxpatilaniket2418
 
A Comparative Study for Anomaly Detection in Data Mining
A Comparative Study for Anomaly Detection in Data MiningA Comparative Study for Anomaly Detection in Data Mining
A Comparative Study for Anomaly Detection in Data MiningIRJET Journal
 
A02610104
A02610104A02610104
A02610104theijes
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network ModelEric Esajian
 
1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptopRising Media, Inc.
 
A high level overview of all that is Analytics
A high level overview of all that is AnalyticsA high level overview of all that is Analytics
A high level overview of all that is AnalyticsRamkumar Ravichandran
 
SQL for Business Problems.pptx
SQL for Business Problems.pptxSQL for Business Problems.pptx
SQL for Business Problems.pptxMustafa Ahmed
 
Report 190804110930
Report 190804110930Report 190804110930
Report 190804110930udara12345
 
Predicting Bank Customer Churn Using Classification
Predicting Bank Customer Churn Using ClassificationPredicting Bank Customer Churn Using Classification
Predicting Bank Customer Churn Using ClassificationVishva Abeyrathne
 
Stock Market Trends Prediction after Earning Release.pptx
Stock Market Trends Prediction after Earning Release.pptxStock Market Trends Prediction after Earning Release.pptx
Stock Market Trends Prediction after Earning Release.pptxChen Qian
 
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRYPROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRYIJITCA Journal
 
Credit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning TechniquesCredit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning TechniquesZhongmin Luo
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionIRJET Journal
 

Semelhante a What is Hierarchical Clustering and How Can an Organization Use it to Analyze Data? (20)

customer_profiling_based_on_fuzzy_principals_linkedin
customer_profiling_based_on_fuzzy_principals_linkedincustomer_profiling_based_on_fuzzy_principals_linkedin
customer_profiling_based_on_fuzzy_principals_linkedin
 
Cross selling credit card to existing debit card customers
Cross selling credit card to existing debit card customersCross selling credit card to existing debit card customers
Cross selling credit card to existing debit card customers
 
Principal Component Analysis and Clustering
Principal Component Analysis and ClusteringPrincipal Component Analysis and Clustering
Principal Component Analysis and Clustering
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
 
A Comparative Study for Anomaly Detection in Data Mining
A Comparative Study for Anomaly Detection in Data MiningA Comparative Study for Anomaly Detection in Data Mining
A Comparative Study for Anomaly Detection in Data Mining
 
07 learning
07 learning07 learning
07 learning
 
A02610104
A02610104A02610104
A02610104
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
 
1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop1440 track 2 boire_using our laptop
1440 track 2 boire_using our laptop
 
A high level overview of all that is Analytics
A high level overview of all that is AnalyticsA high level overview of all that is Analytics
A high level overview of all that is Analytics
 
QQ Plot.pptx
QQ Plot.pptxQQ Plot.pptx
QQ Plot.pptx
 
SQL for Business Problems.pptx
SQL for Business Problems.pptxSQL for Business Problems.pptx
SQL for Business Problems.pptx
 
Report 190804110930
Report 190804110930Report 190804110930
Report 190804110930
 
Predicting Bank Customer Churn Using Classification
Predicting Bank Customer Churn Using ClassificationPredicting Bank Customer Churn Using Classification
Predicting Bank Customer Churn Using Classification
 
Stock Market Trends Prediction after Earning Release.pptx
Stock Market Trends Prediction after Earning Release.pptxStock Market Trends Prediction after Earning Release.pptx
Stock Market Trends Prediction after Earning Release.pptx
 
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRYPROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
 
Credit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning TechniquesCredit Default Swap (CDS) Rate Construction by Machine Learning Techniques
Credit Default Swap (CDS) Rate Construction by Machine Learning Techniques
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim Prediction
 

Mais de Smarten Augmented Analytics

Crime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – SmartenCrime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – SmartenSmarten Augmented Analytics
 
Students' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – SmartenStudents' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – SmartenSmarten Augmented Analytics
 
Fraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – SmartenFraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – SmartenSmarten Augmented Analytics
 
Quality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - SmartenQuality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - SmartenSmarten Augmented Analytics
 
Machine Maintenance Management Predictive Analytics Use Case - Smarten
Machine Maintenance Management Predictive Analytics Use Case - SmartenMachine Maintenance Management Predictive Analytics Use Case - Smarten
Machine Maintenance Management Predictive Analytics Use Case - SmartenSmarten Augmented Analytics
 
Human Resource Attrition Augmented Analytics Use Case - Smarten
Human Resource Attrition Augmented Analytics Use Case - SmartenHuman Resource Attrition Augmented Analytics Use Case - Smarten
Human Resource Attrition Augmented Analytics Use Case - SmartenSmarten Augmented Analytics
 
Customer Targeting Augmented Analytics Use Case - Smarten
Customer Targeting Augmented Analytics Use Case - SmartenCustomer Targeting Augmented Analytics Use Case - Smarten
Customer Targeting Augmented Analytics Use Case - SmartenSmarten Augmented Analytics
 
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...Smarten Augmented Analytics
 
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?Smarten Augmented Analytics
 
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...Smarten Augmented Analytics
 
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...Smarten Augmented Analytics
 
What is SVM Classification Analysis and How Can It Benefit Business Analytics?
What is SVM Classification Analysis and How Can It Benefit Business Analytics?What is SVM Classification Analysis and How Can It Benefit Business Analytics?
What is SVM Classification Analysis and How Can It Benefit Business Analytics?Smarten Augmented Analytics
 
What is Outlier Analysis and How Can It Improve Analysis?
What is Outlier Analysis and How Can It Improve Analysis?What is Outlier Analysis and How Can It Improve Analysis?
What is Outlier Analysis and How Can It Improve Analysis?Smarten Augmented Analytics
 
What is the Decision Tree Analysis and How Does it Help a Business to Analyze...
What is the Decision Tree Analysis and How Does it Help a Business to Analyze...What is the Decision Tree Analysis and How Does it Help a Business to Analyze...
What is the Decision Tree Analysis and How Does it Help a Business to Analyze...Smarten Augmented Analytics
 

Mais de Smarten Augmented Analytics (14)

Crime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – SmartenCrime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – Smarten
 
Students' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – SmartenStudents' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – Smarten
 
Fraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – SmartenFraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – Smarten
 
Quality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - SmartenQuality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - Smarten
 
Machine Maintenance Management Predictive Analytics Use Case - Smarten
Machine Maintenance Management Predictive Analytics Use Case - SmartenMachine Maintenance Management Predictive Analytics Use Case - Smarten
Machine Maintenance Management Predictive Analytics Use Case - Smarten
 
Human Resource Attrition Augmented Analytics Use Case - Smarten
Human Resource Attrition Augmented Analytics Use Case - SmartenHuman Resource Attrition Augmented Analytics Use Case - Smarten
Human Resource Attrition Augmented Analytics Use Case - Smarten
 
Customer Targeting Augmented Analytics Use Case - Smarten
Customer Targeting Augmented Analytics Use Case - SmartenCustomer Targeting Augmented Analytics Use Case - Smarten
Customer Targeting Augmented Analytics Use Case - Smarten
 
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
What Are Simple Random Sampling and Stratified Random Sampling Analytical Tec...
 
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
 
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
 
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
 
What is SVM Classification Analysis and How Can It Benefit Business Analytics?
What is SVM Classification Analysis and How Can It Benefit Business Analytics?What is SVM Classification Analysis and How Can It Benefit Business Analytics?
What is SVM Classification Analysis and How Can It Benefit Business Analytics?
 
What is Outlier Analysis and How Can It Improve Analysis?
What is Outlier Analysis and How Can It Improve Analysis?What is Outlier Analysis and How Can It Improve Analysis?
What is Outlier Analysis and How Can It Improve Analysis?
 
What is the Decision Tree Analysis and How Does it Help a Business to Analyze...
What is the Decision Tree Analysis and How Does it Help a Business to Analyze...What is the Decision Tree Analysis and How Does it Help a Business to Analyze...
What is the Decision Tree Analysis and How Does it Help a Business to Analyze...
 

Último

Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024Mind IT Systems
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...software pro Development
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 

Último (20)

Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 

What is Hierarchical Clustering and How Can an Organization Use it to Analyze Data?

  • 1. Master the Art of Analytics A Simplistic Explainer Series For Citizen Data Scientists J o u r n e y To w a r d s A u g m e n t e d A n a l y t i c s
  • 3. Introduction with example Standard tuning parameters Sample UI for input variables & parameters selection & output Other sample output formats Interpretation of Output Limitations Business use cases What Are All Covered
  • 4. What is it used for… It’s a process by which objects are classified into number of groups so that they are as much dissimilar as possible from one group to another group and as much similar as possible within each group Thus it’s simply a grouping of similar things /data points For example ,objects within group 1(clusterc1) shown in image above should be as similar as possible in terms of attributes of the objects. Objects in group 1 & group 2 should be as dissimilar as possible
  • 5. Some Examples Let’s take a few examples for more clarity : Movie tickets booking website users can be grouped into movie freaks/moderate watchers/ rare watchers based on their past movie tickets purchase behavior, such as #days from last movie seen , average number of tickets booked each time , frequency of tickets booking per month , etc. Retail customers can be clubbed into loyal / infrequent / rare customer groups based on their retail outlet/website visits per month , purchase amount per month , purchase frequency per month etc.  It is used to find natural groups which have not been explicitly labeled in the data. This can be used to confirm business assumptions about what types of groups exist or to identify unknown groups in complex data sets.  Once the algorithm has been run and the groups are defined, any new data can be easily assigned to the relevant group
  • 6. We divide the cluster data into two clusters such that the data points in same cluster are as much similar/closer as possible but data points in different clusters are as much dissimilar/far as possible All observations start in one cluster For each cluster we repeat the process until specified number of clusters is reached How it Works
  • 8. Standard Tuning Parameters oNumber of clusters: Suggested range : 3 to 5, though this depends on the business demand The number of clusters must be less than the number of data points in the data set Iterations: The number of iterations generate clusters Default value should be set to 20
  • 9. Sample UI For Input Variables & Parameters Selection & Output
  • 10. Sample UI For Selecting Predictors And Applying Tuning Parameters: For Two Predictors The silhouette score is another useful criterion for assessing the natural and optimal number of clusters as well as for checking overall quality of partition The largest silhouette score, over different K, indicates the best number of clusters Select the variables you would like to use as predictors to build clusters Height Weight BMI 21 Tuning parameters Number of clusters Maximum iterations
  • 11. Height(cm) Weight(Kg) BMI Cluster Number 158 60 23 1 160 65 25 2 170 70 26 2 149 50 21 1 180 80 27 3 165 80 28 3 200 90 23 1 Each customer is assigned a cluster membership as shown in the table in left Height Weight silhouette = 0.7 Indicating good quality clusters Output UI: For two predictors As clusters are built using only 2 predictors here , scatter plot axis will reflect actual predictors i.e. Height and Weight instead of principle components . Again, lesser the overlap in cluster outlines , better the clusters’ assignment Alternatively silhouette score can be checked to evaluate how clusters are partitioned - Closer this value to 1 , better the partition quality.
  • 12. Sample UI for selecting predictors and applying tuning parameters: For four predictors• Select the variables you would like to use as predictors to build clusters Purchase amount Purchase frequency Total purchase quantity Annual income Website visits 21 Tuning parameters Number of clusters Maximum iterations  The silhouette score is another useful criterion for assessing the natural and optimal number of clusters as well as for checking overall quality of partition  The largest silhouette score, over different K, indicates the best number of clusters
  • 13. Customer ID Purchase amount(per month) Purchase frequency (per month) Total purchase quantity Annual Income slab Website visits (per month) Cluster Number 1 5k to 10k 2 6 2 lac to 4 lac 3 to 6 2 2 1k to 5k 1 4 1 lac to 2 lac <3 1 3 10 to 15k 3 8 4 lac to 6 lac 6 to 10 3 4 5k to 10k 2 6 2 lac to 4 lac 3 to 6 2 5 10 to 15k 3 8 4 lac to 6 lac 6 to 10 3 6 1k to 5k 1 2 1 lac to 2 lac <3 1 7 5k to 10k 2 6 2 lac to 4 lac 3 to 6 2 8 1k to 5k 1 2 1 lac to 2 lac <3 1 9 1k to 5k 1 2 1 lac to 2 lac <3 1 10 >20k 4 32 10 lac to 15 lac >10 3 First principal component Secondprincipalcomponent Output UI: For four predictors • In the 2D cluster plot shown in right , clusters distribution is plotted. • In this case, axis will reflect first two principle components (check definition below ) instead of actual predictors as number of predictors is >2. In case of 3D plot , first three principal components will be shown. Lesser the overlap between clusters , better the clusters’ assignment. • Also silhouette score can be checked to evaluate how clusters are partitioned - Closer this value to 1 , better the partition quality. Whenever there are more than 2 predictors as input , axis will reflect principle components instead of actual predictors Principle components are linear combination of original predictors which captures the maximum variance in data set. Most of the variance in data is explained by first three principle components so we can ignore remaining components. Each customer is assigned a cluster membership as shown in the table in left
  • 15. Other Sample Output Formats Clusters with silhouette score closer to 1 are more desirable. So this index can be used to measure the goodness of clusters. silhouette = -0.5 Indicating poor quality clusters
  • 17. Limitations Relatively time consuming compared to other clustering methods The distance measure used to separate the data points in a cluster is affected by the scale of variables, hence scaling/normalization of variables is necessary prior to clustering Based on the distance calculation method chosen, the algorithm sometimes may get influenced by outliers
  • 19. General applications Some examples of use cases are: Behavioral segmentation: Segment customers by purchase history Segment users by activities on application, website, or platform Define personas based on interests Create consumer profiles based on activity monitoring Some other general applications : Pattern recognitions, Market segmentation, Classification analysis, Artificial intelligence, agriculture and many others.
  • 20. Use Case 1 Business benefit: • Once segments are identified , bank will have a loan applicants’ dataset with each applicant labeled as high/medium/low risk. • Based on this labels , bank can easily make a decision on whether to give loan to an applicant or not and if yes then how much credit limit and interest rate each applicant is eligible for based on the amount of risk involved. Business problem : • Grouping loan applicants into high/medium/low risk applicants based on attributes such as Loan amount , Monthly installment, Employment tenure , Times delinquent, Annual income, Debt to income ratio etc.
  • 21. Use case 1 : Input dataset Customer ID Loan amount Monthly installment Annual income Debt to income ratio Times delinquent Employment tenure 1039153 21000 701.73 105000 9 5 4 1069697 15000 483.38 92000 11 5 2 1068120 25600 824.96 110000 10 9 2 563175 23000 534.94 80000 9 2 12 562842 19750 483.65 57228 11 3 21 562681 25000 571.78 113000 10 0 9 562404 21250 471.2 31008 12 1 12 700159 14400 448.99 82000 20 6 6 696484 10000 241.33 45000 18 8 2 702598 11700 381.61 45192 20 7 3 702470 10000 243.29 38000 17 9 7 702373 4800 144.77 54000 19 8 2 701975 12500 455.81 43560 15 8 4
  • 22. Use case 1 : Output : Each record will have the cluster (segment) assignment as shown below Customer ID Loan amount Monthly installment Annual income Debt to income ratio Times delinquent Employment tenure Cluster 1039153 21000 701.73 105000 9 5 4 Medium 1069697 15000 483.38 92000 11 5 2 Medium 1068120 25600 824.96 110000 10 9 2 Medium 563175 23000 534.94 80000 9 2 12 Low 562842 19750 483.65 57228 11 3 21 Low 562681 25000 571.78 113000 10 0 9 Low 562404 21250 471.2 31008 12 1 12 Low 700159 14400 448.99 82000 20 6 6 High 696484 10000 241.33 45000 18 8 2 High 702598 11700 381.61 45192 20 7 3 High 702470 10000 243.29 38000 17 9 7 High 702373 4800 144.77 54000 19 8 2 High 701975 12500 455.81 43560 15 8 4 High
  • 23. Use case 1: Output : Cluster profiles • As can be seen in the table above, there are distinctive characteristics of high /medium and low risk segments • High risk segment has high likelihood to be delinquent, highest debt to income ratio and lowest employment tenure as compared to other two segments • Whereas low risk segment exhibits exactly the opposite pattern i.e. lowest debt to income ratio, lowest delinquency and highest employment tenure as compared to other two segments • Hence , delinquency , employment tenure and debt to income ratio are the determinant factors when it comes to segmenting loan applicants Cluster Loan amount Monthly installment Annual income Debt to income ratio Times delinquent Employment tenure Risk Segment 1 10447.30 304.87 66467.74 9.58 1.69 16.82 Low 2 21391.58 598.54 94912.59 12.37 5.98 4.58 Medium 3 7521.32 227.43 60935.28 16.55 6.91 4.01 High
  • 24. Use case 1: Output : Cluster distribution • In the cluster distribution plot , there is negligible overlap in cluster outlines so we can say that cluster assignments is good in our case • Clusters with silhouette width average closer to 1 are more desirable. So this index can be used to test the quality of clusters’ distribution.
  • 25. Use case 2 Business benefit: • Once segments are identified , marketing messages and even products can be customized for each segment. • The better the segment(s) chosen for targeting by a particular organization , the more successful it is assumed to be in the market place. Business problem : • Organizing customers into groups/segments based on similar traits, product preferences and expectations • Segments are constructed on basis of the customers’ demographic characteristics, psychographics, past behavior and product use behaviors
  • 26. Use case 3 Business benefit: • Business marketing team can focus on risky customer segments in efficient way in order to avert them from churning/leaving • Sales team segments which are facing challenges based on current discounting strategy can be identified and deal negotiation strategy can be improved /optimized for them. Business problem : • Discount Analysis and Customer Retention – Visualize ‘segments of sales group based on discount behavior’ and ‘customer churn - segments of customers on verge of leaving’
  • 27. Want to Learn More? Get in touch with us @ support@Smarten.com And Do Checkout the Learning section on Smarten.com June 2018

Notas do Editor

  1. Four diff color for each group
  2. Check spark