What’s in a Label?
Business value of “soft” vs “hard” cluster ensembles
Nicole Huyghe & Anita Prinzie, solutions-2
Answers the who and the why
[Diagram: Theme 1, Theme 2, Theme 3, ..., Theme 9, Theme 10; Cluster Ensemble]
HARD OR SOFT CLUSTER ENSEMBLE
Stability, Integrity, Accuracy, Size
Stability




Similarity Index (Lange et al., 2004): the percentage of pairs of observations that belong to the same cluster in both clustering C and clustering C’.
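
The definition above can be read as a simple pair count. The following is a minimal sketch of that reading, not necessarily the exact index used for the slides: the function name pair_similarity is hypothetical, and taking the percentage over all pairs of observations is an assumption (the slide does not state the denominator).

```python
from itertools import combinations


def pair_similarity(labels_a, labels_b):
    """Share of observation pairs that sit in the same cluster in both labelings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same observations")
    n_pairs = 0
    together_in_both = 0
    for i, j in combinations(range(len(labels_a)), 2):
        n_pairs += 1
        if labels_a[i] == labels_a[j] and labels_b[i] == labels_b[j]:
            together_in_both += 1
    # Assumption: the share is taken over all pairs of observations.
    return together_in_both / n_pairs if n_pairs else 0.0


c = [0, 0, 1, 1, 1]
c_prime = [1, 1, 0, 0, 1]  # cluster names differ; only co-membership matters
print(pair_similarity(c, c_prime))
```

Only co-membership is compared, so the two clusterings do not need matching cluster names. Under this denominator, pairs that are separated in both clusterings never count as agreement, so even two identical clusterings score below 1.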
Cluster Integrity – Heterogeneity




Total separation of clusters: based on the distances between cluster centers.
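
One way to turn center distances into a single number is the total-separation term of Halkidi et al. (2000), listed in the references. The sketch below assumes that formulation, (Dmax / Dmin) times the sum over clusters of 1 / (sum of distances to all centers), and uses hypothetical function names; whether the slides used exactly this term is an assumption.

```python
from itertools import combinations
from math import dist


def cluster_centers(points, labels):
    """Mean vector of each cluster, keyed by cluster label."""
    grouped = {}
    for point, label in zip(points, labels):
        grouped.setdefault(label, []).append(point)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in grouped.items()
    }


def total_separation(points, labels):
    """Total separation of cluster centers (assumed Halkidi-style term)."""
    centers = list(cluster_centers(points, labels).values())
    if len(centers) < 2:
        raise ValueError("need at least two clusters")
    pairwise = [dist(a, b) for a, b in combinations(centers, 2)]
    # Raises ZeroDivisionError if two cluster centers coincide.
    ratio = max(pairwise) / min(pairwise)
    return ratio * sum(1.0 / sum(dist(v, w) for w in centers) for v in centers)


points = [[0, 0], [0, 2], [4, 0], [4, 2]]
labels = [0, 0, 1, 1]
print(total_separation(points, labels))
```

With this formulation a smaller value means the centers lie further apart, so lower is better.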
Cluster Integrity – Homogeneity




Scatter (compactness): average ratio of the cluster variance to the variance of the dataset.
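
A minimal sketch of the scatter ratio described above. It assumes that "variance" means the Euclidean norm of the per-dimension population variance vector, as in Halkidi et al. (2000); the function names are hypothetical.

```python
from math import sqrt
from statistics import pvariance


def variance_norm(points):
    """Euclidean norm of the per-dimension population variance vector."""
    return sqrt(sum(pvariance(col) ** 2 for col in zip(*points)))


def scatter(points, labels):
    """Average ratio of each cluster's variance norm to the dataset's."""
    grouped = {}
    for point, label in zip(points, labels):
        grouped.setdefault(label, []).append(point)
    # Assumes the dataset is not constant, so that the total is non-zero.
    total = variance_norm(points)
    return sum(variance_norm(m) / total for m in grouped.values()) / len(grouped)


points = [[0, 0], [0, 2], [4, 0], [4, 2]]
labels = [0, 0, 1, 1]
print(scatter(points, labels))
```

Lower values indicate more compact (more homogeneous) clusters relative to the spread of the whole dataset.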
Accuracy
[Figure: nine numbered observations grouped into segments, shown once for Reality and once for Prediction]




Adjusted Rand Index (Hubert and Arabie, 1985): level of agreement between the predicted segment and the real segment, corrected for the level of agreement expected by chance.
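
The Adjusted Rand Index can be computed from pair counts alone. The sketch below follows the standard Hubert and Arabie formula; the function name and the toy labels are illustrative, not data from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(real, predicted):
    """Chance-corrected agreement between two labelings of the same items."""
    n = len(real)
    if n != len(predicted) or n < 2:
        raise ValueError("need two labelings of the same items (at least 2)")
    # Pairs placed together in both labelings, in 'real', and in 'predicted'.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(real, predicted)).values())
    pairs_real = sum(comb(c, 2) for c in Counter(real).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(predicted).values())
    expected = pairs_real * pairs_pred / comb(n, 2)
    maximum = (pairs_real + pairs_pred) / 2
    if maximum == expected:  # degenerate case, e.g. one segment in both
        return 1.0
    return (pairs_joint - expected) / (maximum - expected)


real = [0, 0, 1, 1]
print(adjusted_rand_index(real, [1, 1, 0, 0]))  # same segments, renamed
print(adjusted_rand_index(real, [0, 1, 0, 1]))  # every real segment is split
```

A value of 1 means the predicted segments reproduce the real ones exactly (up to renaming), values near 0 mean agreement no better than chance, and negative values mean agreement worse than chance.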
Size




Uniformity deviation: average deviation of each segment's share from the uniform segment size (1/number of segments).
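
A minimal sketch of the uniformity deviation, assuming that "deviation" means the absolute difference between a segment's share of observations and 1/number of segments (the slide does not say whether absolute or squared deviations are averaged). The function name is hypothetical.

```python
from collections import Counter


def uniformity_deviation(labels):
    """Mean absolute gap between each segment's share and the uniform share."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    if k == 0:
        raise ValueError("need at least one labeled observation")
    # Only segments that actually occur in `labels` are counted.
    return sum(abs(count / n - 1 / k) for count in counts.values()) / k


print(uniformity_deviation([0, 0, 1, 1]))
print(uniformity_deviation([0, 0, 0, 1]))
```

Perfectly balanced segments give 0; the value grows as a few segments absorb most of the observations.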
Rheumatism


Software journey


Osteoporosis
[Figure: four panels (Stability, Heterogeneity, Accuracy, Homogeneity) marking where hard beats soft (H>S) and where soft beats hard (S>H)]
LC gives smaller segments
[Figure: Soft LC, Soft CCEA, Hard LC and Hard CCEA compared on Rheumatism, Software journey and Osteoporosis]
MIXED EVIDENCE
Fixed Factors




[Figure: four blocks of 100, x 10]
Stability: SOFT is better




[Figure: cases where Sim. Index soft > hard versus Sim. Index hard > soft, marked as high or low confidence, for strong and weak similarity]
Homogeneity: SOFT is better


[Figure: cases where Scatter hard > soft, marked as high or low confidence, for strong and weak similarity]
Heterogeneity: Hard is better




[Figure: cases where Tot. Sep. soft > hard, marked as high or low confidence, for strong and weak similarity]
Size: Hard is better




[Figure: cases where Uni. dev. soft > hard, marked as high or low confidence, for strong and weak similarity]
HARD ENSEMBLES GIVE BETTER BUSINESS SEGMENTS
Anita Prinzie, Nicole Huyghe
anita@solutions2.be
www.solutions2.be

do we cause rising questions
References

•   Fred, A.L.N. and Jain, A.K. (2005), Combining Multiple Clusterings
    Using Evidence Accumulation, IEEE Transactions on Pattern Analysis
    and Machine Intelligence, 27(6), 835-850.
•   Lange, T., Roth, V., Braun, M.L. and Buhmann, J.M. (2004),
    Stability-Based Validation of Clustering Solutions, Neural
    Computation, 16, 1299-1323.
•   Halkidi, M., Vazirgiannis, M. and Batistakis, Y. (2000), Quality
    Scheme Assessment in the Clustering Process, Proc. of the 4th
    European Conference on Principles of Data Mining and Knowledge
    Discovery, 265-276.
•   Hubert, L. and Arabie, P. (1985), Comparing Partitions, Journal of
    Classification, 193-218.
•   Nieweglowski, L. (2007), CLV package, R software.
•   Martin, A., Quinn, K.M. and Park, J.H. (2003-2012), Markov Chain
    Monte Carlo Package (MCMCpack), R software.
