In this 30-minute ‘How To’ Webinar, author and NewMR founder Ray Poynter will cover:

- What is correlation?

- What is r-squared?

- When and why should we use correlation?

- How should we use correlation?

- Potential problems with correlation

- Alternatives to correlation

The webinar recording can be accessed via the NewMR Play Again page: https://newmr.org/play-again

Published in: Education


- 1. How To Use Correlations To Find Stories In The Data May 2019 Ray Poynter NewMR
- 2. Today’s Plot 1. What is correlation? 2. The causes of correlation 3. Finding stories with correlation 4. Beyond correlation 5. Question and Answer
- 3. Correlation • Measures the linear association between 2 metric scales • There are correlation measures for nonmetric scales • It produces numbers between -1 and +1 • Where: +1 is perfect correlation, 0 is no correlation, and -1 is perfect negative correlation • We use the letter r to refer to the most common form of the correlation coefficient • We often square the correlation coefficient to get r‑squared, which we often express as a percentage
- 4. Perfect Correlation When we drive a car, the amount of fuel in the tank is negatively correlated with how far we have driven – until we add more fuel.
- 5. Strong Correlation • Correlation coefficients are not intuitive • We often use r-squared (r²), which is expressed in terms of variance • r-squared is the proportion of the total variance that is shared • 0.9 * 0.9 is 0.81 (often written 81%) • 81% of the variance is shared – and 19% is not
- 6. A typical strong correlation • In the real world, 0.7 (or -0.7) is a strong relationship • r² for 0.7 (and -0.7) is 49% • 49% of the total variation is shared (and 51% is not)
- 7. Notable correlations • In the real world, 0.5 (or -0.5) might be interesting • r² for 0.5 (and -0.5) is 25% • 25% of the total variation is shared (and 75% is not)
- 8. No (linear) relationship • In the real world, 0 is pretty rare • The process of measurement often creates some correlation • Selecting questions for a study often implies association
- 9. Today’s Plot 1. What is correlation? 2. The causes of correlation 3. Finding stories with correlation 4. Beyond correlation 5. Question and Answer
- 10. What ‘causes’ correlation “Correlation does not imply causality” If X is correlated with Y then we can usually say • X causes Y, or • Y causes X, or • X and Y are both caused by a common agent Z, or • X and Y both cause each other in a feedback system, or • It is just chance But if X is correlated with Y, we should investigate why
- 11. Smoking and Lung Cancer • In 1950, in the UK, Richard Doll and Austin Bradford Hill conducted statistical research for the Medical Research Council • They discovered a correlation between the amount of tobacco smoked and lung cancer • They published findings and warnings in the British Medical Journal in 1950 • A 1954 British Doctors Study confirmed the correlation, leading to UK Government advice that smoking and cancer were related • Proof had to wait until the science could show the causative mechanism. [Photos: Richard Doll, Austin Bradford Hill]
- 12. What ‘causes’ correlation “Correlation does not imply causality” If X is correlated with Y then we can usually say • X causes Y, or • Y causes X, or • X and Y are both caused by a common agent Z, or • X and Y both cause each other in a feedback system, or • It is just chance But if X is correlated with Y, we should investigate why
- 13. Lots of spurious correlations http://www.tylervigen.com/spurious-correlations
- 14. Spurious Correlations r = 0.79, r² = 63%
- 15. Today’s Plot 1. What is correlation? 2. The causes of correlation 3. Finding stories with correlation 4. Beyond correlation 5. Question and Answer
- 16. Knowing where to look Correlation can help make the map (the data) more readable
- 17. Knowing where to look Correlation can help make the map (the data) more readable
- 18. Satisfaction with features and the link to NPS
- 19. Correlation between NPS and feature satisfaction

  | Feature (original order) | NPS correlation | Feature (sorted) | NPS correlation |
  |---|---|---|---|
  | Bed | 0.80 | Bed | 0.80 |
  | Restaurant | 0.58 | Location | 0.62 |
  | Price | 0.39 | Restaurant | 0.58 |
  | TV | 0.26 | Check in | 0.43 |
  | Check in | 0.43 | Check out | 0.40 |
  | Check out | 0.40 | Price | 0.39 |
  | Location | 0.62 | Bathroom | 0.31 |
  | Internet | 0.27 | Internet | 0.27 |
  | Bathroom | 0.31 | TV | 0.26 |
  | Mini-bar | 0.20 | Mini-bar | 0.20 |

  With story finding, sorting is key
- 20. The ratios between the features r-squared

  | Feature | NPS r-squared |
  |---|---|
  | Bed | 64% |
  | Location | 39% |
  | Restaurant | 34% |
  | Check in | 19% |
  | Check out | 16% |
  | Price | 15% |
  | Bathroom | 10% |
  | Internet | 7% |
  | TV | 7% |
  | Mini-bar | 4% |
- 21. ‘Drivers’ of Choice – Derived Importance [Quadrant chart: satisfaction (high/low) against importance (high/low), plotting Bed, Location, Restaurant, Check in, Check out, Price, Bathroom, Internet, TV, Mini-bar]
- 22. The ratios between the features r-squared

  | Feature | NPS r-squared |
  |---|---|
  | Bed | 64% |
  | Location | 39% |
  | Restaurant | 34% |
  | Check in | 19% |
  | Check out | 16% |
  | Price | 15% |
  | Bathroom | 10% |
  | Internet | 7% |
  | TV | 7% |
  | Mini-bar | 4% |

  Add the r-squared values together = 214%. Why? Correlations between the scores of the features: multicollinearity
- 23. Multicollinearity

  | | NPS | Bed | Restaurant | Price | TV | Check in | Check out | Location | Internet | Bathroom | Mini-bar |
  |---|---|---|---|---|---|---|---|---|---|---|---|
  | NPS | 1.00 | 0.80 | 0.58 | 0.39 | 0.26 | 0.43 | 0.40 | 0.62 | 0.27 | 0.31 | 0.20 |
  | Bed | 0.80 | 1.00 | 0.56 | 0.41 | 0.21 | 0.43 | 0.41 | 0.41 | 0.32 | 0.45 | 0.19 |
  | Restaurant | 0.58 | 0.56 | 1.00 | 0.02 | 0.05 | 0.52 | 0.45 | 0.49 | -0.01 | 0.30 | 0.22 |
  | Price | 0.39 | 0.41 | 0.02 | 1.00 | 0.41 | -0.08 | 0.03 | 0.08 | 0.15 | 0.00 | 0.03 |
  | TV | 0.26 | 0.21 | 0.05 | 0.41 | 1.00 | -0.09 | 0.01 | -0.05 | 0.53 | -0.07 | 0.33 |
  | Check in | 0.43 | 0.43 | 0.52 | -0.08 | -0.09 | 1.00 | 0.71 | 0.30 | 0.14 | 0.16 | 0.07 |
  | Check out | 0.40 | 0.41 | 0.45 | 0.03 | 0.01 | 0.71 | 1.00 | 0.55 | 0.11 | 0.23 | 0.16 |
  | Location | 0.62 | 0.41 | 0.49 | 0.08 | -0.05 | 0.30 | 0.55 | 1.00 | 0.05 | -0.02 | 0.13 |
  | Internet | 0.27 | 0.32 | -0.01 | 0.15 | 0.53 | 0.14 | 0.11 | 0.05 | 1.00 | -0.19 | 0.10 |
  | Bathroom | 0.31 | 0.45 | 0.30 | 0.00 | -0.07 | 0.16 | 0.23 | -0.02 | -0.19 | 1.00 | -0.02 |
  | Mini-bar | 0.20 | 0.19 | 0.22 | 0.03 | 0.33 | 0.07 | 0.16 | 0.13 | 0.10 | -0.02 | 1.00 |
- 24. Today’s Plot 1. What is correlation? 2. The causes of correlation 3. Finding stories with correlation 4. Beyond correlation 5. Question and Answer
- 25. Factor Analysis: Factor Solution

  | | Factor 1 | Factor 2 | Factor 3 | Factor 4 |
  |---|---|---|---|---|
  | Check In | 0.87 | 0.07 | 0.05 | 0.08 |
  | TV | 0.83 | -0.01 | -0.06 | 0.02 |
  | Check Out | 0.71 | 0.08 | -0.01 | -0.09 |
  | Bed | 0.70 | 0.21 | 0.04 | 0.32 |
  | Bathroom | 0.09 | 0.98 | 0.11 | -0.04 |
  | Mini bar | 0.16 | 0.97 | 0.05 | 0.04 |
  | Price | -0.05 | 0.28 | 0.84 | -0.14 |
  | Restaurant | -0.04 | -0.05 | 0.76 | 0.21 |
  | Location | 0.19 | -0.03 | 0.62 | -0.53 |
  | Internet | 0.16 | -0.02 | 0.04 | 0.87 |
  | Variance | 29% | 20% | 15% | 11% |
  | Cumulative Variance | 29% | 49% | 64% | 75% |

  NewMR Webinar - Introduction to Factor Analysis https://newmr.org/blog/introduction-to-factor-analysis/
- 26. Drivers of choice, importance etc • Correlation • Easy, stable, but less rigorous – a gateway drug for Advanced Analytics • Regression • Shapley Values • Latent Class • Conjoint Analysis • Path Analysis
- 27. Correlation versus Regression • Correlation shows the goodness of fit [Scatter plot with fitted line y = 2x + 1, R² = 0.67028] • Regression shows the scale of the relationship [Scatter plot with fitted line y = 2x + 1, R² = 1]
- 28. Uses of correlation • To show patterns in the data, associations between items and between attributes • As a measure of importance, attributes that correlate with an outcome (e.g. satisfaction) might be more important • To suggest areas of investigation – the link between smoking and lung cancer started as a correlation, leading to the discovery of the causal link • In the form of r-squared to assess the quality of a model – usually in terms of consistency rather than validity
- 29. Thank You Ray Poynter NewMR Connect with me via Twitter: @RayPoynter LinkedIn: linkedin.com/in/raypoynter
- 30. Q & A May 2019 Ray Poynter
- 31. #NewMR Sponsors May 2019 Communication Gold Silver
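
Slides 3 and 4 describe the correlation coefficient r and the fuel-tank example of a perfect negative correlation. The sketch below is one way to compute the Pearson coefficient by hand in Python. The tank size, fuel consumption and distances are invented for illustration and do not come from the deck, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical trip: a 50-litre tank burning 0.08 litres per km (assumed values).
distance_km = [0, 50, 100, 150, 200, 250]
fuel_litres = [50 - 0.08 * d for d in distance_km]

r = pearson_r(distance_km, fuel_litres)
print(f"r = {r:.2f}, r-squared = {r * r:.0%}")
```

Because the fuel level here is an exact linear function of distance, r should come out at -1 (up to floating-point rounding), matching the "perfect negative correlation" end of the scale on slide 3.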
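
Slides 19, 20 and 22 move from correlations with NPS to r-squared values and then add them up. A minimal sketch of that arithmetic follows, using the two-decimal correlations printed on slide 19. Because those inputs are already rounded, a few percentages differ by a point from slide 20, and the total comes to roughly 213% rather than the 214% shown on slide 22.

```python
# Correlations with NPS, copied from slide 19 (two decimal places).
nps_correlation = {
    "Bed": 0.80, "Restaurant": 0.58, "Price": 0.39, "TV": 0.26,
    "Check in": 0.43, "Check out": 0.40, "Location": 0.62,
    "Internet": 0.27, "Bathroom": 0.31, "Mini-bar": 0.20,
}

# Square each coefficient to get the shared variance (r-squared).
r_squared = {feature: r ** 2 for feature, r in nps_correlation.items()}

# Sort from strongest to weakest, since slide 19 stresses sorting for story finding.
ranked = sorted(r_squared.items(), key=lambda item: item[1], reverse=True)

for feature, share in ranked:
    print(f"{feature:<10} {share:5.0%}")

# The shares add up to far more than 100%, which is the puzzle slide 22 raises.
total = sum(r_squared.values())
print(f"Sum of r-squared values: {total:.0%}")
```

A total above 100% is only possible because the features overlap with one another, which is the multicollinearity point made on slides 22 and 23.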
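
Slide 23 explains the inflated r-squared total through correlations between the features themselves. One simple way to surface those overlaps is to scan the feature-by-feature part of the matrix for strongly correlated pairs, as sketched below. The 0.5 cut-off is an assumption chosen for illustration (slide 7 only suggests that 0.5 "might be interesting"), not a rule from the deck.

```python
features = ["Bed", "Restaurant", "Price", "TV", "Check in", "Check out",
            "Location", "Internet", "Bathroom", "Mini-bar"]

# Feature-by-feature correlations from slide 23 (NPS row and column omitted).
corr = [
    [1.00, 0.56, 0.41, 0.21, 0.43, 0.41, 0.41, 0.32, 0.45, 0.19],      # Bed
    [0.56, 1.00, 0.02, 0.05, 0.52, 0.45, 0.49, -0.01, 0.30, 0.22],     # Restaurant
    [0.41, 0.02, 1.00, 0.41, -0.08, 0.03, 0.08, 0.15, 0.00, 0.03],     # Price
    [0.21, 0.05, 0.41, 1.00, -0.09, 0.01, -0.05, 0.53, -0.07, 0.33],   # TV
    [0.43, 0.52, -0.08, -0.09, 1.00, 0.71, 0.30, 0.14, 0.16, 0.07],    # Check in
    [0.41, 0.45, 0.03, 0.01, 0.71, 1.00, 0.55, 0.11, 0.23, 0.16],      # Check out
    [0.41, 0.49, 0.08, -0.05, 0.30, 0.55, 1.00, 0.05, -0.02, 0.13],    # Location
    [0.32, -0.01, 0.15, 0.53, 0.14, 0.11, 0.05, 1.00, -0.19, 0.10],    # Internet
    [0.45, 0.30, 0.00, -0.07, 0.16, 0.23, -0.02, -0.19, 1.00, -0.02],  # Bathroom
    [0.19, 0.22, 0.03, 0.33, 0.07, 0.16, 0.13, 0.10, -0.02, 1.00],     # Mini-bar
]

THRESHOLD = 0.5  # hypothetical cut-off for "notably correlated"; not from the slides

pairs = []
for i in range(len(features)):
    for j in range(i + 1, len(features)):  # upper triangle only, skipping the diagonal
        if abs(corr[i][j]) >= THRESHOLD:
            pairs.append((features[i], features[j], corr[i][j]))

# Strongest overlaps first.
pairs.sort(key=lambda p: abs(p[2]), reverse=True)
for a, b, r in pairs:
    print(f"{a} / {b}: r = {r:.2f}")
```

With this cut-off the scan flags five pairs, led by Check in and Check out at 0.71. Features that move together like this share some of the same variance with NPS, so their individual r-squared values cannot simply be added.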
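
Slide 27 contrasts two charts that share the fitted line y = 2x + 1 but have different R² values. The sketch below reproduces that idea with a hand-written least-squares fit. The data points are illustrative and are not the data behind the slide: the noise pattern is an assumption, picked so that it sums to zero and is uncorrelated with x, which leaves the fitted line unchanged while lowering R².

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, plus R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (unexplained variation / total variation).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

xs = list(range(8))
exact = [2 * x + 1 for x in xs]

# Illustrative noise (an assumption, not the slide's data).
noise = [3, -3, -3, 3, 3, -3, -3, 3]
noisy = [y + e for y, e in zip(exact, noise)]

for label, ys in (("exact", exact), ("noisy", noisy)):
    slope, intercept, r2 = fit_line(xs, ys)
    print(f"{label}: y = {slope:.2f}x + {intercept:.2f}, R-squared = {r2:.2f}")
```

Both datasets give the same slope and intercept, so the regression equation reports the same scale of relationship. Only R² changes (1.00 for the exact data, 0.70 for the noisy data), which is the goodness-of-fit information that correlation carries.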
