Stat405: Graphic tips & tricks

Hadley Wickham
Wednesday, 9 September 2009
1. Homework
2. Reading a scatterplot
3. Scatterplot techniques for large data
4. Iteration & story telling
5. Project & homework



Homework

Great start!
Remember the grading scheme:
4.5–5 = A+, 4–4.5 = A, 3.5–4 = A-
Shorter is better than longer.
Check aspect ratios.
Read the comments!


Revision: reading a scatterplot

• Big patterns
• Small patterns
• Deviations from the pattern
• Strange patterns




Strong linear relationship.
A number of outliers.




Unusual striations. Two groups? Little relationship between table and price?




Curved (exponential?) relationship. Outliers mostly cheaper than expected.


But what’s the problem with all these plots?
In pairs, brainstorm solutions for 2 minutes.

qplot(carat, price, data = diamonds)
Ideas

If x discrete, use boxplots.
Use semi-transparent points.
Divide into bins and count number of points in each bin (2d histogram).
Display statistical summary.
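
Two of these ideas, semi-transparent points and the 2d histogram, can be sketched as follows. This is a minimal illustration rather than the lecture's own code: it assumes the ggplot2 package is installed, uses the ggplot() interface instead of the qplot() calls shown on the slides, and picks the alpha value (1/20) and the bin counts arbitrarily. The names p_alpha, p_bins and counts are illustrative only.

```r
library(ggplot2)  # provides the diamonds data used in the slides

# Semi-transparent points: with alpha = 1/20, roughly 20 points must
# overlap before a location looks fully opaque (value chosen arbitrarily).
p_alpha <- ggplot(diamonds, aes(carat, price)) +
  geom_point(alpha = 1 / 20)

# 2d histogram: divide the plane into rectangular bins and map the
# number of points in each bin to the fill colour.
p_bins <- ggplot(diamonds, aes(carat, price)) +
  geom_bin2d(bins = 50)

# The same counting idea by hand in base R: cut each variable into
# 10 intervals, then cross-tabulate the two sets of interval labels.
counts <- table(
  cut(diamonds$carat, breaks = 10),
  cut(diamonds$price, breaks = 10)
)
```

Typing p_alpha or p_bins at the console draws the plot; counts is a 10 x 10 table whose cells sum to the number of diamonds.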



Box and whisker plots


Boxplots

Less information than a histogram, but take up much less space.
Already seen them used with discrete x values. Can also use with continuous x values, by specifying how we want the data grouped.
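
To make "specifying how we want the data grouped" concrete, here is a small sketch. It assumes ggplot2 is installed (for the diamonds data) and uses rounding to the nearest integer as the grouping rule; any function that maps the continuous value to a group label would do. The names grp, n_per_group, group_quartiles and p_grouped are illustrative only.

```r
library(ggplot2)  # for the diamonds data and geom_boxplot()

# Turn the continuous variable into group labels by rounding.
grp <- round(diamonds$table)

# Number of diamonds in each group; each group becomes one box.
n_per_group <- table(grp)

# Quartiles of price within each group: roughly the lower edge,
# middle line and upper edge of each box.
group_quartiles <- tapply(diamonds$price, grp, quantile,
                          probs = c(0.25, 0.5, 0.75))

# The corresponding plot: one box per rounded value of table.
p_grouped <- ggplot(diamonds, aes(table, price)) +
  geom_boxplot(aes(group = round(table)))
```

Groups with very few diamonds still get a box, so it is worth checking n_per_group before reading much into the boxes at the extremes.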



qplot(table, price, data = diamonds)

[Figure: a single boxplot of price (y axis, ticks at 5000, 10000, 15000) against table (x axis, 50 to 90), with a long column of outlying points above the box.]

qplot(table, price, data = diamonds, geom = "boxplot")

[Figure: boxplots of price (y axis, ticks at 5000, 10000, 15000) against table (x axis, 50 to 90), one box per rounded value of table, each with outlying points above it.]

qplot(table, price, data = diamonds, geom = "boxplot",
  group = round(table))

[Figure: plot residue; only scattered points and a y-axis tick at 15000 survive extraction.]
                                                               ●
                                                               ●
                                                               ●
                                                                   ●
                                                                   ●
                                                                   ●
                                                                       ●
                                                                       ●
                                                                       ●   ●
                                                                           ●
                                                                             ●         ●
                                                                                       ●
                                       ●   ●
                                           ●   ●
                                               ●   ●   ●   ●   ●   ●
                                                                   ●   ●   ●
                                ●      ●
                                       ●   ●
                                           ●   ●
                                               ●
                                               ●   ●
                                                   ●   ●
                                                       ●   ●
                                                           ●       ●   ●
                                       ●
                                       ●   ●   ●   ●   ●   ●       ●
                                                                   ●
                                                                   ●   ●           ●
                                       ●
                                       ●   ●   ●   ●
                                                   ●   ●           ●
                                       ●   ●
                                           ●
                                           ●   ●
                                               ●   ●
                                                   ●   ●
                                                       ●           ●
                                                                   ●       ●
                                   ●   ●
                                       ●   ●   ●   ●   ●           ●   ●   ●
                                   ●
                                   ●   ●
                                       ●   ●
                                           ●   ●
                                               ●   ●
                                                   ●               ●
                                                                   ●   ●
                                                                       ●
                                                                       ●   ●       ●●●
                                   ●
                                   ●   ●
                                       ●   ●
                                           ●   ●
                                               ●   ●
                                                   ●               ●
                                                                   ●   ●           ●
                                   ●   ●
                                       ●   ●   ●   ●
                                                   ●               ●
                                                                   ●   ●       ●
                                       ●
                                       ●   ●
                                           ●   ●   ●
                                                   ●               ●
                                                                   ●   ●   ●         ●
                                           ●   ●   ●               ●
                                                                   ●   ●
                                                                       ●   ●
                                   ●
                                       ●
                                       ●
                                       ●
                                           ●
                                           ●   ●
                                               ●
                                               ●
                                                   ●
                                                   ●
                                                   ●               ●
                                                                   ●   ●
                                                                       ●   ●       ●●●     ●
                                       ●
                                       ●   ●
                                           ●   ●
                                               ●   ●
                                                   ●               ●
                                                                   ●   ●
                                                                       ●   ●
                                                                           ●
                                       ●   ●   ●
                                               ●   ●               ●
                                                                   ●   ●   ●   ●           ●
                                       ●
                                       ●   ●
                                           ●   ●
                                               ●
                                               ●   ●
                                                   ●                   ●
                                       ●   ●   ●
                                               ●   ●                   ●   ●           ●
                                   ●
                                   ●   ●
                                       ●   ●
                                           ●   ●
                                               ●                       ●   ●   ●
                                   ●   ●
                                       ●   ●
                                           ●   ●
                                               ●                       ●
                                                                       ●   ●   ●   ●
                                   ●
                                   ●   ●   ●   ●                           ●   ●
                                       ●   ●   ●                           ●
                                   ●   ●
                                       ●   ●
                                           ●   ●
                                               ●                               ●
                                                                               ●   ●● ●        ●
                                       ●
                                       ●   ●   ●                           ●   ●
                                   ●   ●
                                       ●   ●
                                           ●   ●
                                               ●                               ●   ●
                                   ●
                                   ●   ●   ●   ●
                                               ●                           ●         ● ●
                                   ●
                                   ●   ●
                                       ●   ●
                                           ●   ●
                                               ●                           ●
                                   ●   ●   ●
                                           ●
                                           ●   ●
                                               ●                                     ●
         10000                     ●
                                   ●
                                       ●
                                       ●
                                       ●
                                           ●
                                           ●
                                           ●
                                               ●                                   ●
 price




                                   ●   ●
                                       ●   ●
                                           ●
                                   ●
                                   ●   ●   ●                                           ●
                                   ●   ●
                                       ●   ●
                                           ●
                                   ●   ●
                                       ●   ●
                                           ●
                                       ●
                                       ●   ●
                                           ●                                           ●
                                       ●
                                       ●   ●
                                       ●
                                       ●
                                       ●
                                       ●




         5000




     One boxplot for
    each unique value
     of this aesthetic
qplot(table, price, data = diamonds, geom 80 "boxplot",
               50       60         70     =         90
  group = round(table))      table
Wednesday, 9 September 2009
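To see what group = round(table) does, it may help to look at the grouping on its own. The sketch below uses a handful of made-up table values (not taken from diamonds) and base R only; each distinct rounded value would get its own boxplot.

```r
# Made-up table percentages, for illustration only
table_pct <- c(54.6, 55.2, 55.4, 57.9, 58.1, 61.0)

# Rounding maps nearby continuous values onto a shared integer label
grp <- round(table_pct)

# Number of observations that would feed each boxplot
group_sizes <- table(grp)
print(group_sizes)

# The values collected under each label
by_group <- split(table_pct, grp)
print(by_group)
```

Coarser or finer groups follow from changing the rounding, for example round(table_pct / 2) * 2 for boxes two units wide.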
Alpha blending

qplot(carat, price, data = diamonds, alpha = I(1/10))
qplot(carat, price, data = diamonds, alpha = I(1/50))
qplot(carat, price, data = diamonds, alpha = I(1/250))

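Why does a smaller alpha reveal more structure in dense regions? One way to reason about it, assuming the graphics device uses standard "over" compositing (an assumption; real devices also quantise colour), is that k overlapping points with transparency alpha have a combined opacity of 1 - (1 - alpha)^k. The helper name stacked_opacity is made up for this sketch.

```r
# Combined opacity of k identical overlapping points, each drawn with
# transparency `alpha`, assuming standard "over" compositing (an assumption)
stacked_opacity <- function(alpha, k) 1 - (1 - alpha)^k

# The three alpha values used on the slides
alphas <- c(1/10, 1/50, 1/250)

# Overlapping points needed before a location is at least 95% opaque
points_needed <- ceiling(log(1 - 0.95) / log(1 - alphas))

print(data.frame(alpha = alphas, points_needed = points_needed))
```

Under this assumption, the smaller the alpha, the more points must pile up before a region saturates, so very dense areas remain distinguishable from moderately dense ones.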
Statistical summary

qplot(carat, price, data = diamonds) + geom_smooth()
qplot(log10(carat), log10(price), data = diamonds) + geom_smooth()
qplot(log10(carat), log10(price), data = diamonds) +
  geom_smooth(method = "lm")

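A straight line on log-log axes corresponds to a power law: log10(price) = a + b * log10(carat) means price = 10^a * carat^b. The sketch below checks this on simulated data (the constants 4000 and 1.7 are invented, not estimates from diamonds), fitting the same kind of linear model that geom_smooth(method = "lm") draws.

```r
set.seed(1)

# Simulated stones following an invented power law with multiplicative noise
n_stones <- 200
carat <- runif(n_stones, 0.2, 3)
price <- 4000 * carat^1.7 * exp(rnorm(n_stones, sd = 0.1))

# Linear model on the log10 scale
fit <- lm(log10(price) ~ log10(carat))
a <- unname(coef(fit)[1])  # intercept: log10 of the multiplier
b <- unname(coef(fit)[2])  # slope: the exponent

print(c(multiplier = 10^a, exponent = b))
```

The fitted slope should land close to the exponent used in the simulation, and 10^a close to the multiplier.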
2d bins

# Very basic cleaning
diamonds$x[diamonds$x == 0] <- NA
diamonds$y[diamonds$y == 0] <- NA
diamonds$y[diamonds$y > 12] <- NA

qplot(x, y, data = diamonds)
qplot(x, y, data = diamonds, geom = "bin2d")
qplot(x, y, data = diamonds, geom = "hex")
qplot(x, y, data = diamonds, geom = "bin2d", bins = 100)
qplot(x, y, data = diamonds, geom = "hex", bins = 100)

# Zoom in
qplot(x, y, data = diamonds, geom = "bin2d", bins = 100) +
  xlim(4, 7) + ylim(4, 7)
qplot(x, y, data = diamonds, geom = "bin2d", bins = 100) +
  xlim(4, 5) + ylim(4, 5)

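The idea behind geom = "bin2d" can be sketched by hand: divide the plane into rectangles and count the points in each. The code below is a rough base R illustration on simulated x and y values with an arbitrary bin width of 0.5; it is not how ggplot2 computes its bins internally.

```r
set.seed(42)

# Simulated, strongly related x and y values (not the diamonds data)
n_points <- 1000
x <- rnorm(n_points, mean = 5.7, sd = 1)
y <- x + rnorm(n_points, sd = 0.05)

# Label each point with the lower edge of the bin it falls in
width <- 0.5
x_bin <- floor(x / width) * width
y_bin <- floor(y / width) * width

# Count points per (x_bin, y_bin) cell and keep only the occupied cells
counts <- as.data.frame(table(x_bin, y_bin), responseName = "n")
counts <- counts[counts$n > 0, ]

# The most crowded cells
print(head(counts[order(-counts$n), ]))
```

Mapping n to a fill colour for each rectangle gives a 2d histogram; a smaller width plays the same role as a larger bins value.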
qplot(x, x / y, data = diamonds, geom = "bin2d")
qplot(x, log(x / y), data = diamonds, geom = "bin2d")

clean <- subset(diamonds, abs(log(x / y)) < 0.1)

qplot(x, log(x / y), data = clean, geom = "bin2d")
qplot(x, log(x / y), data = clean, geom = "bin2d", bins = 80)

What would be a good name for log(x / y)? What other variable
might you create to go with it?

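One reason to filter on log(x / y) rather than x / y is symmetry: swapping x and y only flips the sign, so a single threshold treats both directions alike. A threshold of 0.1 keeps ratios between exp(-0.1) (about 0.905) and exp(0.1) (about 1.105). The sketch below uses a small made-up data frame, and also shows that subset() drops rows where the condition is NA.

```r
# Made-up dimensions; the last row has a missing x
dims <- data.frame(
  x = c(4.00, 5.00, 6.00, 5.50, NA),
  y = c(4.02, 5.60, 5.40, 5.50, 4.00)
)
dims$log_ratio <- log(dims$x / dims$y)

# Swapping numerator and denominator only changes the sign
symmetric <- isTRUE(all.equal(log(6.0 / 5.4), -log(5.4 / 6.0)))

# Keep stones whose x and y agree to within roughly 10%;
# rows with an NA condition are dropped as well
clean <- subset(dims, abs(log_ratio) < 0.1)
print(clean)
```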
Your turn
                   Continue to explore the relationship
                   between x, y, z and depth. Create new
                   variables as necessary.
                   (Hint: rerun the cleaning code from last
                   week, and extend it as necessary.)
                   Some good ideas here:
                   http://www.diamondhelpers.com/fivesteps/4-certified-diamonds.shtml

[Diagram of a diamond with x, table width and z labelled.]

depth = z / diameter
table = table width / x * 100

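As a quick check of the two formulas above, they can be evaluated for a single stone. The measurements below are illustrative values, table_width is a hypothetical measurement (the diamonds data frame stores only the table percentage), and taking the diameter as the mean of x and y is an assumption. Depth is scaled by 100 here so that it is a percentage like table.

```r
# Illustrative measurements in mm; table_width is hypothetical
x <- 3.95
y <- 3.98
z <- 2.43
table_width <- 2.17

# Assumption: diameter is the average of the two horizontal dimensions
diameter <- (x + y) / 2

depth_pct <- z / diameter * 100
table_pct <- table_width / x * 100

print(round(c(depth = depth_pct, table = table_pct), 1))
```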
# Flag suspiciously large values
y_big <- diamonds$y > 10
z_big <- diamonds$z > 6

# Flag zero dimensions
x_zero <- diamonds$x == 0
y_zero <- diamonds$y == 0
z_zero <- diamonds$z == 0

# Replace the flagged values with NA
diamonds$x[x_zero] <- NA
diamonds$y[y_zero | y_big] <- NA
diamonds$z[z_zero | z_big] <- NA

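The cleaning code above builds logical vectors and then uses them as indices. A small sketch of the same pattern on a made-up vector (values invented for illustration):

```r
# Made-up depths in mm: one zero and one implausibly large value
z <- c(2.4, 0, 3.1, 31.8, 4.0)

# Each comparison gives one TRUE/FALSE per element
z_zero <- z == 0
z_big <- z > 6

# `|` combines the flags element by element; flagged positions become NA
z[z_zero | z_big] <- NA

print(z)
n_missing <- sum(is.na(z))
```

Naming the flags first keeps the final assignment readable and makes it easy to count how many values each rule affects, for example with sum(z_big).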
qplot(z/y * 100, depth, data = diamonds)
last_plot() + xlim(50, 100)
last_plot() + xlim(50, 80) + ylim(50, 80)

qplot(z/x * 100, depth, data = diamonds) +
  xlim(50, 80) + ylim(50, 80)
qplot(z/x * 100, depth / (z/x), data = diamonds)
last_plot() + xlim(50, 80) + ylim(80, 120)
last_plot() + ylim(95, 105)

# ...

Iteration & stories

Stories
                   The best data analyses tell a story, with a
                   natural flow from beginning to end.
                   For homework, try to come up with
                   three plots that tell a story.
                   Stories about a small sample of the data
                   can work well.

qplot(cty, hwy, data = mpg)
qplot(cty, hwy, data = mpg, geom = "jitter")
qplot(cty, hwy, data = mpg, geom = "jitter", colour = class)
qplot(cty, cty / hwy, data = mpg, geom = "jitter", colour = class)
qplot(cty, cty / hwy, data = mpg, colour = class)
qplot(displ, cty / hwy, data = mpg, colour = class)
qplot(displ, cty / hwy, data = mpg) + facet_wrap(~ class)
qplot(displ, cty / hwy, data = mpg) + facet_wrap(~ class) +
  geom_smooth(se = FALSE)
qplot(displ, cty / hwy, data = mpg) + facet_wrap(~ class) +
  geom_smooth(method = "lm", se = FALSE)

qplot(displ, cty, data = mpg) + facet_wrap(~ class)

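The second plot in the sequence above switches to geom = "jitter" because cty and hwy are recorded as whole numbers, so many cars sit on exactly the same spot. The sketch below illustrates the effect with simulated integer values (not the mpg data) and base R's jitter(), which adds a small amount of random noise; ggplot2's own jittering may use different defaults.

```r
set.seed(7)

# Simulated whole-number mileage figures
n_cars <- 200
cty <- sample(10:30, n_cars, replace = TRUE)
hwy <- cty + sample(5:9, n_cars, replace = TRUE)

# Points that land exactly on top of an earlier point
n_overlap_before <- sum(duplicated(data.frame(cty, hwy)))

# After adding a little noise, exact overlaps disappear
jittered <- data.frame(cty = jitter(cty), hwy = jitter(hwy))
n_overlap_after <- sum(duplicated(jittered))

print(c(before = n_overlap_before, after = n_overlap_after))
```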
Project
                   Due in 3.5 weeks.
                   Bigger group data analysis project. (We will
                   discuss group dynamics on Monday.)
                   Homework is to get you started working
                   with the data.

Next week

                   Checking on a slot machine.
                   Learning how to write functions.
                   Basics of simulation.

Feedback
     http://hadley.wufoo.com/forms/stat405-feedback/

05 Tips Tricks

  • 1. Stat405 Graphic tips & tricks Hadley Wickham Wednesday, 9 September 2009
  • 2. 1. Homework 2. Reading a scatterplot 3. Scatterplot techniques for large data 4. Iteration & story telling 5. Project & homework Wednesday, 9 September 2009
  • 3. Homework Great start! Remember the grading scheme: 4.5–5 = A+, 4–4.5 = A, 3.5–4 = A- Shorter is better than longer. Check aspect ratios. Read the comments! Wednesday, 9 September 2009
  • 4. Revision: reading a scatterplot • Big patterns • Small patterns • Deviations from the pattern • Strange patterns Wednesday, 9 September 2009
  • 6. Strong linear relationship. A number of outliers. Wednesday, 9 September 2009
  • 8. Unusual striations. Two groups? Little relationship between table and price? Wednesday, 9 September 2009
  • 10. Curved (exponential?) relationship. Outliers mostly cheaper than expected. Wednesday, 9 September 2009
  • 11. But what’s the problem with all these plots? qplot(carat, price, data = diamonds) Wednesday, 9 September 2009
  • 12. But what’s the problem with all these plots? In pairs, brainstorm solutions for 2 minutes. qplot(carat, price, data = diamonds) Wednesday, 9 September 2009
  • 13. Ideas: If x is discrete, use boxplots. Use semi-transparent points. Divide into bins and count the number of points in each bin (2d histogram). Display a statistical summary.
  • 14. Box and whisker plots
  • 15. Boxplots: Less information than a histogram, but they take up much less space. Already seen them used with discrete x values. Can also use them with continuous x values, by specifying how we want the data grouped.
  • 16. qplot(table, price, data = diamonds)
  • 17. qplot(table, price, data = diamonds, geom = "boxplot") [Figure: boxplot of price (y) against table (x)]
  • 18. qplot(table, price, data = diamonds, geom = "boxplot", group = round(table)) [Figure: boxplots of price (y) against table (x), grouped by rounded table value]
  • 19. qplot(table, price, data = diamonds, geom = "boxplot", group = round(table)) Annotation on the group argument: one boxplot for each unique value of this aesthetic. [Figure: boxplots of price (y) against table (x), grouped by rounded table value]
  • 20. Alpha blending
  • 21. qplot(carat, price, data = diamonds, alpha = I(1/10))
  • 22. qplot(carat, price, data = diamonds, alpha = I(1/50))
  • 23. qplot(carat, price, data = diamonds, alpha = I(1/250))
  • 25. qplot(carat, price, data = diamonds) + geom_smooth()
  • 26. qplot(log10(carat), log10(price), data = diamonds) + geom_smooth()
  • 27. qplot(log10(carat), log10(price), data = diamonds) + geom_smooth(method = "lm")
  • 28. 2d bins
  • 29. # Very basic cleaning
        diamonds$x[diamonds$x == 0] <- NA
        diamonds$y[diamonds$y == 0] <- NA
        diamonds$y[diamonds$y > 12] <- NA
        qplot(x, y, data = diamonds)
        qplot(x, y, data = diamonds, geom = "bin2d")
        qplot(x, y, data = diamonds, geom = "hex")
        qplot(x, y, data = diamonds, geom = "bin2d", bins = 100)
        qplot(x, y, data = diamonds, geom = "hex", bins = 100)
        # Zoom in
        qplot(x, y, data = diamonds, geom = "bin2d", bins = 100) + xlim(4, 7) + ylim(4, 7)
        qplot(x, y, data = diamonds, geom = "bin2d", bins = 100) + xlim(4, 5) + ylim(4, 5)
  • 30. qplot(x, x / y, data = diamonds, geom = "bin2d")
        qplot(x, log(x / y), data = diamonds, geom = "bin2d")
        clean <- subset(diamonds, abs(log(x / y)) < 0.1)
        qplot(x, log(x / y), data = clean, geom = "bin2d")
        qplot(x, log(x / y), data = clean, geom = "bin2d", bins = 80)
  • 31. qplot(x, x / y, data = diamonds, geom = "bin2d")
        qplot(x, log(x / y), data = diamonds, geom = "bin2d")
        clean <- subset(diamonds, abs(log(x / y)) < 0.1)
        qplot(x, log(x / y), data = clean, geom = "bin2d")
        qplot(x, log(x / y), data = clean, geom = "bin2d", bins = 80)
        What would be a good name for log(x / y)? What other variable might you create to go with it?
  • 32. Your turn: Continue to explore the relationship between x, y, z and depth. Create new variables as necessary. (Hint: rerun the cleaning code from last week, and create more as necessary.) Some good ideas here: http://www.diamondhelpers.com/fivesteps/4-certified-diamonds.shtml
  • 33. [Figure: diamond diagram with labels x, table width and z] depth = z / diameter; table = table width / x * 100
  • 34. y_big <- diamonds$y > 10
        z_big <- diamonds$z > 6
        x_zero <- diamonds$x == 0
        y_zero <- diamonds$y == 0
        z_zero <- diamonds$z == 0
        diamonds$x[x_zero] <- NA
        diamonds$y[y_zero | y_big] <- NA
        diamonds$z[z_zero | z_big] <- NA
  • 35. qplot(z/y * 100, depth, data = diamonds)
        last_plot() + xlim(50, 100)
        last_plot() + xlim(50, 80) + ylim(50, 80)
        qplot(z/x * 100, depth, data = diamonds) + xlim(50, 80) + ylim(50, 80)
        qplot(z/x * 100, depth / (z/x), data = diamonds)
        last_plot() + xlim(50, 80) + ylim(80, 120)
        last_plot() + ylim(95, 105)
        # ...
  • 36. Iteration & stories
  • 37. Stories: The best data analyses tell a story, with a natural flow from beginning to end. For homework, try to come up with three plots that tell a story. Stories about a small sample of the data can work well.
  • 38. qplot(cty, hwy, data = mpg)
        qplot(cty, hwy, data = mpg, geom = "jitter")
        qplot(cty, hwy, data = mpg, geom = "jitter", colour = class)
        qplot(cty, cty / hwy, data = mpg, geom = "jitter", colour = class)
        qplot(cty, cty / hwy, data = mpg, colour = class)
        qplot(displ, cty / hwy, data = mpg, colour = class)
        qplot(displ, cty / hwy, data = mpg) + facet_wrap(~ class)
        qplot(displ, cty / hwy, data = mpg) + facet_wrap(~ class) + geom_smooth(se = F)
        qplot(displ, cty / hwy, data = mpg) + facet_wrap(~ class) + geom_smooth(method = "lm", se = F)
        qplot(displ, cty, data = mpg) + facet_wrap(~ class)
  • 39. Project: Due in 3.5 weeks. Bigger group data analysis project. (Will be discussing group dynamics on Monday.) Homework is to get you started working with the data.
  • 40. Next week: Checking on a slot machine. Learning how to write functions. Basics of simulation.
  • 41. Feedback: http://hadley.wufoo.com/forms/stat405-feedback/
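
The 2d-bin idea from slide 13 (divide into bins and count the number of points in each bin) can be sketched in base R without any plotting. This is only an illustration under stated assumptions: the helper name count_2d_bins and the simulated x and y values are hypothetical and do not come from the slides, and the "bin2d" geom used in the slides may choose its bins differently.

```r
# Hypothetical helper (not from the slides): count points in each cell
# of an equal-width rectangular grid, i.e. a 2d histogram.
count_2d_bins <- function(x, y, bins = 30) {
  # Drop pairs with a missing value, as the cleaning code on the slides
  # replaces implausible measurements with NA
  ok <- !is.na(x) & !is.na(y)
  x <- x[ok]
  y <- y[ok]

  # Equal-width breaks spanning the observed range of each variable
  x_breaks <- seq(min(x), max(x), length.out = bins + 1)
  y_breaks <- seq(min(y), max(y), length.out = bins + 1)

  # Assign each point to a bin; include.lowest keeps the minimum value
  x_bin <- cut(x, x_breaks, include.lowest = TRUE)
  y_bin <- cut(y, y_breaks, include.lowest = TRUE)

  # Cross-tabulate: one count per (x bin, y bin) cell, empty cells included
  table(x_bin, y_bin)
}

# Simulated data standing in for two strongly related measurements
set.seed(405)
x <- rnorm(10000, mean = 5.7, sd = 1)
y <- x + rnorm(10000, sd = 0.05)

counts <- count_2d_bins(x, y, bins = 10)
dim(counts)  # a 10 x 10 grid of cells
sum(counts)  # every non-missing point falls in exactly one cell
```

Equal-width bins keep the sketch short; with heavily overplotted data the counts, rather than the individual points, carry the information about where the data are dense.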