Stat405: Graphics for large data

Hadley Wickham
Thursday, 26 August 2010
Majoring in Stat

                    • Declare early (even if you’re not sure)
                    • Weekly lunches
                    • Summer opportunities
                      (research & internships)




1. Leftovers from last lecture
                2. The diamonds data
                3. Histograms and bar charts
                4. More boxplots and scatterplots
                5. Homework



# Remember: start with
library(ggplot2)

qplot(reorder(class, hwy), hwy, data = mpg, geom = "jitter")

[Figure: jittered points of hwy (y axis, about 15 to 40) against reorder(class, hwy) (x axis: pickup, suv, minivan, 2seater, midsize, subcompact, compact)]
qplot(reorder(class, hwy), hwy, data = mpg, geom = "boxplot")

[Figure: boxplots of hwy (y axis, about 15 to 40) for each level of reorder(class, hwy) (x axis: pickup, suv, minivan, 2seater, midsize, subcompact, compact)]
qplot(reorder(class, hwy), hwy, data = mpg,
  geom = c("jitter", "boxplot"))

[Figure: jittered points with boxplots drawn over them; y axis hwy (about 15 to 40), x axis reorder(class, hwy) (pickup, suv, minivan, 2seater, midsize, subcompact, compact)]
Your turn

                    Read the help for reorder. Redraw the
                    previous plots with class ordered by
                    median hwy.
                    How would you put the jittered points on
                    top of the boxplots?
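
One possible approach to this exercise, as a sketch rather than the only answer. It assumes that reorder() accepts a summary function through its FUN argument (the default is mean) and that qplot() draws geoms in the order they are listed. qplot() is kept to match the slides; the name class_by_median is only illustrative.

```r
library(ggplot2)

# Order class by median hwy: reorder() takes the summary function as FUN
# (its default is mean)
class_by_median <- reorder(mpg$class, mpg$hwy, FUN = median)
levels(class_by_median)

# Geoms are drawn in the order given, so list "boxplot" first
# to have the jittered points drawn on top of the boxes
p <- qplot(reorder(class, hwy, FUN = median), hwy, data = mpg,
  geom = c("boxplot", "jitter"))
p
```

Swapping the order back to c("jitter", "boxplot") reproduces the earlier plot, where the boxes cover some of the points.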




Diamonds



Diamonds data
                    ~54,000 round diamonds from
                    http://www.diamondse.info/
                    Carat, colour, clarity, cut
                    Total depth, table, depth,
                    width, height
                    Price


[Diagram: a diamond with x, table width, and z labelled]

depth = z / diameter
table = table width / x * 100
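
The depth formula can be checked roughly against the data. This is a sketch under two assumptions that the slide does not state: that "diameter" means the average of the x and y columns, and that depth is recorded as a percentage. The table formula cannot be checked the same way, because the table width itself is not a column in the dataset.

```r
library(ggplot2)  # provides the diamonds data

# Assumption: "diameter" is the average of x and y, and depth is a percentage
diameter <- (diamonds$x + diamonds$y) / 2
depth_calc <- diamonds$z / diameter * 100

# Compare computed and recorded depth; a few rows have zero dimensions,
# so expect some missing or infinite differences
summary(depth_calc - diamonds$depth)
```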

Recall

                    Write down five ways to inspect the
                    diamonds dataset.
                    You have one minute!
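
For reference, five common ways to inspect a data frame in R are sketched below. These are standard base R functions; the slide does not say which five the lecture had in mind.

```r
library(ggplot2)  # provides the diamonds data

head(diamonds)      # first six rows
str(diamonds)       # type and first values of each column
summary(diamonds)   # per-column summaries
dim(diamonds)       # number of rows and columns
names(diamonds)     # column names
```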




Your turn


                    Inspect the data and familiarise yourself
                    with the variables. If you don’t know what
                    they mean, look them up on Wikipedia.




Histogram & bar charts


Histograms and bar charts

                    Used to display the distribution of a
                    variable
                    Categorical variable → bar chart
                    Continuous variable → histogram




Always experiment with the bin width!
Examples
                # With only one variable, qplot guesses that
                # you want a bar chart or histogram
                qplot(cut, data = diamonds)

                qplot(carat, data = diamonds)
                qplot(carat, data = diamonds, binwidth = 1)
                qplot(carat, data = diamonds, binwidth = 0.1)
                qplot(carat, data = diamonds, binwidth = 0.01)
                resolution(diamonds$carat)

                last_plot() + xlim(0, 3)


Common ggplot2 technique: adding together plot components, as in last_plot() + xlim(0, 3).
qplot(table, data = diamonds, binwidth = 1)

# To zoom in on a plot region use xlim() and ylim()
qplot(table, data = diamonds, binwidth = 1) +
  xlim(50, 70)
qplot(table, data = diamonds, binwidth = 0.1) +
  xlim(50, 70)
qplot(table, data = diamonds, binwidth = 0.1) +
  xlim(50, 70) + ylim(0, 50)

# Note that this type of zooming discards data
# outside of the plot regions
# See coord_cartesian() for an alternative
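
The alternative mentioned in the last comment could look like the sketch below. coord_cartesian() changes the visible region without dropping observations, so the bins are still computed from all the data and only the view is cropped. The limits reuse the illustrative values from the examples above.

```r
library(ggplot2)

# Zoom by changing the coordinate system instead of the scale limits:
# all rows still contribute to the binning, only the view is cropped
p <- qplot(table, data = diamonds, binwidth = 0.1) +
  coord_cartesian(xlim = c(50, 70), ylim = c(0, 50))
p
```

Bars taller than the upper y limit are clipped at the edge of the panel here, whereas with ylim(0, 50) they are removed from the plot entirely.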


Additional variables

                    As with scatterplots, you can use aesthetics
                    or faceting. Using aesthetics creates
                    pretty, but ineffective, plots.
                    The following examples show the
                    difference when investigating the
                    relationship between cut and depth.



qplot(depth, data = diamonds, binwidth = 0.2)

[Figure: histogram of depth; x axis depth (56 to 70), y axis count (0 to 4000)]
qplot(depth, data = diamonds, binwidth = 0.2,
  fill = cut) + xlim(55, 70)

[Figure: histogram of depth with bars filled by cut (Fair, Good, Very Good, Premium, Ideal); x axis depth (56 to 70), y axis count (0 to 4000)]
Fill is the aesthetic for fill colour.
qplot(depth, data = diamonds, binwidth = 0.2) +
  xlim(55, 70) + facet_wrap(~ cut)

[Figure: histograms of depth, one panel per cut (Fair, Good, Very Good, Premium, Ideal); x axis depth (56 to 70), y axis count (0 to 2500)]
Your turn

                    Explore the distribution of price.
                    How does it vary with colour, cut, and
                    clarity?
                    Practice zooming in on regions of interest.
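
A possible starting point for this exercise, using only techniques from the earlier slides. The bin widths and the zoom range are illustrative choices, not values from the lecture, and the names p_cut and p_zoom are only for this sketch.

```r
library(ggplot2)

# Overall distribution of price; try several bin widths
qplot(price, data = diamonds, binwidth = 500)
qplot(price, data = diamonds, binwidth = 100)

# Compare the distribution across cut with facets
p_cut <- qplot(price, data = diamonds, binwidth = 500) +
  facet_wrap(~ cut)
p_cut

# Zoom in on cheaper diamonds (xlim() discards data outside the range)
p_zoom <- qplot(price, data = diamonds, binwidth = 50) +
  xlim(0, 2500)
p_zoom
```

Replacing cut with color or clarity in facet_wrap() gives the corresponding comparisons for the other two variables.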




Box and whisker plots


Boxplots

                    Less information than a histogram, but
                    take up much less space.
                    Already seen them used with discrete x
                    values. Can also use with continuous x
                    values, by specifying how we want the
                    data grouped.
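
Specifying the grouping for a continuous x could be sketched as below. Rounding table to the nearest whole number with base R's round() is an illustrative choice and an assumption of this sketch; any expression that maps each observation to a bin would work as the group aesthetic.

```r
library(ggplot2)

# Group the continuous x variable into unit-wide bins, one boxplot per bin
# (rounding to the nearest whole number is an illustrative choice)
p <- qplot(table, price, data = diamonds, geom = "boxplot",
  group = round(table))
p
```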



qplot(table, price, data = diamonds)
qplot(table, price, data = diamonds, geom = "boxplot")

[Figure: boxplot of price against table; x axis table (50 to 90), y axis price (up to about 15000)]
                                        ●   ●
                                            ●   ●
                                                ●   ●   ●   ●   ●   ●     ●
                                                                          ●     ●
                                    ●   ●   ●
                                            ●   ●
                                                ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●   ●
                               ●
                               ●    ●   ●   ●   ●   ●   ●
                                                        ●   ●   ●   ●
                                                                    ●   ● ●
                               ●    ●   ●
                                        ●   ●
                                            ●   ●   ●
                                                    ●   ●
                                                        ●
                                                        ●   ●
                                                            ●   ●   ●   ●
                           ●   ●
                             ● ●    ●   ●   ●   ●
                                                ●   ●
                                                    ●   ●   ●
                                                            ●   ●   ●
                               ●    ●   ●   ●
                                            ●
                                            ●   ●
                                                ●   ●   ●   ●   ●   ●           ●
                               ●    ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●           ●●
                                                                                ●
                               ●    ●
                                    ●   ●   ●   ●   ●   ●   ●
                                                            ●   ●   ●
                               ●    ●   ●
                                        ●
                                        ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●
                                                                    ●
                                    ●   ●   ●       ●
                                                    ●   ●   ●           ● ●
         15000                 ●
                               ●
                                    ●
                                    ●
                                    ●
                                        ●
                                        ●
                                        ●
                                        ●
                                            ●
                                            ●
                                            ●
                                                ●
                                                ●
                                                ●
                                                ●
                                                    ●
                                                    ●
                                                    ●
                                                        ●
                                                        ●
                                                        ●
                                                        ●
                                                            ●
                                                            ●
                                                            ●
                                                                ●
                                                                ●
                                                                ●
                                                                    ●
                                                                    ●
                                                                    ●   ●
                                                                        ●
                                                                          ●         ●
                                                                                    ●
                                    ●   ●
                                        ●   ●
                                            ●   ●   ●   ●   ●   ●
                                                                ●   ●   ●
                             ●      ●
                                    ●   ●
                                        ●   ●
                                            ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●       ●   ●
                                    ●
                                    ●   ●   ●   ●   ●   ●       ●
                                                                ●
                                                                ●   ●           ●
                                    ●
                                    ●   ●   ●   ●
                                                ●   ●           ●
                                    ●   ●
                                        ●
                                        ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●           ●
                                                                ●       ●
                                ●   ●
                                    ●   ●   ●   ●   ●           ●   ●   ●
                                ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●               ●
                                                                ●   ●
                                                                    ●
                                                                    ●   ●       ●●●
                                ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●               ●
                                                                ●   ●           ●
                                ●   ●
                                    ●   ●   ●   ●
                                                ●               ●
                                                                ●   ●       ●
                                    ●
                                    ●   ●
                                        ●   ●   ●
                                                ●               ●
                                                                ●   ●   ●         ●
                                        ●   ●   ●               ●
                                                                ●   ●
                                                                    ●   ●
                                ●
                                    ●
                                    ●
                                    ●
                                        ●
                                        ●   ●
                                            ●
                                            ●
                                                ●
                                                ●
                                                ●               ●
                                                                ●   ●
                                                                    ●   ●       ●●●     ●
                                    ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●               ●
                                                                ●   ●
                                                                    ●   ●
                                                                        ●
                                    ●   ●   ●
                                            ●   ●               ●
                                                                ●   ●   ●   ●           ●
                                    ●
                                    ●   ●
                                        ●   ●
                                            ●
                                            ●   ●
                                                ●                   ●
                                    ●   ●   ●
                                            ●   ●                   ●   ●           ●
                                ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●                       ●   ●   ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●                       ●
                                                                    ●   ●   ●   ●
                                ●
                                ●   ●   ●   ●                           ●   ●
                                    ●   ●   ●                           ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●                               ●
                                                                            ●   ●● ●        ●
                                    ●
                                    ●   ●   ●                           ●   ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●                               ●   ●
                                ●
                                ●   ●   ●   ●
                                            ●                           ●         ● ●
                                ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●                           ●
                                ●   ●   ●
                                        ●
                                        ●   ●
                                            ●                                     ●
         10000                  ●
                                ●
                                    ●
                                    ●
                                    ●
                                        ●
                                        ●
                                        ●
                                            ●                                   ●
 price




                                ●   ●
                                    ●   ●
                                        ●
                                ●
                                ●   ●   ●                                           ●
                                ●   ●
                                    ●   ●
                                        ●
                                ●   ●
                                    ●   ●
                                        ●
                                    ●
                                    ●   ●
                                        ●                                           ●
                                    ●
                                    ●   ●
                                    ●
                                    ●
                                    ●
                                    ●




         5000




qplot(table, price, data = diamonds, geom 80 "boxplot",
               50       60         70     =         90
  group = round(table))      table
●   ●   ●
                                            ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●   ●
                                    ●
                                    ●   ●
                                        ●   ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●     ●
                            ●       ●   ●   ●
                                            ●   ●   ●   ●   ●   ●
                                                                ●   ●   ●
                              ●     ●   ●
                                        ●   ●   ●
                                                ●   ●   ●   ●   ●   ●
                                    ●   ●   ●       ●   ●   ●   ●
                            ● ●     ●
                                    ●   ●
                                        ●
                                        ●
                                            ●
                                            ●
                                                ●
                                                ●
                                                ●   ●
                                                    ●
                                                    ●
                                                        ●
                                                        ●
                                                        ●   ●
                                                            ●   ●
                                                                ●
                                                                ●
                                                                    ●
                                                                    ●   ● ●
                                    ●   ●   ●   ●
                                                ●   ●
                                                    ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●   ●
                              ●     ●   ●
                                        ●   ●
                                            ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●   ●   ●   ●   ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●
                                                                ●   ●   ●
                              ●     ●
                                    ●   ●
                                        ●   ●   ●
                                                ●   ●   ●
                                                        ●   ●
                                                            ●   ●   ●   ●
                                    ●
                                    ●   ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●
                                                                    ●   ● ●
                                    ●   ●   ●   ●   ●   ●   ●
                                                            ●   ●
                                                                ●   ●           ●
                               ●    ●   ●
                                        ●
                                        ●   ●
                                            ●   ●
                                                ●   ●   ●   ●   ●   ●     ●
                                                                          ●     ●
                                    ●   ●   ●
                                            ●   ●
                                                ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●   ●
                               ●
                               ●    ●   ●   ●   ●   ●   ●
                                                        ●   ●   ●   ●
                                                                    ●   ● ●
                               ●    ●   ●
                                        ●   ●
                                            ●   ●   ●
                                                    ●   ●
                                                        ●
                                                        ●   ●
                                                            ●   ●   ●   ●
                           ●   ●
                             ● ●    ●   ●   ●   ●
                                                ●   ●
                                                    ●   ●   ●
                                                            ●   ●   ●
                               ●    ●   ●   ●
                                            ●
                                            ●   ●
                                                ●   ●   ●   ●   ●   ●           ●
                               ●    ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●           ●●
                                                                                ●
                               ●    ●
                                    ●   ●   ●   ●   ●   ●   ●
                                                            ●   ●   ●
                               ●    ●   ●
                                        ●
                                        ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●   ●
                                                            ●   ●
                                                                ●   ●
                                                                    ●
                                    ●   ●   ●       ●
                                                    ●   ●   ●           ● ●
         15000                 ●
                               ●
                                    ●
                                    ●
                                    ●
                                        ●
                                        ●
                                        ●
                                        ●
                                            ●
                                            ●
                                            ●
                                                ●
                                                ●
                                                ●
                                                ●
                                                    ●
                                                    ●
                                                    ●
                                                        ●
                                                        ●
                                                        ●
                                                        ●
                                                            ●
                                                            ●
                                                            ●
                                                                ●
                                                                ●
                                                                ●
                                                                    ●
                                                                    ●
                                                                    ●   ●
                                                                        ●
                                                                          ●         ●
                                                                                    ●
                                    ●   ●
                                        ●   ●
                                            ●   ●   ●   ●   ●   ●
                                                                ●   ●   ●
                             ●      ●
                                    ●   ●
                                        ●   ●
                                            ●
                                            ●   ●
                                                ●   ●
                                                    ●   ●
                                                        ●       ●   ●
                                    ●
                                    ●   ●   ●   ●   ●   ●       ●
                                                                ●
                                                                ●   ●           ●
                                    ●
                                    ●   ●   ●   ●
                                                ●   ●           ●
                                    ●   ●
                                        ●
                                        ●   ●
                                            ●   ●
                                                ●   ●
                                                    ●           ●
                                                                ●       ●
                                ●   ●
                                    ●   ●   ●   ●   ●           ●   ●   ●
                                ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●               ●
                                                                ●   ●
                                                                    ●
                                                                    ●   ●       ●●●
                                ●
                                ●   ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●               ●
                                                                ●   ●           ●
                                ●   ●
                                    ●   ●   ●   ●
                                                ●               ●
                                                                ●   ●       ●
                                    ●
                                    ●   ●
                                        ●   ●   ●
                                                ●               ●
                                                                ●   ●   ●         ●
                                        ●   ●   ●               ●
                                                                ●   ●
                                                                    ●   ●
                                ●
                                    ●
                                    ●
                                    ●
                                        ●
                                        ●   ●
                                            ●
                                            ●
                                                ●
                                                ●
                                                ●               ●
                                                                ●   ●
                                                                    ●   ●       ●●●     ●
                                    ●
                                    ●   ●
                                        ●   ●
                                            ●   ●
                                                ●               ●
                                                                ●   ●
                                                                    ●   ●
                                                                        ●
                                    ●   ●   ●
                                            ●   ●               ●
                                                                ●   ●   ●   ●           ●
                                    ●
                                    ●   ●
                                        ●   ●
                                            ●
                                            ●   ●
[Figure: boxplots of price (y-axis) against table (x-axis), one box per rounded value of table]

One boxplot for each unique value of this aesthetic (group):

qplot(table, price, data = diamonds, geom = "boxplot",
  group = round(table))
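To see what the group aesthetic is doing, it can help to compute the grouping by hand. The following is a minimal sketch, assuming ggplot2 is installed (it supplies the diamonds data); the names `table_group`, `box_counts` and `p` are illustrative and do not come from the slides. Note that recent ggplot2 releases mark qplot() as deprecated, although it still works.

```r
library(ggplot2)  # provides qplot() and the diamonds data

# round(table) collapses the continuous table variable onto whole numbers.
# Every distinct rounded value becomes one group, and so one boxplot.
table_group <- round(diamonds$table)
box_counts <- table(table_group)

# How many boxes the plot will contain, and how many diamonds feed each box
n_boxes <- length(box_counts)

# The same plot as on the slide, stored instead of printed
p <- qplot(table, price, data = diamonds, geom = "boxplot",
  group = round(table))
```

Printing `p` draws the figure. Boxes built from only a handful of diamonds (see `box_counts`) should be read with caution.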
Scatterplots



Interpreting a scatterplot

                    • Global patterns
                    • Local patterns
                    • Deviations




Strong linear relationship. A number of outliers.




Unusual striations. Two groups? Little relationship between table and price?




Curved (exponential?) relationship. Outliers mostly cheaper than expected.
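One way to probe a curved relationship like this is to redraw it on log scales. The sketch below assumes the plot in question is price against carat; the slide image did not survive, so that pairing is a guess. With `log = "xy"` a power-law relationship shows up as a straight line, while `log = "y"` alone would straighten an exponential one. The names `p_log`, `cor_raw` and `cor_log` are illustrative.

```r
library(ggplot2)  # provides qplot() and the diamonds data

# Assumption: the curved plot is price against carat.
# log = "xy" asks qplot() to put both axes on a log10 scale.
p_log <- qplot(carat, price, data = diamonds, log = "xy")

# A rough numeric companion to the picture: linear correlation
# before and after taking logs of both variables.
cor_raw <- cor(diamonds$carat, diamonds$price)
cor_log <- cor(log(diamonds$carat), log(diamonds$price))
```

If `cor_log` is noticeably larger than `cor_raw`, the log-log view is closer to a straight line than the raw view. This is only a quick check and not a substitute for looking at the plot.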


But what’s the problem with all these plots?
In pairs, brainstorm solutions for 2 minutes.

qplot(carat, price, data = diamonds)
Idea               ggplot
Small points       shape = I(".")
Transparency       alpha = I(1/50)
Jittering          geom = "jitter"
Smooth curve       geom = "smooth"
2d bins            geom = "bin2d" or geom = "hex"
Density contours   geom = "density2d"
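Each idea in the table can be tried on the carat and price scatterplot from the previous slide. The following is a sketch, assuming ggplot2 is installed; the object names (`p_small`, `p_alpha`, and so on) are illustrative. `geom = "hex"` is left out because it needs the separate hexbin package.

```r
library(ggplot2)  # provides qplot() and the diamonds data

# Small points: draw each diamond as a single pixel
p_small <- qplot(carat, price, data = diamonds, shape = I("."))

# Transparency: 50 overlapping points are needed to reach full opacity
p_alpha <- qplot(carat, price, data = diamonds, alpha = I(1/50))

# Jittering: add a little random noise so tied points separate
p_jitter <- qplot(carat, price, data = diamonds, geom = "jitter")

# Smooth curve: summarise the trend instead of drawing every point
p_smooth <- qplot(carat, price, data = diamonds, geom = "smooth")

# 2d bins: count the points that fall in each rectangular bin
p_bin2d <- qplot(carat, price, data = diamonds, geom = "bin2d")

# Density contours: outline regions of similar point density
p_density <- qplot(carat, price, data = diamonds, geom = "density2d")
```

Nothing is drawn until a plot object is printed, for example by typing `p_alpha` at the console. The I() wrapper sets a fixed value for every point rather than mapping a variable to the aesthetic.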
Your turn

                    Practice doing these plots yourself.
                    Read the online documentation for each
                    plot type: http://had.co.nz/ggplot2




Homework

                    Practice your graphics/data exploration
                    skills with the diamonds or mpg data.
                    Due in one week.
                    Make sure to read the grading rubric, and
                    find a colour printer.



Asking questions

                    You have two minutes to write down as
                    many questions as you can come up with
                    that you might want to answer about the
                    diamonds data.
                    Write your best question on a piece of
                    paper and turn it in.


