- 1. A Survey of R Graphics June 18 2009 R Users Group of LA Michael E. Driscoll Principal, Dataspora [email_address] www.dataspora.com
- 2. “The sexy job in the next ten years will be statisticians…” - Hal Varian
- 4. (from Jessica Hagy’s thisisindexed.com) Hypothesis
- 5. Munge & Model: gdp <- read.csv('gdp.csv'); hours <- read.csv('hours.csv'); gdp.hours <- merge(hours, gdp); gdp.hours$freetime <- 4380 - gdp.hours$hours; attach(gdp.hours); plot(freetime ~ gdp); m <- lm(freetime ~ gdp, data=gdp.hours); abline(m, col=3, lwd=2); pm <- loess(freetime ~ gdp); lines(spline(gdp, fitted(pm)))
- 6. Visualization: library(ggplot2); qplot(gdp, freetime, data=gdp.hours, geom=c("point", "smooth"), span=1)
- 7. basic graphics
- 8. R’s Two Graphics Systems
- 9. plot() graphs objects: plot(freetime ~ gdp, data=gdp.hours); model <- lm(freetime ~ gdp, data=gdp.hours); abline(model)
- 10. plot() graphs objects: abline(model, col="red", lwd=3)
- 11. par sets graphical parameters: par(pch=20, cex=5, col="#5050a0BB") (RGB hex with alpha blending!); help(par); plot(freetime ~ gdp, data=gdp.hours)
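The eight-digit hex colour on this slide can be unpacked with base R. The following is a minimal sketch, assuming only the `grDevices` and `graphics` packages that ship with R; the scatter data are simulated purely for illustration and are not the GDP data from the talk.

```r
# Build the slide's semi-transparent colour from its components.
# Hex 50 = 80, A0 = 160, BB = 187 (alpha, roughly 73% opaque).
translucent <- rgb(80, 80, 160, alpha = 187, maxColorValue = 255)

# Decompose the colour again to inspect its red/green/blue/alpha channels.
channels <- col2rgb(translucent, alpha = TRUE)

# Simulated points (illustration only): overlapping points look darker
# because each one lets part of the colour underneath show through.
set.seed(1)
x <- rnorm(500)
y <- x + rnorm(500)
plot(x, y, pch = 20, cex = 2, col = translucent)
```

The last two hex digits are the alpha channel, so lowering them makes dense regions stand out more clearly against sparse ones.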
- 12. par sets graphical parameters. Parameters for par(): pch, col, adj, srt, pt.cex. Graphing functions: points(), text(), xlab(), legend()
- 13. Paneling Graphics: By setting one parameter in particular, mfrow, we can partition the graphics display into a multiple-frame layout in which to panel our plots, row-wise: par(mfrow = c(nrow, ncol)), where nrow is the number of rows and ncol is the number of columns.
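A self-contained sketch of the mfrow idea, using R's built-in `mtcars` data set rather than the data from the talk (that substitution is an assumption made so the example runs anywhere):

```r
# Split the device into 1 row and 2 columns; par() returns the old settings.
old_par <- par(mfrow = c(1, 2))

# Panels are filled row by row, left to right.
hist(mtcars$mpg, main = "Histogram", xlab = "Miles per gallon")
plot(mpg ~ wt, data = mtcars, main = "Scatterplot",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# Record the active layout, then restore the previous one.
layout_now <- par("mfrow")
par(old_par)
```

Saving the value returned by `par()` and restoring it afterwards keeps later plots from unexpectedly landing in a leftover panel.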
- 14. Paneling Graphics: par(mfrow=c(2,2)); hist(D$wg, main='Histogram', xlab='Weight Gain', ylab='Frequency', col=heat.colors(14)); boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg, wg.11a$wg, wg.12p$wg, main='Weight Gain', ylab='Weight Gain (lbs)', xlab='Shift', names=c('7am','8am','9am','10am','11am','12pm')); plot(D$metmin, D$wg, main='Met Minutes vs. Weight Gain', xlab='Mets (min)', ylab='Weight Gain (lbs)', pch=2); plot(t1, D2$Intel, type="l", main='Closing Stock Prices', xlab='Time', ylab='Price $'); lines(t1, D2$DELL, lty=2)
- 15. Paneling Graphics
- 16. Working with Graphics Devices. Starting up a new graphics X11 window: x11(). To write graphics to a file, open a device, write to it, and close it: pdf("mygraphic.pdf", width=7, height=7); plot(x); dev.off(). On Linux, the package "Cairo" is recommended for a device that renders high-quality vector and raster images (alpha blending!). The command would read Cairo("mygraphic.pdf", … Common gotcha: in non-interactive sessions, lattice and ggplot2 functions return a plot object that is drawn only when printed, so you should explicitly invoke print() to send it to an open device. For example: print(xyplot(y ~ x)). Base plot() calls draw directly and need no print().
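The open, draw, close cycle and the print gotcha can be sketched as follows. This is an illustration under assumptions: it writes to a temporary directory instead of the working directory, uses the built-in `mtcars` data, and relies on the lattice package being installed (it is a recommended package that normally ships with R).

```r
library(lattice)

# 1. Base graphics: open a PDF device, draw, close.
base_file <- file.path(tempdir(), "mygraphic.pdf")
pdf(base_file, width = 7, height = 7)
plot(mtcars$wt, mtcars$mpg)   # base plot() draws straight onto the device
dev.off()

# 2. Lattice: xyplot() only builds a "trellis" object.
p <- xyplot(mpg ~ wt, data = mtcars)

lattice_file <- file.path(tempdir(), "mylattice.pdf")
pdf(lattice_file, width = 7, height = 7)
print(p)                      # explicit print() is what renders the object
dev.off()
```

At the interactive prompt the object is auto-printed, which is why the missing `print()` usually shows up only inside scripts, functions, or loops.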
- 17. library(ggplot2)
- 18. ggplot2 = grammar of graphics
- 19. ggplot2 = grammar of graphics
- 20. Visualizing 50,000 Diamonds with ggplot2
- 21. qplot (carat, price, data = diamonds)
- 22. qplot(log(carat), log(price), data = diamonds) OR qplot(carat, price, log="xy", data = diamonds)
- 23. qplot(log(carat), log(price), data = diamonds, alpha = I(1/20))
- 24. qplot(log(carat), log(price), data = diamonds, alpha = I(1/20), colour=color)
- 25. Achieving small multiples with “facets” qplot(log(carat), log(price), data = diamonds, alpha=I(1/20)) + facet_grid(. ~ color)
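For readers on recent ggplot2 releases, where `qplot()` is deprecated, the faceted plot can also be written with the full `ggplot()` interface. This is a sketch of an equivalent call, not the code from the talk; it assumes the ggplot2 package is installed and uses the `diamonds` data set bundled with it.

```r
library(ggplot2)

# Same mapping as the qplot() call: log-log scatter with 1/20 opacity,
# split into one panel per diamond colour grade (columns only).
p <- ggplot(diamonds, aes(x = log(carat), y = log(price))) +
  geom_point(alpha = 1 / 20) +
  facet_grid(. ~ color)

# Nothing is drawn until the object is printed.
print(p)
```

Setting `alpha` inside `geom_point()` rather than inside `aes()` plays the role of `I(1/20)` in `qplot()`: it is a fixed setting, not a mapped variable.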
- 26. qplot(color, price/carat, data = diamonds, alpha = I(1/20), geom="jitter") qplot(color, price/carat, data = diamonds, geom="boxplot") old new
- 28. library(lattice)
- 29. lattice = trellis (source: http://lmdvr.r-forge.r-project.org)
- 30. visualizing six dimensions of MLB pitches with lattice
- 31. xyplot(x ~ y, data=pitch)
- 32. xyplot(x ~ y, groups=type, data=pitch)
- 33. xyplot(x ~ y | type, data=pitch)
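The difference between `groups=` and the `|` conditioning operator can be tried without the pitch data. The sketch below substitutes R's built-in `iris` data set (an assumption for illustration) and needs only the lattice package.

```r
library(lattice)

# groups=: one panel, species distinguished by colour/symbol.
grouped <- xyplot(Sepal.Length ~ Sepal.Width, groups = Species,
                  data = iris, auto.key = TRUE)

# y ~ x | g: one panel per species, sharing axes for easy comparison.
conditioned <- xyplot(Sepal.Length ~ Sepal.Width | Species, data = iris)

# Both calls return trellis objects; print() draws them.
print(grouped)
print(conditioned)
```

Grouping is useful for comparing overlap within one frame, while conditioning gives small multiples that stay readable when the groups overplot each other.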
- 34. xyplot(x ~ y | type, data=pitch, fill.color = pitch$color, panel = function(x, y, fill.color, ..., subscripts) { fill <- fill.color[subscripts]; panel.xyplot(x, y, fill=fill, ...) })
- 35. xyplot(x ~ y | type, data=pitch, fill.color = pitch$color, panel = function(x, y, fill.color, ..., subscripts) { fill <- fill.color[subscripts]; panel.xyplot(x, y, fill=fill, ...) })
- 36. A Story of Two Pitchers Hamels Webb
- 37. list of lattice functions densityplot(~ speed | type, data=pitch)
- 38. plotting big data
- 39. xyplot with 1m points = Bad Idea Jeans xyplot(log(price)~log(carat),data=diamonds)
- 40. efficient plotting with hexbinplot hexbinplot(log(price)~log(carat),data=diamonds,xbins=40)
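Why binning helps can be shown with base R alone. The sketch below uses rectangular bins via `cut()` and `table()` rather than the hexagonal bins of `hexbinplot()`, and the one million points are simulated for illustration; it demonstrates the idea, not the hexbin package itself.

```r
# Simulate one million correlated points (illustration only).
set.seed(42)
n <- 1e6
x <- rnorm(n)
y <- x + rnorm(n)

# Assign every point to one of 40 x 40 rectangular cells.
bins <- 40
x_bin <- cut(x, breaks = bins)
y_bin <- cut(y, breaks = bins)

# Count points per cell: the plot now needs 1,600 values, not 1,000,000.
counts <- table(x_bin, y_bin)

# Draw the counts as a heat map; log1p() keeps sparse cells visible.
image(log1p(unclass(counts)), xlab = "x (binned)", ylab = "y (binned)")
```

The drawing cost now depends on the number of cells instead of the number of observations, which is the same trade that `hexbinplot(..., xbins=40)` makes with hexagons.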
- 41. 100 thousand gene measures
- 42. efficient plotting with geneplotter
- 43. beautiful colors with colorspace: library("colorspace"); red <- LAB(50,64,64); blue <- LAB(50,-48,-48); mixcolor(10, red, blue)
- 44. R --> web
- 45. Linux Apache MySQL R http://labs.dataspora.com/gameday
- 48. Configuring rapache. Hello world script: setContentType("text/html"); png("/var/www/hello.png"); plot(sample(100,100), col=1:8, pch=19); dev.off(); cat("<html>"); cat("<body>"); cat("<h1>hello world</h1>"); cat('<img src="../hello.png">'); cat("</body>"); cat("</html>")
- 49. Data Visualization References: "ggplot2: Elegant Graphics for Data Analysis" by Hadley Wickham, http://had.co.nz/ggplot2; "Lattice: Multivariate Data Visualization with R" by Deepayan Sarkar, http://lmdvr.r-forge.r-project.org/
- 50. Contact Us Michael E. Driscoll, Ph.D. Principal [email_address] www.dataspora.com