1. Diane Talley
University of North Carolina, Chapel Hill
R Programming for Psychometrics
Presented to Alpine Testing Solutions August 2013
2. Define R/perceptions of R
R in psychometrics
What’s great and not so
great about R
Legal defensibility and R
Learning R
The R environment
A few tips for beginners
3. What is R?
Implementation of the S statistical programming
language (Bell Labs: Chambers, Wilks, Becker)
Developed at University of Auckland by professors
Robert Gentleman and Ross Ihaka
http://www.r-project.org/contributors.html - lists all
R contributors
An object-oriented language… sort of
4. The R Community: Perceptions from without and
within
R is for hippies! …(Quote from
a SAS user)
…or perhaps nerds with a quirky
sense of humor using words such as
Cran-tastic, Cranberries, and useRs
6. But seriously, what is R?
It has an academic following and is used for data analytics
across industries (e.g., pharma, biostatistics).
The commercialized side of R: Data Analytics
Revolution Analytics
Enterprise software
Possibly the SAS Enterprise version of R
7. How is R being used in Psychometrics
I’m using the term psychometrics in reference to the field
of testing (educational, credentialing, and psychology)
Research
See references
Mostly using simulated data and related to the use of a particular R
package (e.g., eRm for Rasch modeling)
Test delivery
The Psychometric Centre at Cambridge University
Concerto
http://www.psychometrics.cam.ac.uk/page/338/concerto-testing-platform.htm
What about use in practice? I have not yet found much evidence that R is
being used to construct examinations for high-stakes testing purposes.
8. Benefits
It’s free!
Runs on multiple platforms (Windows, Unix,
MacOS)
Validation/replication of analyses (assumes
commented code and documentation)
Long term efficiency (using the same code for
multiple projects)
10. Drawbacks
Perceptions (as they pertain to using R for high
stakes testing purposes)
Open source could be a problem for use with high
stakes testing projects…maybe
Challenging to learn (some say R is one of the
hardest programming languages to learn)
http://www.statmethods.net/about/learningcurve.html
http://datakeyword.blogspot.com/2012/10/analysis-tools-comparison-r-language.html
11. R is free software and
comes with
ABSOLUTELY NO
WARRANTY.
What does that mean for
use in psychometric
practice? Or for any
practice for that matter.
12. R and Legal Defensibility
Is the open source nature of R an issue for legal
defensibility?
14. Books
For a comprehensive list go to http://www.r-project.org/doc/bib/R-books.html
Field, A., Miles, J., & Field, Z. (2012). Discovering statistics
using R. London: Sage Publications Ltd.
This is great for learning how to use R in the context of
statistical tests, unless you are sensitive to Dr. Field’s non-pc
sense of humor.
Pace, L. (2012). Beginning R: An introduction to statistical
programming. New York: Apress.
These two are great reference books to have on the shelf:
Teetor, P. (2011). R cookbook. Sebastopol, CA: O'Reilly.
Chang, W. (2013). R graphics cookbook. Sebastopol, CA: O'Reilly.
15. Free Online R Tutorials
http://www.statmethods.net/
Quick R – This was one of my favorites for getting started.
https://www.coursera.org/course/compdata
There’s a course starting in September taught by a professor at
Johns Hopkins University
http://ww2.coastal.edu/kingw/statistics/R-tutorials/
http://tryr.codeschool.com/
Beware the pirate humor!
http://r-statistics.net/r-tutorial.html
http://www.personality-project.org/r/book/
http://www.computerworld.com/s/article/9239625/Beginner_s_guide_to_R_Introduction
http://decisionstats.com/2013/07/18/datamind-a-new-effort-to-teach-r-online-for-free-rstats/?goback=.gde_77616_member_259229553
Heavily focused on data analytics in R
16. R Training for a Price
http://georgia-r-school.org/?goback=.gmr_77616.gde_77616_member_201820973
• Online only
http://www.revolutionanalytics.com/
• Instructor led training and online (through stats.com)
• Path available that leads to a credential
17. User Groups and Blogs That I Like
LinkedIn R Project for Statistical Computing
Most friendly to new users who are asking basic questions.
http://www.r-bloggers.com
http://www.foastat.org/
http://planetr.stderr.org/
http://stackoverflow.com/
This is the best I’ve found for technical questions
18. Associations
FOAS – Foundation for Open Access Statistics
Promoting methodology and software that allows truly
reproducible research
Free online journal
http://eeecon.uibk.ac.at/psychoco/
Psychometric computing
19. Conventions and best practices
No official best practices
Google’s R Style Guide is helpful
When in doubt, use rseek.org (this is Google with an R
filter – hugely helpful!)
21. Installing R
http://cran.r-project.org/
Technical docs - http://developer.r-project.org/
Latest release 3.0.1
Mirrors - R isn’t housed in a single location, but across the globe at
mirror sites. Pick the one nearest you.
Task View
This is an amazing reference. Packages are organized by purpose (e.g.,
Social Sciences, Psychometrics, Graphics).
Updates
You can install a new version without uninstalling the old version. I haven’t
found an answer to the question of whether you should do this.
Internet based R, if you prefer
http://roncloud.com –
22. Installing Packages
Base packages
Psych packages
http://cran.r-project.org/web/views/
Install once, call each time you need to use it
library()
require()
Or, if you are using an IDE such as RStudio, it’s as simple as
checking a box
Masking - Learn what it is and pay attention to it!
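A minimal sketch of the install-once, load-each-session pattern and of what masking looks like. The package name psych comes from the package list in this deck. The made-up package name and the toy sd function are assumptions used only for illustration.

```r
# Install once per machine (needs an internet connection), so it is
# left commented out here:
# install.packages("psych")

# Load in every session where you need it. A base package is used
# here so the sketch runs on any R installation.
library(stats)

# library() stops with an error when a package is missing;
# require() returns FALSE instead of stopping.
found <- suppressWarnings(
  require("notARealPackage123", character.only = TRUE, quietly = TRUE)
)
found

# Masking: an object defined later hides one with the same name
# that sits further down the search path.
sd <- function(x) "masked!"   # toy function, hides stats::sd
sd(c(1, 2, 3))                # calls the toy version
stats::sd(c(1, 2, 3))         # pkg::name always reaches the original
rm(sd)                        # remove the toy version again
sd(c(1, 2, 3))                # stats::sd is visible once more
```

When two loaded packages export the same function name, R prints a "masked from" message at load time. Writing package::function() removes the ambiguity.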
23. Programming Environments
Basic R
Some free IDEs
RStudio (My personal preference) -
http://www.rstudio.com/ide/docs/using/source
Architect - http://www.openanalytics.eu/downloads
RevolutionR -
http://www.revolutionanalytics.com/downloads/
Revolution Analytics has 3 versions, a “community” version that is
free and two that are not free
24. Some Psychometric Packages to Start With
CTT
psych
psychometric
ggplot2
equate
plyr
eRm
mirt
sem
25. Getting Help
R manual
library(help="stats")
Package documentation and vignettes
rseek.org
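The help tools listed above can be tried from the console. This is a small sketch using only functions that ship with R. The search pattern "Means$" is an arbitrary example.

```r
# These calls open help pages or listings, so they are only run
# in an interactive session.
if (interactive()) {
  help("mean")               # same as typing ?mean
  library(help = "stats")    # index of everything in the stats package
  vignette()                 # list vignettes of installed packages
}

# Search installed help pages when you do not know the function name.
# help.search("reliability")   # same as ??reliability; can be slow

# Find objects on the search path whose names match a pattern.
matches <- apropos("Means$")
matches
```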
26. A note about graphs and tables
Graphs
graphics is part of basic R
ggplot2 is recommended by many users and books
Tables: It’s not pretty if you are doing this using
psych packages!
I’m sure there’s a way to make an APA table, but I haven’t
found it yet.
Applications that allow you to create reports (pdf,
word, LaTeX)
Sweave
knitr
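As a rough illustration of the graph and table points, here is a base R sketch with no extra packages. The scores are invented for the example. ggplot2, Sweave, and knitr would be the next step for polished output.

```r
# Hypothetical raw scores for eight examinees.
rawScores <- c(26, 42, 36, 49, 31, 44, 38, 40)

# Base graphics: a histogram of the score distribution.
# hist() also returns the bin counts, which are stored here.
scoreHist <- hist(rawScores,
                  main = "Raw score distribution",
                  xlab = "Raw score")

# A plain summary table built as a data frame.
scoreTable <- data.frame(
  statistic = c("N", "Mean", "SD", "Min", "Max"),
  value = round(c(length(rawScores), mean(rawScores), sd(rawScores),
                  min(rawScores), max(rawScores)), 2)
)
print(scoreTable, row.names = FALSE)
```

The table prints as plain text. Formatting it to APA style would still need a reporting tool.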
28. Objects and Functions
object <- function (formal arguments)
Example 1
rawScores <- c(26, 42, 36, 49)
Example 2
mean(rawScores)
Is this all you need to know?
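The two examples can be extended slightly to show the same pattern with a user-written function. This is a sketch only: percentCorrect and the maximum score of 50 are hypothetical and are not part of the original slides.

```r
# Example 1 from the slide: c() combines values into a vector,
# and <- assigns that vector to the object rawScores.
rawScores <- c(26, 42, 36, 49)

# Example 2: pass the object to a function.
mean(rawScores)   # 38.25

# The result of a function can itself be stored as an object.
scoreSd <- sd(rawScores)

# Writing your own function follows the same pattern.
# maxScore = 50 is a made-up test length for illustration.
percentCorrect <- function(scores, maxScore) {
  100 * scores / maxScore
}
pct <- percentCorrect(rawScores, maxScore = 50)
pct               # 52 84 72 98
```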
29. Functions, Classes, and Object-Oriented
Programming
http://developer.r-project.org/
Chambers, J. (2006) How S4 methods work. Retrieved from
http://developer.r-project.org/howMethodsWork.pdf August 1,
2013
This is a good description of how classes, methods, objects, and
functions work. It also explains how R is different from other OO
languages such as Java and C++.
Venables, W. N. (2009). An introduction to R. United Kingdom:
Network Theory Limited. (Also available online at
http://www.cran.r-project.org/doc/manuals/R-intro.pdf)
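To make the idea of classes and methods concrete, here is a small sketch using R's simpler S3 system, not the S4 system that Chambers describes. The class name examResult and its fields are invented for illustration.

```r
# Every object has a class, and generic functions such as print()
# or summary() choose a method based on that class.
class(c(26, 42, 36, 49))          # "numeric"
class(data.frame(x = 1:3))        # "data.frame"

# Create an object with a made-up S3 class "examResult".
result <- structure(
  list(candidate = "A001", rawScore = 42, cutScore = 40),
  class = "examResult"
)

# Define a print method for that class; print() will now find it.
print.examResult <- function(x, ...) {
  status <- if (x$rawScore >= x$cutScore) "PASS" else "FAIL"
  cat("Candidate", x$candidate, "scored", x$rawScore, "-", status, "\n")
  invisible(x)
}

print(result)   # dispatches to print.examResult
```

Unlike Java or C++, the method belongs to the generic function (print) rather than to the object, which is part of why R is object-oriented "sort of".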
30. A few things to note…
Naming objects (e.g., rawScores, raw.scores)
Keep them simple
Start with letter, not a number
Don’t use underscores or spaces
Don’t begin with caps
Don’t replicate (we’ll come back to this)
R users don’t like loops. Use the apply() family in base R or the
functions in the plyr package, although some disagree
attach()
Popular opinion is NOT to use attach(). This gets to the problem
of masking: if you attach two objects that contain the same name,
which one is being called by your program?
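A short sketch of the loop, apply(), and attach() points, using base R only. The response data are invented for the example.

```r
# Hypothetical scored responses: 4 examinees x 3 items (1 = correct).
responses <- data.frame(
  item1 = c(1, 0, 1, 1),
  item2 = c(1, 1, 0, 1),
  item3 = c(0, 0, 1, 1)
)

# Loop version: total score per examinee.
totalLoop <- numeric(nrow(responses))
for (i in seq_len(nrow(responses))) {
  totalLoop[i] <- sum(responses[i, ])
}

# Vectorized versions of the same calculation.
totalApply   <- apply(responses, 1, sum)   # 1 = apply over rows
totalRowSums <- rowSums(responses)

# Instead of attach(responses), name the data frame explicitly ...
mean(responses$item1)
# ... or use with(), which looks inside the data frame only for
# the duration of one expression.
with(responses, mean(item1))
```

All three totals agree. The vectorized forms are shorter and leave less room for indexing mistakes.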
31. Psychometrics in R – Related
Research
Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R
environment. Journal of Statistical Software, 48(6), 1-29.
Debelak, R., & Tran, U. S. (2013). Principal component analysis of smoothed tetrachoric
correlation matrices as a measure of dimensionality. Educational and Psychological
Measurement, 73(1), 63-77.
De Leeuw, J., & Mair, P. (2007). An introduction to the special volume of “psychometrics
in R.” Journal of Statistical Software, 20(1), 1-5.
Epskamp, S., Cramer, A. O. J., Waldrop, L. J., Schmittmann, V. D., & Borsboom, D. (2012).
qgraph: Network visualizations of relationships in psychometric data. Journal of
Statistical Software, 48(4), 1-18.
32. More References
Fox, J. P., Entink, R. K., & van der Linden, W. (2007). Modeling of responses and
response times with the package cirt. Journal of Statistical Software, 20(7), 1-14.
Frick, H., Strobl, C., Leisch, F., & Zeileis, A. (2012). Flexible Rasch mixture models
with package psychomix. Journal of Statistical Software, 48(7), 1-25.
Hatzinger, R., & Dittrich, R. (2012). prefmod: An R package for modeling
preferences based on paired comparisons, rankings, or ratings. Journal of
Statistical Software, 48(10), 1-31.
Mair, P., & Hatzinger, R. (2007). Extended Rasch modeling: The eRm package for
the application of IRT models in R. Journal of Statistical Software, 20(9), 1-20.
33. More References
Monecke, A., & Leisch, F. (2012). semPLS: Structural equation modeling using
partial least squares. Journal of Statistical Software, 48(3), 1-32.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling.
Journal of Statistical Software, 48(2), 1-36.
Verhelst, N. D., Hatzinger, R., & Mair, P. (2007). The Rasch sampler. Journal of
Statistical Software, 20(4), 1-14.
Weeks, J. P. (2010). plink: An R package for linking mixed-format tests using
IRT-based methods. Journal of Statistical Software, 35(12), 1-33.
Wickelmaier, F., Strobl, C., & Zeileis, A. (2012). Psychoco: Psychometric
computing in R. Journal of Statistical Software, 48(1), 1-5.
34. One Final Comment
R really is a community of users in support of a
common cause. I have found a great deal of passion
and dedication in its users and a strong desire to help
others in the common pursuit of good research. Ask
questions, but be respectful of the community and
research your problems/questions before you ask
them.
Editor's Notes
Graphic from FOAS – Foundation for Open Access Statistics – Journal of Statistical Software. Philosophy of the organization is to promote reproducible, independent research and free access to research. All research can be replicated using the same software.
The Comprehensive R Archive Network
Research – Typically related to a particular R package and studies conducted using simulated data. When I asked on the R user group (LinkedIn), responses were basically about the existence of packages and that it can be done. The question of whether it should be done has not been answered.
This may be one place we can discuss reusability of code.
One experienced professional programmer said he knew about a dozen other languages and this was the hardest to learn. Not so much harder than the others, but unconventional.
Blog review of R in comparison with other software including SPSS and SAS: R scores low on UI/usability and requires high technical knowledge and programming ability. Compared with SAS, similar reviews, slightly higher on UI.
Discussion points here:
Driving force behind this project
Should we be using R in high-stakes situations?
What problems are you trying to solve, and will R solve those problems?
Would using a commercial version of R resolve concerns?
R tutorials – written by R enthusiasts
statmethods.net – “Quick-R”
Coursera – Roger Peng from Johns Hopkins School of Public Health – 4-week course beginning September 2013
http://ww2.coastal.edu/kingw/statistics/R-tutorials/ – written by a professor at Coastal Carolina University
Code School – O’Reilly
There are more free tutorials listed on the handout
Also mention YouTube
foastat – Foundation for Open Access Statistics – Journal of Statistical Software
Mention recent posting on equating with catr – http://www.r-bloggers.com/item-equating-with-same-group-sat-act-example/?utm_source=feedburner&utm_medium=email&utm_campaign=Feed%3A+RBloggers+%28R+bloggers%29
I’ve provided some of the related package documentation for you in the folder I shared.
graphics is the base R graphics package
ggplot2 is the popular choice – Tufte
There are classes and methods, but not necessary in basic analysis.
What are classes and methods and how are they relevant to programming in R?
Exam analysis example
Opening a file and creating a data frame (use formA)
Reviewing the data
Click on the data frame in the workspace to see the table in a tab and show how to edit the data
Commenting using #
Remember – case sensitive
str()
describe()
summary()
What to do with missing data (NA)
Create a new variable – raw scores
colMeans and rowMeans and sums
CTT Rasch parameters
Reliability (alpha – naming issue)
Graphs using ggplot2
Address the table issue
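The first steps of the exam analysis outline might look roughly like the following sketch. It uses base R only, and the data, the item names, and the rule of scoring omitted items as incorrect are all assumptions made for illustration. describe() from the psych package, the Rasch step, and the ggplot2 graphs are left out so the sketch runs without extra packages.

```r
# Hypothetical scored responses for form A: 6 examinees x 4 items.
# In practice the data would be read from a file with read.csv().
formA <- data.frame(
  item1 = c(1, 1, 0, 1, 1, 0),
  item2 = c(1, 0, 0, 1, 1, 0),
  item3 = c(1, 1, 0, 1, NA, 0),
  item4 = c(0, 1, 0, 1, 1, 0)
)

# Review the data. R is case sensitive: formA and forma differ.
str(formA)
summary(formA)

# Missing data: count the NAs, then score omitted items as incorrect
# (one possible policy, assumed here for illustration).
nMissing <- sum(is.na(formA))
formA[is.na(formA)] <- 0

# New variable: raw score per examinee.
rawScore <- rowSums(formA)

# Classical item difficulty (proportion correct) per item.
pValues <- colMeans(formA)

# Cronbach's alpha computed from its definition. The object is not
# called "alpha" because several packages export a function with
# that name, which invites masking.
k <- ncol(formA)
itemVariances <- apply(formA, 2, var)
cronbachAlpha <- (k / (k - 1)) * (1 - sum(itemVariances) / var(rawScore))
cronbachAlpha
```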