Moving Data to and From R
1. Advanced Data Analytics:
Moving Data Around
Jeffrey Stanton
School of Information Studies
Syracuse University
2. R and the File System
• R maintains a current working directory to simplify the
process of reading and saving files
getwd() # shows the pathname of current folder
setwd("pathname") # sets a new working directory
history() # shows most recent commands
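# Creates a CSV file using data from a dataframe
# (row.names=FALSE keeps R's row labels out of the file)
write.table(dataFr, sep=",", file="filename.csv", row.names=FALSE)
# Reads a CSV file into a dataframe, using the first row as column names
targetFrame = read.table("filename.csv", sep=",", header=TRUE)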
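The same round trip can also be written with the write.csv and read.csv convenience wrappers. The following is a minimal sketch, not the author's code: the contents of dataFr are made up for illustration, and the file goes to a temporary folder (an assumption) so the working directory is left alone.

```r
# Hypothetical example data; the column names are illustrative only
dataFr <- data.frame(Subject = 1:6, Code = c(1, 0, 1, 0, 1, 0))

# Build a path in R's temporary folder instead of the working directory
csvPath <- file.path(tempdir(), "filename.csv")

# row.names = FALSE keeps R's row labels out of the file,
# so the header row lines up with the data columns in Excel
write.csv(dataFr, file = csvPath, row.names = FALSE)

# header = TRUE is already the default for read.csv; shown for clarity
targetFrame <- read.csv(csvPath, header = TRUE)

# Inspect the structure of what came back
str(targetFrame)
```

Opening the resulting file in Excel is a quick way to confirm that the column headings sit above the right columns.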
3. R and the Windows Clipboard
• For small chunks of data, it may be
convenient to “cut and paste”
• Create a small rectangle of data in
Excel and copy it to the clipboard
• Then, in R:
> read.DIF("clipboard",transpose=TRUE)
V1 V2
1 1 1
2 2 0
3 3 1
4 4 0
5 5 1
6 6 0
4. Include Variable Names
• You can pull in the variable names (the
column headings) as well
• Then, in R:
> read.DIF("clipboard",transpose=TRUE,header=TRUE)
Subject Code
1 1 1
2 2 0
3 3 1
4 4 0
5 5 1
6 6 0
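Excel also places a tab-delimited text version of the copied cells on the clipboard, so read.table is another possible route. A minimal sketch, assuming that format: the clipText string below is a stand-in for the clipboard contents so the example runs on any platform. On Windows, the first argument could be "clipboard" instead of using text=.

```r
# Tab-delimited text standing in for what Excel copies to the clipboard
# (values mirror the Subject/Code example above)
clipText <- "Subject\tCode\n1\t1\n2\t0\n3\t1\n4\t0\n5\t1\n6\t0"

# text = lets read.table parse a string as if it were a file;
# on Windows, read.table("clipboard", sep = "\t", header = TRUE)
# would read the real clipboard instead
pasted <- read.table(text = clipText, sep = "\t", header = TRUE)

print(pasted)
```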
6. An Explanation of Data Frames
• The basic unit of data in R is a “vector”: an ordered sequence of “scalar” values, all
of the same mode
– Scalar just means a single element or value, like the number 5
– R vectors can have any number of elements, including just one
element; so a scalar is stored as a vector of length one
– The mode of a vector can be numeric, character, or logical
• Just like Excel spreadsheets and other data programs like SPSS, vectors
in R can be two dimensional, with a certain number of columns and a
certain number of rows; a two dimensional vector is called a matrix
• But, being a vector, a matrix has to contain elements all of the same
mode, so a matrix cannot always hold a typical spreadsheet or data set,
because these often have different types in each column
• This is where the data frame comes in: A data frame is a list of vectors,
all of the same length, each of which can be a different type
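The distinctions above can be checked at the R prompt. This is a small sketch with made-up values; the variable names are illustrative only.

```r
# A vector: every element shares one mode
subject <- c(1, 2, 3)
mode(subject)            # "numeric"

# A scalar is just a vector of length one
five <- 5
length(five)             # 1

# A matrix: a two-dimensional vector, still a single mode
m <- matrix(1:6, nrow = 3, ncol = 2)
dim(m)                   # 3 2

# Mixing modes in one vector coerces every element to a common mode
mixed <- c(1, "a", TRUE)
mode(mixed)              # "character"

# A data frame: a list of equal-length vectors, each with its own mode
df <- data.frame(
  subject = subject,
  name    = c("Ann", "Bob", "Cy"),
  passed  = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
sapply(df, mode)         # mode of each column
is.list(df)              # TRUE
```

The coercion in mixed is the reason a matrix cannot hold a typical spreadsheet: one text column would turn every number into a character string.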
7. read.DIF also works with files
> setwd("C:/DataMining/DataFiles")
> newDF = read.DIF("excelExport.dif", transpose=TRUE, header=TRUE)
> class(newDF)
[1] "data.frame"
> attach(newDF)
# Note that Excel, DIF, and R
# don’t always agree on data
# formats. For example, currency
# in Excel will not export to
# integer values in R, so remove
# as much formatting as possible.
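If currency formatting does slip through, the values typically arrive in R as text. One possible cleanup is sketched below, assuming US-style dollar signs and comma thousands separators; the price vector is hypothetical.

```r
# Hypothetical column as it might arrive with currency formatting intact
price <- c("$1,200.50", "$35.00", "$7")

# Strip dollar signs and thousands separators, then convert to numeric
priceNum <- as.numeric(gsub("[$,]", "", price))

print(priceNum)
is.numeric(priceNum)
```

Removing the formatting in Excel before export remains the simpler option; this only helps when the file cannot be re-exported.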
8. Demonstrating Mastery
• Create or find data in an Excel spreadsheet and export as a
CSV file
• Import data into R from a CSV or TXT file
• Export a data frame into a CSV file
• Read the CSV file into Excel
• Advanced: Use data interchange format (“DIF”) to
exchange files between R and Excel
• Advanced: Use a data frame in R to store data obtained from
a spreadsheet