Intro to using R for Bioinformatics, Part 1: The Basics. Angel Pizarro, angel@upenn.edu
Injecting a bit of reality
Taking it a bit further… Waxing floors is not fun, and may not seem relevant, but have some faith Daniel-san
Outline: We will teach you some basic uses of R. We use a “Do & Tell” method: you will be asked to do an exercise and, once done, we will explain what just happened. We will cover basics, plotting, and microarray analysis. We will not teach you statistics.
What is R? R is a language and environment for statistical computing and graphics (http://www.r-project.org). You can do stuff like this.
Install & Run R: You should have already installed R; if you had trouble, please see us after class. To start R: on Windows, use Tinn-R; on Mac, use the source R application; on Linux, use the console.
Help is plentiful: Help in three ways. Too much! Get me out!
More Help: help.start() starts an HTML help session. help(mean) looks up the mean() function's help page; ?mean is the shorthand. help.search("mean") displays all help pages that contain the text “mean”; ??mean is the shorthand.
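The help calls listed on this slide can be tried as in the minimal sketch below. The variable names `page` and `hits` are illustrative, not from the slides, and help.start() is left out because it opens a browser window. Note that help.search() expects a quoted string.

```r
# Exact lookup: help("mean") and ?mean refer to the same help page.
page <- help("mean")

# Full-text search: the pattern must be a quoted string.
# ??mean is shorthand for help.search("mean").
hits <- help.search("mean")

# Printing either object displays it; here we only inspect the classes.
print(class(page))
print(class(hits))
```

In an interactive session, typing `page` or `hits` on its own line displays the help page or the list of matching topics.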
Whet your appetite…
The Basics: Please enter each of the following lines into your R session:
Basic Algebra: You will also see this form:
Variables: “x” and “y” are variables. They are pointers to some value. They can also be pointers to some function.
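The commands shown on the original slides are not preserved in this transcript, so the following is only a sketch of what basic algebra with variables might look like. The names `x` and `y` come from the slide; `z` and `f` are made up for illustration.

```r
# Assign values with the arrow operator; "=" also works at the top level.
x <- 3
y = 4

# Basic algebra on variables
z <- sqrt(x^2 + y^2)
print(z)       # [1] 5

# A variable can also point to a function
f <- sqrt
print(f(16))   # [1] 4
```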
Vectors: Enter this in your session, then check the results.
Small tangent: What is “c(1,2,3)”? Use help() to find out.
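As a short answer to the tangent: c() combines its arguments into one vector. A minimal sketch, with illustrative variable names `v` and `w`:

```r
# c() combines its arguments into a single vector
v <- c(1, 2, 3)
print(v)           # [1] 1 2 3
print(length(v))   # [1] 3

# Vectors passed to c() are flattened into one longer vector
w <- c(v, 10, v)
print(w)           # [1]  1  2  3 10  1  2  3
```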
Accessing Vector Members: In R, vector indexes start at 1; most programming languages start indexing at zero. Also, the index is NOT WHAT YOU THINK IT IS! It is an INDEX VECTOR, meaning that you access the members of a vector with a vector.
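The idea of accessing a vector with a vector can be sketched as follows. The data values are made up for illustration.

```r
x <- c(10, 20, 30, 40, 50)

x[1]         # first element: 10 (indexes start at 1, not 0)
x[c(1, 3)]   # an index vector picks elements 1 and 3: 10 30
x[2:4]       # a sequence is also an index vector: 20 30 40
x[-1]        # negative indexes drop elements: 20 30 40 50
x[x > 25]    # a logical vector keeps elements where TRUE: 30 40 50
```

Even `x[1]` uses an index vector: the `1` is simply a vector of length one.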
Small Tangent 2: Creating Sequences. Create regular sequences using a colon. The colon has high operator precedence. Also see the seq() function.
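A brief sketch of why the colon's precedence matters, and of seq() as the more general alternative. All variable names are illustrative.

```r
a <- 1:10                          # 1 2 3 ... 10
b <- 2 * 1:5                       # colon binds tighter than *: 2 4 6 8 10

n <- 5
c1 <- 1:n - 1                      # parsed as (1:n) - 1: 0 1 2 3 4
c2 <- 1:(n - 1)                    # parentheses change the meaning: 1 2 3 4

s <- seq(0, 1, by = 0.25)          # 0.00 0.25 0.50 0.75 1.00
r <- seq(10, 1, length.out = 4)    # 10 7 4 1
```

The `1:n - 1` form is a common source of off-by-one surprises, so parentheses are worth the extra keystrokes.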
Vectors: Are a list of items of the same data type. “double” is short for “double precision floating point number”.
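The "same data type" rule can be observed with typeof(). This is a minimal sketch with made-up values; it also shows that NA does not force a type change on the other elements.

```r
v <- c(1, 2, 3)
typeof(v)            # "double"

i <- c(1L, 2L, 3L)   # the L suffix requests integers
typeof(i)            # "integer"

# Mixing types coerces every element to one common type
m <- c(1, "a", TRUE)
typeof(m)            # "character"
m                    # "1" "a" "TRUE"

# NA (missing value) does not change the type of the other elements
na_vec <- c(1, NA, 3)
typeof(na_vec)       # "double"
```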
Doing Stuff with Vectors: Math operations occur on each element in sequence. Returns a vector of the same size.
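Element-wise math can be sketched like this, using two made-up vectors of equal length:

```r
x <- c(1, 2, 3, 4)
y <- c(10, 20, 30, 40)

x + y      # 11 22 33 44
x * 2      # 2 4 6 8
x^2        # 1 4 9 16
sqrt(y)    # element-wise square root

length(x * y) == length(x)   # TRUE: the result has the same size
```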
Factors: Simply a vector of items that mean something, e.g. disease classifications, drug dosage, US states, months, HapMap ethnic group. Can be ordered. Can have multiple levels (GO Functions).
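A small sketch of unordered and ordered factors, using a hypothetical drug dosage variable (the values are invented for illustration):

```r
dose <- c("low", "high", "medium", "low", "high")

# Unordered factor: levels are sorted alphabetically by default
f <- factor(dose)
levels(f)       # "high" "low" "medium"
table(f)        # counts per level

# Ordered factor: give the levels in their meaningful order
g <- factor(dose, levels = c("low", "medium", "high"), ordered = TRUE)
g[1] < g[2]     # TRUE: "low" < "high"
as.integer(g)   # underlying codes: 1 3 2 1 3
```

Ordering only affects comparisons and sorting; the stored values are integer codes that point into the level set.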
Array and Matrix: Multi-dimensional generalizations of vectors, with k dimensions where k > 0. Dimensions are assigned by the dim attribute. Can be indexed by two or more indices. If a single index value (which can be a vector) is given, then dim is ignored and the underlying vector values are accessed directly, unless the given index value is also an array. A matrix is a two-dimensional array.
Example: An INDEX ARRAY
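The original example is not preserved in this transcript. The sketch below, with made-up data, shows the dim attribute, single-index access, and an index array in which each row holds one (row, column) pair.

```r
# A vector becomes an array once it has a dim attribute
a <- 1:12
dim(a) <- c(3, 4)     # 3 rows, 4 columns, filled column by column
is.matrix(a)          # TRUE

a[2, 3]               # row 2, column 3: 8
a[5]                  # a single index ignores dim: 5
a[c(1, 12)]           # 1 12

# Index array: each row of idx is one (row, column) pair
idx <- rbind(c(1, 1), c(2, 3), c(3, 4))
a[idx]                # 1 8 12

# matrix() and array() set dim for you
m <- matrix(1:6, nrow = 2)
k <- array(1:24, dim = c(2, 3, 4))
k[2, 3, 4]            # 24
```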
List: An ordered collection of named components.
List Access
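List creation and access might look like the sketch below. The `patient` record and its fields are hypothetical.

```r
patient <- list(id = "P001", age = 42, genes = c("TP53", "BRCA1"))

patient$age          # by name with $: 42
patient[["genes"]]   # by name with [[ ]]: the character vector
patient[[1]]         # by position: "P001"

# Single brackets return a sub-list, not the component itself
sub <- patient["age"]
class(sub)           # "list"
names(patient)       # "id" "age" "genes"
```

Unlike a vector, a list can hold components of different types and lengths.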
Data Frame: Bastard step child of List and Matrix. Essentially a list of vectors of the same length. Closest representation to an Excel file in R. Easiest way to make one is to read in a CSV file.
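Both routes to a data frame can be sketched as follows. The gene names and values are invented, and the CSV is written to a temporary file only so that the example is self-contained.

```r
# Build a data frame directly from equal-length vectors
df <- data.frame(
  gene = c("TP53", "BRCA1", "EGFR"),
  expr = c(2.5, 1.2, 3.8)
)

# Or read one from a CSV file (a temporary file written here for the sketch)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("gene,expr", "TP53,2.5", "BRCA1,1.2", "EGFR,3.8"), csv_path)
df2 <- read.csv(csv_path)

df2$expr              # list-style access: a column is a vector
df2[2, ]              # matrix-style access: the second row
df2[df2$expr > 2, ]   # rows where expr exceeds 2
str(df2)              # compact summary of columns and types
```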
Functions: We’ve already used them. Functions take in arguments and perform some action using those arguments. Actions do not affect the input arguments.
Example
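The original example is not preserved here. As a minimal sketch, the hypothetical function `double_it` below shows that changing an argument inside a function leaves the caller's variable untouched.

```r
# Hypothetical function: doubles every element of its argument
double_it <- function(v) {
  v <- v * 2   # modifies only the function's local copy
  v            # the last expression is the return value
}

x <- c(1, 2, 3)
y <- double_it(x)
y   # 2 4 6
x   # unchanged: 1 2 3
```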
Write to CSV file: By default the output has an extra column of the row indices.
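The extra row-index column comes from write.csv() writing row names by default. A sketch with made-up data and temporary file paths shows both behaviours:

```r
df <- data.frame(gene = c("TP53", "BRCA1"), expr = c(2.5, 1.2))

with_rows <- tempfile(fileext = ".csv")
no_rows <- tempfile(fileext = ".csv")

write.csv(df, with_rows)                    # default: row names become a first column
write.csv(df, no_rows, row.names = FALSE)   # suppress the extra column

readLines(with_rows)[1]   # header line: "","gene","expr"
readLines(no_rows)[1]     # header line: "gene","expr"
```

Passing `row.names = FALSE` is usually what you want when the file will be opened in a spreadsheet.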
Save your work! R keeps track of your data and functions. You can start from where you left off if you save these to some file.
Start from your save point
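Saving and restoring can be sketched as below. The object names and the temporary file path are illustrative. save() writes the objects you name, while save.image() writes the whole workspace (to ".RData" in the working directory by default).

```r
x <- c(1, 2, 3)
square <- function(v) v^2

# Save chosen objects to a file (a temporary path is used for this sketch)
rdata_path <- tempfile(fileext = ".RData")
save(x, square, file = rdata_path)

# Simulate a fresh session, then restore from the save point
rm(x, square)
load(rdata_path)
square(x)   # 1 4 9
```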

Editor's Notes

  1. Exercise 1: Use the seq() function to generate the sequence from 1 to 20 in steps of 2. Exercise 2: Use rep() to create a vector with the sequence of numbers from 1 to 3, repeated three times. Exercise 3: Create a vector from 1:10 with each value repeated three consecutive times (e.g. 1 1 1 2 2 2 … 10 10 10).
  2. Changing the data type of a single element changes the type of the rest of the elements. The one exception to this is “missing values”, represented by the special constant “NA”. See example above.
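One possible set of answers to the three exercises in note 1 is sketched below. The variable names are illustrative, and other solutions are equally valid.

```r
# Exercise 1: 1 to 20 in steps of 2
ex1 <- seq(1, 20, by = 2)    # 1 3 5 ... 19

# Exercise 2: the sequence 1 to 3, repeated three times
ex2 <- rep(1:3, times = 3)   # 1 2 3 1 2 3 1 2 3

# Exercise 3: each value of 1:10 repeated three times in a row
ex3 <- rep(1:10, each = 3)   # 1 1 1 2 2 2 ... 10 10 10
```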