Statistics Lab
Rodolfo Metulini
IMT Institute for Advanced Studies, Lucca, Italy

Introduction to R - 09.01.2014
Getting help with functions
To get more information on any specific named function, for
example solve, the command is
> help(solve)
An alternative is:
> ?solve
Running:
> help.start()
launches a Web browser opened on the help home page.
The ?? command (shorthand for help.search()) searches the help
pages by keyword. For example, it is useful for finding help in
packages that are installed but not yet loaded.
objects and saving data
The entities that R creates and manipulates are known as
objects.
During an R session, objects are created and stored by name.
> objects() can be used to display the names of the objects which
are currently stored within R.
> rm() can be used to remove objects.
At the end of each R session, you are given the opportunity to save
all the currently available objects. You can save the objects (the
workspace) in .RData format in the current directory. You can also
save the command history in .Rhistory format.
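As a minimal sketch of this workflow (the object names x and y and the temporary file are illustrative assumptions, not part of the lab material), the following creates two objects, lists them, removes one, and saves and restores the other through an .RData file:

```r
# Create two objects (illustrative names).
x <- c(1, 2, 3)
y <- "hello"

# List the objects in the current workspace; ls() is a synonym of objects().
print(objects())

# Remove y from the workspace.
rm(y)

# Save x to a temporary .RData file, remove it, then restore it with load().
ws <- tempfile(fileext = ".RData")
save(x, file = ws)
rm(x)
load(ws)
print(x)
```

save.image() would save the whole workspace instead of a single object; save() is used here only to keep the example small.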
scalars and vectors: manipulation
To set up a vector named x, consisting of the numbers 1, 2, 3, 4
and 5, use the R command:
> x = c(1,2,3,4,5)
or, equivalently, the assign function could be used.
> assign("x", c(1,2,3,4,5))
x is a vector of length 5. To check it we can use the following
function:
> length(x)
> 1/x gives the reciprocals of the elements of x.
> y = c(x,0,x) would create a vector with 11 entries consisting of
two copies of x with a 0 in the middle.
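The commands on this slide can be tried together as a short script. This is only a sketch; the name x2 is a hypothetical second name introduced here to compare the two ways of assigning.

```r
# Two equivalent ways of creating the same vector.
x <- c(1, 2, 3, 4, 5)
assign("x2", c(1, 2, 3, 4, 5))   # x2 is an illustrative name

n <- length(x)      # number of elements of x
r <- 1 / x          # element-wise reciprocals
y <- c(x, 0, x)     # two copies of x with a 0 in the middle

print(n)
print(r)
print(y)
```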
scalars and vectors: manipulation
Vectors can be used in arithmetic expressions.
Vectors in the same expression need not all be of the same
length. If they are not, the output value has the length of the
longest vector in the expression, and shorter vectors are recycled.
For example:
> v = 2*x + y + 1
generates a new vector of length 11 constructed by adding together,
element by element, 2*x repeated 2.2 times, y repeated just once,
and 1 repeated 11 times.
So, WARNING: R computes this kind of expression even when the
lengths do not match, as here, where 11 is not a multiple of 5;
it only issues a warning.
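To make the recycling visible, the sketch below computes v and then rebuilds it with the repetition written out by hand. The name v_explicit is an assumption introduced for the comparison, and suppressWarnings() is used only to silence the length-mismatch warning that R would otherwise print.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(x, 0, x)                      # length 11

# 11 is not a multiple of 5, so R recycles 2*x and emits a warning;
# suppressWarnings() only hides that warning in this example.
v <- suppressWarnings(2 * x + y + 1)

# The same computation with the recycling written out explicitly.
v_explicit <- 2 * rep(x, length.out = 11) + y + rep(1, 11)

print(v)
print(all(v == v_explicit))
```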
scalars and vectors: manipulation - 2
In addition, log, exp, sin, cos, tan, sqrt and, of course, the
classical arithmetic operators are also available.
min(x) and max(x) select the smallest and the largest element of
the vector.
sum(x) and prod(x) return the sum and the product, respectively,
of the numbers within the vector.
mean(x) calculates the sample (arithmetic) mean, which is the same
as sum(x)/length(x); and var(x) gives the sample variance:
sum((x - mean(x))^2)/(length(x) - 1)
sort(x) returns a vector of the same size as x, with the elements in
increasing order.
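A small check of these identities, as a sketch: the data values and the names ending in _by_hand are illustrative assumptions.

```r
x <- c(3, 1, 5, 2, 4)

smallest <- min(x)
largest  <- max(x)
total    <- sum(x)
product  <- prod(x)

# Built-in mean and variance against the formulas written by hand.
m          <- mean(x)
m_by_hand  <- sum(x) / length(x)
s2         <- var(x)
s2_by_hand <- sum((x - mean(x))^2) / (length(x) - 1)

sorted <- sort(x)   # increasing order

print(c(m, m_by_hand, s2, s2_by_hand))
print(sorted)
```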
seq and rep
There are facilities to generate commonly used sequences of
numbers.
> 1:30 is the vector c(1,2, ..., 29,30)
> 2*1:15 is the vector c(2,4, ..., 28,30) of length 15.
In addition, seq() can be used. seq(2, 10) is the same as the vector
2:10
by=, from=, to= are useful arguments:
> seq(from = 30, to = 1)
> seq(-10, 10, by = 0.5)
rep() can be used for replicating an object.
> rep(x, times=5)
> rep(x, each=5)
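The following sketch collects the sequence helpers in one place; all variable names are illustrative. Note the difference between seq(2, 10) and seq(2:10): with a single vector argument, seq() returns the positions 1, 2, ..., 9 rather than the values 2, ..., 10.

```r
a  <- 1:30
b  <- 2 * 1:15           # ':' binds tighter than '*', so this is 2 * (1:15)

s1 <- seq(2, 10)         # same values as 2:10
s2 <- seq(2:10)          # positions of the 9 elements: 1, 2, ..., 9
s3 <- seq(from = 30, to = 1)      # decreasing sequence 30, 29, ..., 1
s4 <- seq(-10, 10, by = 0.5)      # step of 0.5

x  <- c(1, 2)
r1 <- rep(x, times = 3)  # whole vector repeated: 1 2 1 2 1 2
r2 <- rep(x, each = 3)   # each element repeated: 1 1 1 2 2 2

print(s2)
print(r1)
print(r2)
```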
logical vectors

As well as numerical vectors, R allows manipulation of logical
quantities.
The elements of a logical vector can have the values TRUE, FALSE
and NA ("not available").
Logical vectors are generated by conditions. Example:
> temp = x > 3
The comparison operators are <, <=, >, >=, == for equality, and !=
for inequality. In addition, if c1 and c2 are logical expressions,
then c1 & c2 is their intersection ("and"), c1 | c2 is their union
("or"), and !c1 is the negation of c1.
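A short sketch of conditions and logical operators on a concrete vector; the names both, either, negated and count are assumptions made for the example.

```r
x <- c(1, 2, 3, 4, 5)

temp    <- x > 3             # FALSE FALSE FALSE TRUE TRUE
both    <- x > 1 & x < 4     # "and": FALSE TRUE TRUE FALSE FALSE
either  <- x < 2 | x > 4     # "or":  TRUE FALSE FALSE FALSE TRUE
negated <- !temp             # negation of temp

# In arithmetic, TRUE counts as 1 and FALSE as 0,
# so sum() counts the elements that satisfy the condition.
count <- sum(temp)

print(temp)
print(count)
```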
missing Values

In some cases the components of a vector may not be completely
known: in this case we assign the value NA.
The function is.na(x) gives a logical vector of the same size as x
with value TRUE if the corresponding element in x is NA.
> z = c(1:3, NA); ind = is.na(z)
There is a second kind of "missing" value, produced by
numerical computation: the so-called Not a Number (NaN) values.
Example:
> 0/0
> Inf/Inf
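The sketch below shows how NA and NaN behave in practice. The use of is.nan() and of the na.rm argument of mean() goes slightly beyond the slide and is included as an illustration; the variable names are assumptions.

```r
z   <- c(1:3, NA)
ind <- is.na(z)            # FALSE FALSE FALSE TRUE

nan1 <- 0 / 0              # NaN
nan2 <- Inf / Inf          # NaN

# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
print(is.na(nan1))
print(is.nan(nan1))
print(is.nan(NA))

# Most summary functions return NA when a value is missing,
# unless the missing values are dropped with na.rm = TRUE.
m_all  <- mean(z)
m_known <- mean(z, na.rm = TRUE)
print(c(m_all, m_known))
```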
index vectors: subsets of a vector
Subsets of the elements of a vector may be selected by appending to
the name of the vector an index vector in square brackets.
1. A logical vector: values corresponding to TRUE in the index
vector are selected:
> y = x[!is.na(x)]
2. A vector of positive (negative) integer quantities: in the
positive case the values in the index vector must lie in the set
{1, 2, ..., length(x)}. In the negative case the selected values
will be excluded.
> x[2:3]; x[-(2:3)]
3. A vector of character strings: this is possible only after
assigning names to the elements of the object.
> cars = c(1,2,3)
> names(cars) = c("ferrari","lamborghini","bugatti")
> pref = cars[c("ferrari","bugatti")]
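The three kinds of index vector can be compared side by side. In this sketch the data in x and the names kept and dropped are illustrative assumptions.

```r
x <- c(10, NA, 30, 40, 50)

# 1. Logical index: keep the elements that are not missing.
y <- x[!is.na(x)]            # 10 30 40 50

# 2. Positive integers select, negative integers exclude.
kept    <- x[2:3]            # NA 30
dropped <- x[-(2:3)]         # 10 40 50

# 3. Character index: requires names on the elements.
cars <- c(1, 2, 3)
names(cars) <- c("ferrari", "lamborghini", "bugatti")
pref <- cars[c("ferrari", "bugatti")]

print(y)
print(dropped)
print(pref)
```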
Objects and attribute
Each vector has one (and only one) mode, shared by all of its
elements (this is the reason why such vectors are called "atomic").
The mode can be: numeric, logical, complex, character or
raw
Useful commands: mode(), as.numeric(), is.numeric()
For example, create a numeric vector:
> z = 0:9
change it into character:
> digits = as.character(z)
and coerce it back into numeric:
> d = as.integer(digits)
d and z are the same!
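A sketch of the round trip between modes follows. The vector mixed is an extra illustration, not taken from the slide, showing that combining different types in c() coerces everything to a single mode.

```r
z <- 0:9
print(mode(z))               # "numeric"

digits <- as.character(z)    # "0" "1" ... "9"
print(mode(digits))          # "character"

d <- as.integer(digits)      # back to integers
print(identical(d, z))       # TRUE: the round trip loses nothing

# Mixing types in one vector forces a common mode (here: character).
mixed <- c(1, "a", TRUE)
print(mode(mixed))
```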
arrays, matrices and data.frame

Vectors are the most important type of object in R, but there are
several others. Among them:
matrix: matrices (and, more generally, arrays) are
multidimensional generalizations of vectors
data.frame: matrix-like structures, but the columns can be of
different types. They are used when we deal with both
numerical and categorical data.
How to transform a vector into a matrix?
> v = 1:50
> dim(v) = c(10,5)
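Setting dim() fills the matrix column by column. The sketch below checks this and compares the result with matrix(), which is used here as an equivalent alternative; the name w is an assumption.

```r
v <- 1:50
dim(v) <- c(10, 5)           # v is now a 10 x 5 matrix

# Values are placed column by column, so the second column starts at 11.
print(v[1, 2])
print(v[10, 5])

# matrix() builds the same object in one step.
w <- matrix(1:50, nrow = 10, ncol = 5)
print(all(v == w))
```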
arrays, matrices and data.frame (2)

How to create a matrix from scratch?
> m = array(1:20, dim = c(4,5))
Subsetting a matrix or replacing a subset of a matrix with zeros?
Let's take a look at the examples in the code.
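The lab code itself is not reproduced in these slides, so the following is only a sketch of what subsetting and zero-replacement can look like; the chosen rows and columns and the names elem, row1, col5, block and m0 are assumptions.

```r
m <- array(1:20, dim = c(4, 5))

elem  <- m[2, 3]         # single element: row 2, column 3
row1  <- m[1, ]          # whole first row
col5  <- m[, 5]          # whole fifth column
block <- m[1:2, 4:5]     # 2 x 2 submatrix

# Replace that submatrix with zeros in a copy of m.
m0 <- m
m0[1:2, 4:5] <- 0

print(block)
print(m0)
```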
matrix manipulation
The operator %*% is used for matrix multiplication.
An n x 1 or 1 x n matrix is also a valid matrix.
If, for example, A and B are square matrices of the same size,
then:
> A*B
is the matrix of element-by-element products (it doesn't work for
matrices with different dimensions), and
> A %*% t(B)
is the matrix product of A and the transpose of B.
diag(A) returns the elements on the main diagonal of A. ginv(A)
and t(A) return the generalized inverse and the transposed matrix.
ginv() requires the MASS package.
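A sketch of these operations on two small matrices follows. To stay within base R, it uses solve(A) for the inverse instead of MASS::ginv(); that substitution, the matrices and the variable names are assumptions made for the example.

```r
A <- matrix(c(2, 0, 0, 2), nrow = 2)   # 2 on the diagonal, 0 elsewhere
B <- matrix(1:4, nrow = 2)             # columns (1, 2) and (3, 4)

elementwise <- A * B          # element-by-element products
product     <- A %*% t(B)     # matrix product of A and the transpose of B

d     <- diag(A)              # main diagonal of A
tB    <- t(B)                 # transpose of B
A_inv <- solve(A)             # inverse of A (base R, A must be invertible)

print(elementwise)
print(product)
print(A_inv %*% A)            # identity matrix, up to rounding
```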
lists and data frames
An R list is an object consisting of an ordered collection of objects
known as its components.
Here is a simple example of how to make a list:
> Lst = list(name="Rodolfo", surname="Metulini", age="30")
It is possible to concatenate two or more lists:
> list.ABC = c(list.A, list.B, list.C)
A data.frame is a list with the specific class "data.frame".
We can convert a matrix object into a data.frame object with the
command as.data.frame(matrix)
The easiest way to create a data.frame object is by means of the
read.table() function.
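The sketch below builds the list from the slide, concatenates three small lists, and converts a matrix into a data.frame. The contents of list.A, list.B and list.C and of the matrix m are illustrative assumptions.

```r
Lst <- list(name = "Rodolfo", surname = "Metulini", age = "30")
print(Lst$name)          # access a component by name
print(Lst[["age"]])      # same, with double square brackets
print(length(Lst))       # number of components

# Concatenating lists with c() (lower case).
list.A <- list(a = 1)
list.B <- list(b = "two")
list.C <- list(c = TRUE)
list.ABC <- c(list.A, list.B, list.C)
print(names(list.ABC))

# A data.frame is a list of columns with class "data.frame".
m  <- matrix(1:6, nrow = 2)
df <- as.data.frame(m)
print(class(df))
print(is.list(df))
```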
reading data
Large data objects will usually be read as values from external files
rather than entered during an R session at the keyboard.
There are basically two similar commands to load data.
1. read.table(): the general function for delimited text files
(its variant read.csv() has defaults suited to .csv files).
2. read.delim(): a variant with defaults suited to tab-delimited
.txt files
Useful arguments:
sep = " ": to specify whether the values in the dataset are
separated by ; or , (or another character), or whether they are
tab delimited.
header = TRUE: to specify that the first row in the dataset contains
the variable names
Moreover, read.dta() (in the foreign package) is used to load data
from STATA :)
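As a self-contained sketch, the example below writes two tiny files to a temporary location and reads them back. The file contents, column names and variable names are invented for illustration only.

```r
# A semicolon-separated file with a header row.
path <- tempfile(fileext = ".csv")
writeLines(c("name;height;weight",
             "anna;1.65;58",
             "luca;1.80;75"), path)
dat <- read.table(path, sep = ";", header = TRUE)
print(dat)

# A tab-delimited file: read.delim() assumes tabs and a header by default.
tab_path <- tempfile(fileext = ".txt")
writeLines(c("name\theight",
             "anna\t1.65",
             "luca\t1.80"), tab_path)
dat2 <- read.delim(tab_path)
print(dat2)
```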
distributions and co.

One convenient use of R is to provide a comprehensive set of
statistical tables. Functions are provided to evaluate the
cumulative distribution function P(X <= x), the probability density
function and the quantile function (given q, the smallest x such
that P(X <= x) >= q), and to simulate from the distribution.
Prefix the name of the distribution with "d" for the density, "p"
(pnorm, punif, pexp etc.) for the CDF, "q" for the quantile
function, and "r" for simulation.
Let us empirically examine the distribution of a variable
(see the code).
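The four prefixes can be tried on the standard normal distribution. This is a sketch, not the lab code: the seed, the sample size and the variable names are assumptions.

```r
dens <- dnorm(0)           # density at 0, equal to 1/sqrt(2*pi)
p    <- pnorm(1.96)        # CDF: P(X <= 1.96), close to 0.975
q    <- qnorm(0.975)       # quantile: close to 1.96

# The quantile function inverts the CDF.
print(qnorm(pnorm(1.3)))

# Simulation: compare an empirical proportion with the theoretical CDF.
set.seed(1)                # fixed seed so the run is reproducible
sims <- rnorm(1000)
emp  <- mean(sims <= 1.96)
print(c(theoretical = p, empirical = emp))

# Other distributions follow the same naming scheme.
print(punif(0.3))          # uniform on [0, 1]
print(pexp(1))             # exponential with rate 1
```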
covar and concentration indices
The covariance and the correlation measure the degree to which
two variables change together.
The correlation is an index in [-1, 1]; the covariance is
unbounded (it depends on the scale of the values assumed by the
variables).
> Cov = cov(A,B)
> Cor = cor(A,B)
We can also calculate the correlation between A and B as
follows:
> CorAB = Cov / sqrt(var(A)*var(B))
Gini index: the most popular concentration index; we need to
install the ineq package
Mode: the most frequent value within the distribution; we need to
install the modeest package and use the mfv command
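The relation between covariance and correlation can be checked in base R. In this sketch the data in A and B are invented, and the rescaling by 10 is an added illustration of why the covariance depends on the scale while the correlation does not.

```r
A <- c(1, 2, 3, 4, 5)
B <- c(2, 4, 5, 4, 5)

Cov   <- cov(A, B)
Cor   <- cor(A, B)
CorAB <- Cov / sqrt(var(A) * var(B))   # correlation computed by hand

print(c(Cov = Cov, Cor = Cor, CorAB = CorAB))

# Rescaling A multiplies the covariance by 10 but leaves the correlation unchanged.
print(cov(10 * A, B))
print(cor(10 * A, B))
```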
homeworks

For those of us who are familiar with STATA: let's try to load a
.dta file with the read.dta() function.
Study the agreement of the eruption data with other distributions
(exponential? uniform? it is up to you).
