The markovchain package (useR! 2016)

The markovchain R package
Giorgio A. Spedicato
University of Bologna & UnipolSai Assicurazioni
Discrete Time Markov Chains (DTMCs) are a notable class of stochastic processes.
Whilst their basic theory is rather simple, they are extremely effective for modelling categorical
data sequences. Notable applications can be found in linguistics, information theory, life
sciences, economics and sociology.
The markovchain package aims to fill a gap within the CRAN ecosystem by providing a unified
infrastructure to easily manage DTMCs and to perform statistical and probabilistic analyses of
Markov chain stochastic processes.
The markovchain package contains S4 classes and methods for creating and managing
homogeneous and non-homogeneous DTMCs straightforwardly.
This means that DTMCs can be created, represented and exported using an approach that is natural
and intuitive for any R programmer.
At the same time, it provides functions to perform structural analysis of transition matrices
(that is, classification of matrices and states, analysis of the stationary distribution,
periodicity, etc.). In addition, it provides methods to estimate transition matrices from data and
to perform some inference (confidence interval calculations, statistical tests on the order and
stationarity, etc.). Finally, embryonic procedures for Bayesian modelling of DTMCs,
continuous-time Markov chains and higher-order chains have started to be provided.
Contact information
For further information please contact Giorgio A. Spedicato (spedicato_giorgio@yahoo.it).
This poster is available upon request.
Objective
Develop an easy-to-use software package for performing the most common statistical
analyses on Markov chains.
Abstract
Acknowledgements
Simulation, estimation and inference
Representing DTMCs
Creating and operating with DTMCs
##create a DTMC object
library(markovchain)
#defining a transition matrix (rows sum to one)
weatherStates <- c("rainy", "nice", "sunny")
weatherMatrix <- matrix(data = c(0.50, 0.25, 0.25,
                                 0.50, 0.00, 0.50,
                                 0.25, 0.25, 0.50),
                        byrow = TRUE, nrow = 3,
                        dimnames = list(weatherStates, weatherStates))
#create the DTMC (long and short way)
#long way
mcWeather <- new("markovchain", states = weatherStates, byrow = TRUE,
                 transitionMatrix = weatherMatrix, name = "Weather")
#short way: coerce the matrix directly
mcWeather2 <- as(weatherMatrix, "markovchain")
name(mcWeather2) <- "Weather Mc"
mcWeather2
Weather Mc
A 3 - dimensional discrete Markov Chain defined by the following states:
rainy, nice, sunny
The transition matrix (by rows) is defined as follows:
      rainy nice sunny
rainy  0.50 0.25  0.25
nice   0.50 0.00  0.50
sunny  0.25 0.25  0.50
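As a cross-check that does not rely on the package, the defining property of a transition matrix (non-negative entries, rows summing to one) can be verified in base R. This is a minimal sketch; the helper name isRowStochastic is hypothetical and is not part of markovchain.

```r
# Hypothetical helper (not part of markovchain): TRUE when every entry is
# non-negative and every row sums to one, up to a numerical tolerance.
isRowStochastic <- function(P, tol = 1e-8) {
  all(P >= 0) && all(abs(rowSums(P) - 1) < tol)
}

weatherStates <- c("rainy", "nice", "sunny")
weatherMatrix <- matrix(c(0.50, 0.25, 0.25,
                          0.50, 0.00, 0.50,
                          0.25, 0.25, 0.50),
                        byrow = TRUE, nrow = 3,
                        dimnames = list(weatherStates, weatherStates))

isRowStochastic(weatherMatrix)
```

A matrix that fails this check cannot describe a DTMC by rows.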
Non-homogeneous DTMCs
Creating DTMC S4 objects is extremely easy.
Algebraic and logical operations can be performed naturally:
#equality
mcWeather == mcWeather2
TRUE
#exponentiation
mcWeather^2
Weather Mc^2
A 3 - dimensional discrete Markov Chain defined by the following states:
rainy, nice, sunny
The transition matrix (by rows) is defined as follows:
       rainy   nice  sunny
rainy 0.4375 0.1875 0.3750
nice  0.3750 0.2500 0.3750
sunny 0.3750 0.1875 0.4375
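The squared chain holds the two-step transition probabilities, which are just the matrix product of the transition matrix with itself. A minimal base R sketch, assuming the same weather matrix and using only plain matrices rather than markovchain objects:

```r
states <- c("rainy", "nice", "sunny")
P <- matrix(c(0.50, 0.25, 0.25,
              0.50, 0.00, 0.50,
              0.25, 0.25, 0.50),
            byrow = TRUE, nrow = 3, dimnames = list(states, states))

# Two-step transition probabilities: entry (i, j) sums, over every
# intermediate state k, the probability of going i -> k -> j.
P2 <- P %*% P

# Probability of a sunny day two days after a rainy one.
P2["rainy", "sunny"]  # 0.375
```

The value agrees with the rainy/sunny entry of the squared chain printed above.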
plot(mcWeather, main = "Weather transition matrix")
plot(mcWeather, package = "DiagrammeR", label = "Weather transition matrix")
The markovchain S4 plot method wraps plotting functions from the igraph
and DiagrammeR packages.
Structural analysis of DTMCs
The package allows to perform structural analysis of DTMCs,
thanks to an algorithm by (Feres, 2007) ported from Matlab.
require(matlab)
mathematicaMatr <- zeros(5)
mathematicaMatr[1,] <- c(0, 1/3, 0, 2/3, 0)
mathematicaMatr[2,c(1,5)] <- 0.5
mathematicaMatr[c(3,4),c(3,4)] <- 0.5
mathematicaMatr[5,5] <- 1
mathematicaMc <- new("markovchain", transitionMatrix = mathematicaMatr,
name = "Mathematica MC")
names(mathematicaMc)<-LETTERS[1:5]
#the summary method provides an overview of structural characteristics of DTMCs
summary(mathematicaMc)
Mathematica MC Markov chain that is composed by:
Closed classes:
C D
E
Recurrent classes:
{C,D},{E}
Transient classes:
{A,B}
The Markov chain is not irreducible
The absorbing states are: E
Various functions have been defined for this purpose.
#checking accessibility
is.accessible(mathematicaMc,from = "A","E")
TRUE
is.accessible(mathematicaMc,from = "C","E")
FALSE
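The idea behind an accessibility check can be sketched in base R: a state is accessible from another when some path of positive-probability steps connects them. The helper isReachable below is hypothetical (it is not the package's implementation) and only considers paths of at least one step.

```r
# The example chain on states A to E, built with base R only.
P <- matrix(0, nrow = 5, ncol = 5,
            dimnames = list(LETTERS[1:5], LETTERS[1:5]))
P["A", c("B", "D")] <- c(1/3, 2/3)
P["B", c("A", "E")] <- 0.5
P[c("C", "D"), c("C", "D")] <- 0.5
P["E", "E"] <- 1

# Hypothetical helper: TRUE when `to` can be reached from `from` through
# a path of between 1 and nrow(P) steps with positive probability.
isReachable <- function(P, from, to) {
  step <- (P > 0) * 1   # 0/1 matrix of possible one-step moves
  reach <- step         # pairs connected by a path of length 1
  for (n in seq_len(nrow(P) - 1)) {
    # extend every known path by one more step
    reach <- ((reach + reach %*% step) > 0) * 1
  }
  reach[from, to] > 0
}

isReachable(P, "A", "E")  # TRUE, via A -> B -> E
isReachable(P, "C", "E")  # FALSE, because {C, D} is a closed class
```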
#canonic form
myCanonicForm<-canonicForm(mathematicaMc)
myCanonicForm
Mathematica MC
A 5 - dimensional discrete Markov Chain defined by the following states:
C, D, E, A, B
    C         D   E   A         B
C 0.5 0.5000000 0.0 0.0 0.0000000
D 0.5 0.5000000 0.0 0.0 0.0000000
E 0.0 0.0000000 1.0 0.0 0.0000000
A 0.0 0.6666667 0.0 0.0 0.3333333
B 0.0 0.0000000 0.5 0.5 0.0000000
#periodicity
period(mathematicaMc)
0
Warning message: In period(mathematicaMc) : The matrix is not irreducible
#an irreducible two-state chain that always switches state has period 2
period(as(matrix(c(0,1,1,0),nrow = 2),"markovchain"))
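The period of a state is the greatest common divisor of the step counts at which a return to that state is possible. The base R sketch below illustrates that definition with a truncated search (maxSteps is an assumed cut-off, so this is an illustration rather than the package's algorithm); statePeriod and gcd are hypothetical helper names.

```r
# Greatest common divisor by Euclid's algorithm; gcd(0, n) is n.
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)

# Hypothetical helper: gcd of all n <= maxSteps for which a return to
# state i in exactly n steps has positive probability (0 if none found).
statePeriod <- function(P, i, maxSteps = 2 * nrow(P)) {
  step <- (P > 0) * 1
  walk <- step            # walk[i, j] > 0: j reachable from i in exactly n steps
  d <- 0
  for (n in seq_len(maxSteps)) {
    if (walk[i, i] > 0) d <- gcd(d, n)
    walk <- ((walk %*% step) > 0) * 1
  }
  d
}

flip <- matrix(c(0, 1, 1, 0), nrow = 2)
statePeriod(flip, 1)  # 2: returns are only possible after an even number of steps
```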
Steady state distribution(s) can be easily found as well
#finding the steady state
steadyStates(mcWeather)
rainy nice sunny
0.4 0.2 0.4
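The steady state is a probability vector that is left unchanged by the transition matrix, so it can be recovered as the left eigenvector for eigenvalue one. A minimal base R sketch, assuming the weather matrix defined earlier and making no claim about how the package computes it:

```r
states <- c("rainy", "nice", "sunny")
P <- matrix(c(0.50, 0.25, 0.25,
              0.50, 0.00, 0.50,
              0.25, 0.25, 0.50),
            byrow = TRUE, nrow = 3, dimnames = list(states, states))

# Left eigenvectors of P are the (right) eigenvectors of t(P).
e <- eigen(t(P))
# Pick the eigenvector whose eigenvalue is closest to one.
v <- Re(e$vectors[, which.min(abs(e$values - 1))])
# Rescale so the entries sum to one (this also fixes the sign).
steady <- v / sum(v)
names(steady) <- states
steady

# Check of the defining property: steady %*% P gives steady back.
drop(steady %*% P)
```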
dailyWeathers <- rmarkovchain(n = 365, object = mcWeather, t0 = "sunny")
dailyWeathers[1:7]
"sunny" "sunny" "sunny" "sunny" "nice" "rainy" "rainy"
Given a markovchain object, simulating a stochastic sequence
from its transition matrix is straightforward.
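What such a simulation does can be sketched in base R: starting from an initial state, each new state is drawn from the row of the transition matrix that belongs to the current state. The function simulateChain is a hypothetical stand-in and not the package's implementation; returning the path without the initial state is a choice made for this sketch.

```r
states <- c("rainy", "nice", "sunny")
P <- matrix(c(0.50, 0.25, 0.25,
              0.50, 0.00, 0.50,
              0.25, 0.25, 0.50),
            byrow = TRUE, nrow = 3, dimnames = list(states, states))

# Hypothetical helper: draw n successive states, starting from t0.
# The returned path does not include t0 itself.
simulateChain <- function(P, n, t0) {
  states <- rownames(P)
  path <- character(n)
  current <- t0
  for (k in seq_len(n)) {
    # the next state depends only on the current one (Markov property)
    current <- sample(states, size = 1, prob = P[current, ])
    path[k] <- current
  }
  path
}

set.seed(1)  # for reproducibility of this sketch only
sims <- simulateChain(P, n = 365, t0 = "sunny")
head(sims, 7)
```

Because the nice-to-nice probability is zero, no simulated path can contain two consecutive "nice" days.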
The markovchainFit function estimates the underlying
transition matrix for a given character sequence. Asymptotic
standard errors and MLE confidence intervals are returned by
default.
mleFit<-markovchainFit(data=dailyWeathers)
mleFit
$estimate
           nice     rainy     sunny
nice  0.0000000 0.4864865 0.5135135
rainy 0.2406015 0.4586466 0.3007519
sunny 0.2675159 0.2229299 0.5095541
$standardError
            nice      rainy      sunny
nice  0.00000000 0.08108108 0.08330289
rainy 0.04253274 0.05872368 0.04755305
sunny 0.04127860 0.03768204 0.05696988
$confidenceInterval
$confidenceInterval$confidenceLevel
[1] 0.95
$confidenceInterval$lowerEndpointMatrix
           nice     rainy     sunny
nice  0.0000000 0.3531200 0.3764924
rainy 0.1706414 0.3620548 0.2225341
sunny 0.1996187 0.1609485 0.4158470
$confidenceInterval$upperEndpointMatrix
           nice     rainy     sunny
nice  0.0000000 0.6198530 0.6505346
rainy 0.3105616 0.5552385 0.3789697
sunny 0.3354132 0.2849114 0.6032613
$logLikelihood
[1] -354.3094
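The maximum likelihood estimate of a transition matrix is the table of observed transition counts with each row divided by its total. A minimal base R sketch on a short made-up sequence (the data and the helper name fitMle are hypothetical; standard errors and confidence intervals are not reproduced here):

```r
# Hypothetical helper: row-normalised transition counts of a sequence.
fitMle <- function(x) {
  states <- sort(unique(x))
  # pair each element with its successor and cross-tabulate the pairs
  counts <- unclass(table(factor(head(x, -1), levels = states),
                          factor(tail(x, -1), levels = states)))
  # dividing by rowSums recycles down the columns, so entry (i, j) is
  # divided by the total of row i; a state with no outgoing transition
  # would give NaN in its row
  list(counts = counts, estimate = counts / rowSums(counts))
}

x <- c("a", "b", "a", "a", "b", "b", "a")  # made-up illustration data
fit <- fitMle(x)
fit$counts
fit$estimate
```

In this toy sequence "a" is followed by "a" once and by "b" twice, so the estimated a-to-b probability is 2/3.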
In addition to maximum likelihood, estimates
can be obtained by bootstrap and Bayesian
approaches.
Non-homogeneous DTMCs help when transition probabilities
across states change structurally over time. They are
represented by an implicitly ordered list of markovchain
objects.
mcA=as(matrix(data=c(0.4,.6,.1,.9),byrow=TRUE,nrow=2),"markovchain")
mcB=as(matrix(data=c(0.2,.8,.2,.8),byrow=TRUE,nrow=2),"markovchain")
mcC=as(matrix(data=c(0.1,.9,.5,.5),byrow=TRUE,nrow=2),"markovchain")
myMcList=new("markovchainList",markovchains=list(mcA,mcB,mcC))
myMcList
list of Markov chain(s)
Markovchain 1
Unnamed Markov chain
A 2 - dimensional discrete Markov Chain with following states:
s1, s2
    s1  s2
s1 0.4 0.6
s2 0.1 0.9
Markovchain 2
s1, s2
    s1  s2
s1 0.2 0.8
s2 0.2 0.8
Markovchain 3
s1, s2
    s1  s2
s1 0.1 0.9
s2 0.5 0.5
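Because the list is ordered, the state distribution after passing through all three chains is the starting distribution multiplied by the three transition matrices in sequence. A minimal base R sketch with plain matrices, assuming (for illustration only) that the process starts in s1:

```r
s <- c("s1", "s2")
mA <- matrix(c(0.4, 0.6, 0.1, 0.9), byrow = TRUE, nrow = 2, dimnames = list(s, s))
mB <- matrix(c(0.2, 0.8, 0.2, 0.8), byrow = TRUE, nrow = 2, dimnames = list(s, s))
mC <- matrix(c(0.1, 0.9, 0.5, 0.5), byrow = TRUE, nrow = 2, dimnames = list(s, s))

# Assumed starting distribution: the process begins in s1 with certainty.
start <- c(s1 = 1, s2 = 0)

# The order of the products matters: A is applied first, then B, then C.
afterThree <- start %*% mA %*% mB %*% mC
afterThree  # s1: 0.42, s2: 0.58
```

Swapping the order of the matrices generally changes the result, which is why the list has to be ordered.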
Functions to simulate from and to fit markovchainList objects are
available as well.
mcListSims<-rmarkovchain(n=1000,object=myMcList,what="matrix")
head(mcListSims,3)
     [,1] [,2] [,3]
[1,] "s2" "s2" "s1"
[2,] "s2" "s1" "s2"
[3,] "s2" "s2" "s1"
myMcListFit<-markovchainListFit(data=mcListSims)
myMcListFit$estimate
list of Markov chain(s)
Markovchain 1
s1, s2
          s1        s2
s1 0.2326203 0.7673797
s2 0.2220447 0.7779553
Markovchain 2
s1, s2
          s1        s2
s1 0.3584071 0.6415929
s2 0.3552972 0.6447028
Special thanks are given to the Google Summer of Code project
(2015-2016 sessions) and to the many users who continuously provide
feedback, suggestions and bug checks.
