1. A unified framework to combine disparate data types in species distribution modelling
Slides on Slideshare:
http://www.slideshare.net/oharar/gf-o2014talk
Bob O’Hara (1), Petr Keil (2), Walter Jetz (2)
(1) BiK-F, Biodiversity and Climate Change Research Centre, Frankfurt am Main, Germany. Twitter: @bobohara
(2) Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
2. A "Real" Curve
[Figure: x-axis 0 to 100, y-axis 0 to 80. Legend: Curve]
3. Approximated with a Discretised Curve
[Figure: x-axis 0 to 100, y-axis 0 to 80. Legend: Curve, Discrete]
4. Better: linear interpolation
[Figure: x-axis 0 to 100, y-axis 0 to 80. Legend: Curve, Discrete, Interpolated]
5. With more points, the approximations improve
[Figure: x-axis 0 to 100, y-axis 0 to 80. Legend: Curve, Discrete, Interpolated]
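The improvement with more points can be checked numerically. The following is a minimal sketch in base R, assuming an arbitrary smooth test curve: the function `curve_fn` and the numbers of knots are illustrative choices, not taken from the slides.

```r
# Illustrative smooth curve on [0, 100] (an assumption; the slides do not give the function)
curve_fn <- function(x) 40 + 30 * sin(x / 15)

# Maximum absolute error of two approximations built from n equally spaced knots
approx_error <- function(n) {
  knots <- seq(0, 100, length.out = n)
  fine  <- seq(0, 100, length.out = 2001)   # dense grid used to measure the error
  truth <- curve_fn(fine)
  # Discretised: piecewise constant, each point takes the value of the knot to its left
  step   <- approx(knots, curve_fn(knots), xout = fine, method = "constant")$y
  # Interpolated: straight lines between neighbouring knots
  linear <- approx(knots, curve_fn(knots), xout = fine, method = "linear")$y
  c(discrete = max(abs(step - truth)), interpolated = max(abs(linear - truth)))
}

errors <- sapply(c(6, 11, 21, 41), approx_error)
colnames(errors) <- c("n=6", "n=11", "n=21", "n=41")
print(round(errors, 3))
```

Both error rows shrink as knots are added, and for this curve the interpolated error stays below the discretised one at every resolution.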
6. What does this have to do with distribution models?
7. What does this have to do with distribution models?
This is how SDMs see the world:
source: http://bit.ly/1l8sG7M
Map produced by Peter Blancher, Science and Technology Branch, Environment Canada, based on data from the North American Breeding Bird Survey
8. Problems: scale, within-grid heterogeneity
9. Let’s sidestep the whole problem
Work in continuous space instead
The maths will let us work on different scales (e.g. Renner & Warton (2013), doi:10.1111/j.1541-0420.2012.01824.x)
Lets us deal with points & irregular shapes
Makes it straightforward to include different sorts of data
10. Motivation
Map Of Life: www.mol.org/
Different data sources:
GBIF
expert range maps
eBird and similar citizen science efforts
organised surveys (BBS, BMSs)
Regional checklists
11. A Unified Model
There is a single state: the density of the species
[Diagram: arrows connect the Actual State to three data types: Presence/Absence, Presence Only, Expert Range Maps]
12. Point Processes: Model
Each point in space, ξ, has an intensity, ρ(ξ):
$\log \rho(\xi) = \eta(\xi) = \beta X(\xi) + \nu(\xi)$
The number of individuals in an area A follows a Poisson distribution with mean
$\lambda(A) = \int_A \rho(\xi)\, d\xi$
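To make the link between intensity and expected count concrete, here is a minimal base R sketch. It assumes a one-dimensional "space" and a made-up covariate and coefficients (`X`, `beta0`, `beta1` are hypothetical), and it sets the spatial term ν(ξ) to zero. None of these values come from the talk.

```r
set.seed(1)

# Hypothetical one-dimensional space xi in [0, 10] with a single covariate X(xi);
# the coefficients are made up and the spatial term nu(xi) is set to zero
X     <- function(xi) sin(xi)
beta0 <- 0.5
beta1 <- 1.0
eta   <- function(xi) beta0 + beta1 * X(xi)   # linear predictor, log(rho)
rho   <- function(xi) exp(eta(xi))            # intensity

# Expected number of individuals in A = [a, b]: lambda(A) = integral of rho over A
lambda_A <- function(a, b) integrate(rho, lower = a, upper = b)$value

lam_small <- lambda_A(0, 2)
lam_large <- lambda_A(0, 10)

# The count in A is Poisson with mean lambda(A)
counts <- rpois(10000, lam_large)
print(c(lambda = lam_large, simulated_mean = mean(counts)))
```

Because λ(A) is an integral, it is additive over disjoint areas, which is what lets the same model be evaluated at different scales.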
13. Point Processes: Reality
Approximate λ(A) numerically: select some integration points, and sum over those
$\lambda(A) \approx \sum_{s=1}^{N} |A(s)|\, e^{\eta(s)}$
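The sum over integration points can be sketched in a few lines of base R. This example assumes a made-up log-intensity on a one-dimensional interval and equally sized cells with midpoints as integration points. In a real analysis the points and weights would typically come from a mesh, which is not shown here.

```r
# Made-up log-intensity on the interval [0, 10] (an assumption for illustration)
eta <- function(s) 0.5 + sin(s)

# Approximate lambda(A) for A = [a, b] with N integration points:
# split A into N cells, evaluate eta at each midpoint, weight by the cell size |A(s)|
lambda_approx <- function(N, a = 0, b = 10) {
  width <- (b - a) / N                      # |A(s)|, equal for every cell here
  mid   <- a + (seq_len(N) - 0.5) * width   # integration points
  sum(width * exp(eta(mid)))
}

# Reference value from adaptive quadrature
lambda_ref <- integrate(function(s) exp(eta(s)), 0, 10)$value

N_values <- c(5, 20, 80, 320)
abs_err  <- sapply(N_values, function(N) abs(lambda_approx(N) - lambda_ref))
print(data.frame(N = N_values, abs_error = signif(abs_err, 3)))
```

As with the curve approximations earlier in the talk, the error shrinks as integration points are added.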
14. Observation Models
Presence only points: thinned point process
Abundance: Poisson
Presence/Absence: binomial, cloglog
with $\mu_A(A, t) = \eta(A) + \log|A| + \log t + \log p$
(large) areas:
$\Pr(n(A) > 0) = 1 - \exp\left(-\int_A e^{\eta(\xi)}\, d\xi\right)$
Expert range: use distance to range as a covariate
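The reason the cloglog link appears here is that a Poisson count reduced to presence/absence has Pr(n(A) > 0) = 1 − exp(−λ(A)), so the complementary log-log of that probability is log λ(A). The base R sketch below checks this and also illustrates thinning. All numbers are illustrative, η is taken as constant within the area, and reading p as a per-individual detection probability is an assumption, not something the slides state.

```r
set.seed(42)

# Illustrative values (assumptions, not from the slides)
eta  <- -1.2   # log-intensity, taken as constant within the area
area <- 2.5    # |A|

lambda <- exp(eta + log(area))   # expected count in A

# Probability of at least one individual, and its complementary log-log transform
p_present <- 1 - exp(-lambda)
cloglog   <- function(p) log(-log(1 - p))
print(c(cloglog = cloglog(p_present), eta_plus_log_area = eta + log(area)))

# Check by simulation: presence/absence is a Poisson count reduced to n > 0
n <- rpois(50000, lambda)
print(c(simulated = mean(n > 0), theoretical = p_present))

# Thinning: if each individual is recorded independently with probability p_detect,
# the recorded count is again Poisson, with mean p_detect * lambda
p_detect <- 0.3
recorded <- rbinom(length(n), size = n, prob = p_detect)
print(c(mean_recorded = mean(recorded), expected = p_detect * lambda))
```

On the log scale, thinning simply adds log(p_detect) to the linear predictor, in the same way that log|A| enters as an offset.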
15. Put these together
Data likelihoods: P(X_i | λ) for data X_i. Total likelihood is
$P(X) = \prod_i P(X_i \mid \lambda)\, P(\lambda)$
where P(λ) is the actual distribution model, and will depend on environmental and other covariates
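The idea of multiplying data likelihoods that share one underlying state can be illustrated with a small simulation. The sketch below is a simplification and not the method used in the talk: it uses maximum likelihood via `optim` rather than a Bayesian fit, leaves out the P(λ) term, and all data and parameter values are simulated assumptions.

```r
set.seed(7)

# Simulated truth (illustrative assumption): intercept and covariate effect on log-density
beta_true <- c(-0.5, 0.8)

# Data set 1: counts in 60 plots of known area (Poisson)
x1    <- rnorm(60)
area1 <- runif(60, 0.5, 2)
y1    <- rpois(60, exp(beta_true[1] + beta_true[2] * x1 + log(area1)))

# Data set 2: presence/absence in 80 other plots (Bernoulli, cloglog link)
x2    <- rnorm(80)
area2 <- runif(80, 0.5, 2)
p2    <- 1 - exp(-exp(beta_true[1] + beta_true[2] * x2 + log(area2)))
y2    <- rbinom(80, size = 1, prob = p2)

# Joint negative log-likelihood: both data sets share the same beta
neg_loglik <- function(beta) {
  lam1 <- exp(beta[1] + beta[2] * x1 + log(area1))
  lam2 <- exp(beta[1] + beta[2] * x2 + log(area2))
  ll_counts <- sum(dpois(y1, lam1, log = TRUE))
  # log Pr(presence) = log(1 - exp(-lam2)); log Pr(absence) = -lam2
  ll_presence <- sum(ifelse(y2 == 1, log(-expm1(-lam2)), -lam2))
  -(ll_counts + ll_presence)
}

fit <- optim(c(0, 0), neg_loglik, method = "BFGS")
print(round(fit$par, 2))
```

Because both likelihood terms depend on the same `beta`, each data set informs the shared estimate, which is the point of combining them.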
16. In practice
Be Bayesian. Could use MCMC, but this is quicker in INLA
    library(INLA)
    # Assumes SolTim.formula and the data stack stk.all were built beforehand.
    # Two likelihoods are fitted jointly: Poisson (log link) and binomial (cloglog link).
    SolTim.res <- inla(SolTim.formula,
                       family = c("poisson", "binomial"),
                       data = inla.stack.data(stk.all),
                       control.family = list(list(link = "log"),
                                             list(link = "cloglog")),
                       control.predictor = list(A = inla.stack.A(stk.all)),
                       Ntrials = 1, E = inla.stack.data(stk.all)$e, verbose = FALSE)
17. The Solitary Tinamou
Photo credit: Francesco Veronesi on Flickr (https://www.flickr.com/photos/francesco veronesi/12797666343)
18. Data
[Map legend: Whole Region; Expert range; Park, absent; Park, present; eBird; GBIF]
expert range
2 point processes (49 points)
28 parks
19. A Fitted Model
Parameter     mean    sd
Intercept    -0.03  0.02
b.eBird       1.54  0.39
b.GBIF        1.54  0.24
Forest        0.00  0.01
NPP          -0.01  0.01
Altitude     -0.01  0.01
DistToRange  -0.01  0.00
20. Predicted Distribution
[Maps: Posterior Mean (colour scale −0.10 to −0.02); Posterior Standard Deviation (colour scale 0.01 to 0.06)]
21. Individual Data Types
[Map panels: eBird, GBIF, Parks, Expert Range]
22. Join the bandwagon!
Using continuous space makes life easier
In practice, use INLA (but I need to tidy up the code)
23. Not the final answer...
http://www.gocomics.com/nonsequitur/2014/06/24