EURO-BASIN, www.euro-basin.eu
Introduction to Statistical Modelling Tools for Habitat Models Development, 26-28th Oct 2011

2


                     OUTLINE
• Why to model?


• Habitat models


• Model properties


• Steps for modelling


• What about data?
3


             WHY TO MODEL?
• “All models are wrong, some models are useful” (G. Box)


• Models are how we understand the world:
       We see the world through models
       We learn about the world using formal descriptions


• Model types:


   – Static vs dynamic
   – Explanatory vs predictive
   – Deterministic vs stochastic
   – Discrete vs continuous
4


             HABITAT MODELS
• Habitat models are focused on how environmental factors control
  the distribution of species and communities.


• Multiple applications:


    – Biogeography, impacts of global change, management,
      conservation, ecology, …


• New conceptual and operational advances due to the growth in
  computing power, e.g. GIS, remote sensing, new (computer-intensive)
  statistical modelling tools, etc.
5


          MODEL PROPERTIES
Some desirable model properties:


• Parsimony (Occam’s razor): “All things being equal, the simplest
  solution tends to be the best one”
• Tractability: easy to analyse
• Conceptually insightful: reveals fundamental properties
• Generalizability: can be applied to other situations/species/…
• Empirical consistency: consistent with the available data
• Falsifiability: can be tested by observations
• Predictive precision
6


         MODEL PROPERTIES



  Predictive habitat
distribution models




                Levins (1966); Sharpe (1990); Guisan and Zimmermann (2000)
7


 MODEL PROPERTIES

                              COMPLEXITY


        GENERALITY




The more complex model is not necessarily the best…
8


STEPS FOR MODELLING
 1) Conceptual phase


 2) Model formulation


 3) Model calibration


 4) Spatial predictions


 5) Model evaluation


 6) Model applicability
9


STEPS FOR MODELLING




            Guisan and Zimmermann (2000)
10


             1. Conceptual phase
• A theoretical model should be in mind before a statistical
  model is even considered
• This phase includes:
    – Literature review
    – Define an up-to-date conceptual model
    – Set multiple hypotheses
    – Assess available and missing data
    – Identify appropriate sampling strategy for new data
    – Choose appropriate spatio-temporal resolution and geographic
      extent
    – Identify the most appropriate statistical methods for the other
      phases
12


             2. Model formulation
• The model depends on the type of response variable and its
  associated probability distribution


        Distribution             Examples
        Gaussian                 Biomass
        Poisson                  Individual counts
        Negative Binomial        Individual counts
        Multinomial              Communities
        Binomial                 Presence/absence
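
The table lists both the Poisson and the Negative Binomial for individual counts. One common rule of thumb for choosing between them, sketched here as an assumption rather than something stated on the slide, is the dispersion index (sample variance divided by sample mean): a Poisson variable has an index close to 1, while clearly larger values suggest overdispersion. The function names, the threshold of 1.5 and the count data are all hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean of the counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean count is zero")
    return variance(counts) / m


def suggest_count_distribution(counts, threshold=1.5):
    """Suggest 'Poisson' or 'Negative Binomial' from the dispersion index.

    The threshold is an arbitrary illustrative cut-off, not a formal test.
    """
    if dispersion_index(counts) > threshold:
        return "Negative Binomial"
    return "Poisson"


# Hypothetical counts of individuals per haul (invented for illustration).
even_counts = [3, 4, 5, 4, 3, 5, 4, 4]
patchy_counts = [0, 0, 1, 0, 25, 0, 2, 40]

print(suggest_count_distribution(even_counts))
print(suggest_count_distribution(patchy_counts))
```

In practice the choice would normally be checked on the residuals of a fitted model rather than on the raw counts alone.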
13


2. Model formulation




             Guisan and Zimmermann (2000)
14



REGRESSION ANALYSIS   2. Model formulation

[Figure: plot of y against x; x axis from 0 to 10, y axis from 0 to 50. oct-11, © AZTI-Tecnalia]
15



REGRESSION ANALYSIS   2. Model formulation

[Figure: plot of y against x; x axis from 0 to 10, y axis from 0 to 50]
16



REGRESSION ANALYSIS   2. Model formulation

[Figure: plot of y against x; x axis from 0.0 to 1.0, y axis from -5 to 10]
17



REGRESSION ANALYSIS   2. Model formulation

[Figure: plot of y against x; x axis from 0.0 to 1.0, y axis from -5 to 10]
18



REGRESSION ANALYSIS        2. Model formulation

[Figure: model equation with the LINK FUNCTION labelled]

The response variable y can follow distributions such as:
NORMAL, BINOMIAL, POISSON, GAMMA, etc.

McCullagh and Nelder (1989); Dobson (2008)
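
Assuming this slide describes the generalized linear model of McCullagh and Nelder (1989), the link function maps the mean of the response onto the scale of the linear predictor, and its inverse maps back. The sketch below pairs three widely used links with the distributions they usually accompany. The pairing, the `LINKS` table, the `predict_mean` helper and the coefficient values are illustrative assumptions, not taken from the slide.

```python
import math

# Each entry holds (link, inverse link). The link g maps the mean of the
# response to the linear predictor; the inverse link maps it back.
LINKS = {
    # Identity link, commonly used with a Normal response.
    "identity": (lambda mu: mu, lambda eta: eta),
    # Log link, commonly used with a Poisson response.
    "log": (math.log, math.exp),
    # Logit link, commonly used with a Binomial response.
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}


def predict_mean(link_name, intercept, slope, x):
    """Mean response for one explanatory variable x under the chosen link."""
    eta = intercept + slope * x  # linear predictor
    _, inverse_link = LINKS[link_name]
    return inverse_link(eta)


# Hypothetical coefficients: probability of presence at x = 2.
print(predict_mean("logit", -1.0, 0.8, 2.0))
```

With the logit link the prediction always lies between 0 and 1, which is why it suits presence/absence data.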
19



REGRESSION ANALYSIS        2. Model formulation

[Figure: model equation with the LINK FUNCTION and the SMOOTHS labelled]

The response variable y can follow distributions such as:
NORMAL, BINOMIAL, POISSON, GAMMA, etc.

Hastie and Tibshirani (1990); Wood (2006)
20



REGRESSION ANALYSIS      2. Model formulation

                           Linear model                           Additive model
                              (LM)                                    (AM)




                      Generalized linear model              Generalized additive model
                               (GLM)                                 (GAM)
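
The equations on these slides did not survive extraction. Under the usual textbook definitions, the four model classes can be written as below. The notation is an assumption: $E[y]$ is the mean of the response, $x_1, \dots, x_p$ are the explanatory variables, $\beta_j$ are coefficients, $f_j$ are smooth functions and $g$ is the link function.

```latex
\begin{align*}
\text{LM:}  \quad & E[y] = \beta_0 + \sum_{j=1}^{p} \beta_j x_j \\
\text{AM:}  \quad & E[y] = \beta_0 + \sum_{j=1}^{p} f_j(x_j) \\
\text{GLM:} \quad & g\big(E[y]\big) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j \\
\text{GAM:} \quad & g\big(E[y]\big) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
\end{align*}
```

Moving from left to right replaces linear terms with smooths; moving from top to bottom adds a link function and a non-Normal response distribution.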
21



REGRESSION ANALYSIS         2. Model formulation
                      Other regression models:


                      • Mixed models: LMs, GLMs and GAMs that include random-effect
                        terms. Useful for meta-analysis


                      • Quantile regression: the quantiles are modelled instead of
                        the mean. Useful for finding limiting factors


                      • Segmented regression: the model changes depending on a
                        partition of the explanatory variable. Useful for detecting
                        regime changes


                      • Spatial autocorrelation and autoregressive models
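
To make the idea of segmented regression concrete, here is a minimal sketch that searches for a single breakpoint. It is deliberately simplified: each segment is fitted with a constant (its mean) rather than a line, and the breakpoint is the candidate that minimises the total sum of squared errors. The function names and the data are hypothetical.

```python
def sse_about_mean(values):
    """Sum of squared deviations of the values from their own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_breakpoint(x, y, min_points=2):
    """Return (breakpoint, total_sse) for a two-segment piecewise-constant fit.

    Points with x <= breakpoint form the first segment and the remaining
    points form the second. Assumes x is sorted in increasing order.
    """
    best = None
    for i in range(min_points, len(x) - min_points + 1):
        total = sse_about_mean(y[:i]) + sse_about_mean(y[i:])
        if best is None or total < best[1]:
            best = (x[i - 1], total)
    if best is None:
        raise ValueError("not enough points for two segments")
    return best


# Hypothetical series with a shift after x = 5 (invented for illustration).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.0, 2.1, 1.9, 2.0, 2.0, 5.0, 5.1, 4.9, 5.0, 5.0]
breakpoint_x, total_sse = best_breakpoint(x, y)
print(breakpoint_x, total_sse)
```

A full segmented regression would fit a slope in each segment and usually estimate the breakpoint with its uncertainty; the search logic stays the same.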
23


CLASSIFICATION TECHNIQUES           2. Model formulation
                            • Classification is the placement of species and/or sample units
                              into groups based on the environmental variables


                            • Many techniques are included: classification decision trees,
                              regression decision trees, rule-based classification,
                              maximum-likelihood classification


                            • Mainly two groups:
                               – Supervised classification: a training data set is required
                                 (groups are known beforehand)
                               – Unsupervised classification: groups are unknown and need
                                 to be defined, as in cluster analysis
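
As an illustration of supervised classification, the sketch below uses a nearest-centroid rule: the training step computes the mean environmental vector of each known group, and a new sample unit is assigned to the group whose centroid is closest. This technique is not named on the slide and is chosen only because it is short. The variables, group names and values are hypothetical, and the variables are left unscaled for brevity.

```python
import math


def centroids(samples, labels):
    """Mean environmental vector of each known group (the training step)."""
    sums, counts = {}, {}
    for vector, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(vector))
        for k, value in enumerate(vector):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(vector, group_centroids):
    """Assign a new sample to the group with the nearest centroid."""
    return min(group_centroids, key=lambda g: math.dist(vector, group_centroids[g]))


# Hypothetical training data: (temperature in deg C, salinity) per sample unit.
training = [(8.0, 35.0), (9.0, 35.2), (15.0, 33.0), (16.0, 32.8)]
groups = ["cold-oceanic", "cold-oceanic", "warm-coastal", "warm-coastal"]
model = centroids(training, groups)

print(classify((8.5, 35.1), model))
print(classify((15.5, 33.0), model))
```

An unsupervised method would receive only `training`, without `groups`, and would have to define the groups itself.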
25


ENVIRONMENTAL ENVELOPES           2. Model formulation
                          • The environmental envelope of a species is defined as the set
                            of environments within which it is believed that the species can
                            persist (Walker and Cocks, 1991)


                          • Examples of models:


                              – BIOCLIM: minimal rectilinear envelopes based on
                                classification trees
                              – HABITAT: convex polytope envelopes based on
                                classification trees
                              – DOMAIN: based on multivariate distance metrics
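
A rectilinear envelope can be sketched in a few lines: for each environmental variable, take the minimum and maximum observed at presence records, and call a site suitable when all of its values fall inside those bounds. This is a simplified illustration in the spirit of a rectilinear envelope, not the BIOCLIM algorithm itself. The variable names and values are hypothetical.

```python
def fit_envelope(presence_records):
    """Return per-variable (min, max) bounds over the presence records."""
    variables = presence_records[0].keys()
    return {
        name: (
            min(record[name] for record in presence_records),
            max(record[name] for record in presence_records),
        )
        for name in variables
    }


def inside_envelope(site, envelope):
    """True if every variable at the site lies within the fitted bounds."""
    return all(low <= site[name] <= high for name, (low, high) in envelope.items())


# Hypothetical presence records (invented values).
presences = [
    {"temperature": 10.5, "depth": 40.0},
    {"temperature": 12.0, "depth": 85.0},
    {"temperature": 14.2, "depth": 60.0},
]
envelope = fit_envelope(presences)

print(inside_envelope({"temperature": 11.0, "depth": 50.0}, envelope))
print(inside_envelope({"temperature": 16.0, "depth": 50.0}, envelope))
```

Because the bounds come from the extremes, a single outlying presence record widens the envelope, which is one reason percentile-based bounds are often preferred.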
26


ORDINATION TECHNIQUES           2. Model formulation
                        • Ordination is the arrangement or ‘ordering’ of species and/or
                          sample units along gradients


                        • Usually applied to community data matrices (row: species,
                          column: samples, value: abundance)
27


ORDINATION TECHNIQUES           2. Model formulation
                        •   Indirect gradient analysis (no environmental data used)
                             – Distance-based approaches:
                                  • Polar ordination, Principal Coordinates Analysis, Nonmetric
                                    Multidimensional Scaling
                             – Eigenanalysis-based approaches
                                  • Linear model
                                       – Principal Components Analysis
                                  • Unimodal model
                                       – Correspondence Analysis, Detrended Correspondence Analysis
                        •   Direct gradient analysis (environmental data used)
                             – Linear model
                                  • Redundancy Analysis
                             – Unimodal model
                                  • Canonical Correspondence Analysis, Detrended Canonical
                                    Correspondence Analysis


                                                                         ter Braak and Prentice (1988)
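
Distance-based ordinations start from pairwise dissimilarities between samples of a community data matrix. The sketch below builds a small matrix with species in rows and samples in columns, then computes the Bray-Curtis dissimilarity between two samples. Bray-Curtis is an assumption here, since the slides do not name a dissimilarity measure, and the abundances are invented.

```python
# Hypothetical community matrix: rows are species, columns are samples,
# values are abundances (all numbers invented for illustration).
species = ["sp_A", "sp_B", "sp_C"]
abundance = [
    [10, 8, 0],  # sp_A in samples 0, 1, 2
    [5, 7, 1],   # sp_B
    [0, 1, 9],   # sp_C
]


def sample_column(matrix, j):
    """Abundances of all species in sample j."""
    return [row[j] for row in matrix]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples (0 = identical)."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(p - q) for p, q in zip(a, b)) / total


d01 = bray_curtis(sample_column(abundance, 0), sample_column(abundance, 1))
d02 = bray_curtis(sample_column(abundance, 0), sample_column(abundance, 2))
print(d01, d02)
```

Samples 0 and 1 share most of their species and come out far more similar than samples 0 and 2; an ordination would place them close together along the first axis.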
28


NEURAL NETWORKS           2. Model formulation
                  • Models inspired by the human brain (an interconnected group of
                    neurons)


                  • They define a non-linear function that is decomposed as a
                    weighted sum of functions, each of which can be decomposed
                    further in the same way. The result is a complex
                    non-parametric model (a black box?)


                  • Adjusted by varying parameters, connection weights, or
                    specifics of the architecture such as the number of neurons or
                    their connectivity


                  • Few examples are available so far
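
The "weighted sum of functions" structure can be shown with a forward pass through a network with one hidden layer. This is a minimal sketch under assumed choices: tanh hidden neurons, a logistic output, and hand-picked weights that were not fitted to any data. Fitting the network would mean adjusting exactly these weights.

```python
import math


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Output of a network with one hidden layer of tanh neurons.

    Each hidden neuron applies tanh to a weighted sum of the inputs. The
    output is a weighted sum of the hidden activations, passed through a
    logistic function so that it lies between 0 and 1.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    z = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for two inputs and two hidden neurons (not fitted).
hidden_weights = [[0.5, -0.3], [-0.2, 0.8]]
hidden_biases = [0.0, 0.1]
output_weights = [1.2, -0.7]
output_bias = 0.05

suitability = forward([1.0, 2.0], hidden_weights, hidden_biases,
                      output_weights, output_bias)
print(suitability)
```

Changing the number of rows in `hidden_weights` changes the number of neurons, which is one of the architectural choices mentioned above.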
30


             3. Model calibration
• It includes model fitting (finding the values of the unknown
  parameters that give the best agreement between the data and the
  model outputs) and model selection (choosing which explanatory
  variables to include)


• To take into account:
   – Use of predictors that are ecologically relevant: direct vs indirect
     (proxy) variables
   – Correlation between explanatory variables


• Each method has its own diagnostic tools according to its
  assumptions, e.g. the residual deviance in regression models
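
Correlation between explanatory variables can be screened before fitting. The sketch below computes Pearson correlations for every pair of predictors and flags pairs above a cut-off. The threshold of 0.7, the helper names and the data are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(predictors, threshold=0.7):
    """List (name_a, name_b, r) for pairs with |r| at or above the threshold."""
    names = sorted(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(predictors[a], predictors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical predictors measured at five stations (invented values).
predictors = {
    "depth": [20, 50, 80, 120, 200],
    "temperature": [16.0, 14.5, 13.0, 11.0, 9.0],
    "chlorophyll": [1.2, 0.4, 1.5, 0.3, 0.9],
}
flagged = correlated_pairs(predictors)
print(flagged)
```

Here depth and temperature are strongly negatively correlated, so including both would make their separate effects hard to interpret.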
32


             4. Spatial predictions

• Spatial predictions can be made for the data set used for calibration
  or for new data sets. Care must be taken when a new data set contains
  combinations of the explanatory variables that were not present in the
  calibration data set, or values outside the range covered by the
  calibration data set


• GIS tools are very often used, but many statistical models are still
  not implemented in a GIS environment
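
A simple guard against extrapolation is to record the range of each explanatory variable in the calibration data and flag prediction points that fall outside it. The sketch below does this one variable at a time, so it does not detect new combinations of variables that each lie within range. The variable names and values are hypothetical.

```python
def calibration_ranges(calibration_rows):
    """Return the (min, max) of each explanatory variable in the calibration data."""
    names = calibration_rows[0].keys()
    return {
        name: (
            min(row[name] for row in calibration_rows),
            max(row[name] for row in calibration_rows),
        )
        for name in names
    }


def extrapolated_variables(new_row, ranges):
    """Names of the variables whose value lies outside the calibration range."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= new_row[name] <= high
    ]


# Hypothetical calibration data (invented values).
calibration = [
    {"sst": 11.0, "salinity": 34.8},
    {"sst": 14.5, "salinity": 35.4},
    {"sst": 17.0, "salinity": 35.1},
]
ranges = calibration_ranges(calibration)

print(extrapolated_variables({"sst": 19.0, "salinity": 35.0}, ranges))
print(extrapolated_variables({"sst": 12.0, "salinity": 35.0}, ranges))
```

Grid cells flagged this way can be masked or shaded on the prediction map so that extrapolated areas are visible to the reader.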
34


              5. Model evaluation
• The aim is to evaluate the predictive power of a model


• If only one data set is available (the one used for calibration),
  resampling methods can be used: bootstrap, cross-validation, jackknife


• If other data sets are available (independent of the calibration data
  set), predicted and observed values are compared using:
    – the same goodness of fit measure as used for model calibration
    – any other measure of association


    The data sets for calibration and evaluation are called respectively
    training and evaluation data sets. Sometimes the original single
    data set is split in two (split-sample approach)
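
When only one data set is available, cross-validation reuses it by rotating which part is held out. The sketch below generates the index sets for k-fold cross-validation: each observation is used for evaluation exactly once and for training in the other folds. The function name, the seed and the sizes are illustrative assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training_indices, evaluation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducible folds
    folds = [indices[i::k] for i in range(k)]
    for i, evaluation in enumerate(folds):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield sorted(training), sorted(evaluation)


# Ten observations split into five folds of two evaluation points each.
splits = list(k_fold_indices(10, 5))
for training, evaluation in splits:
    print(training, evaluation)
```

The split-sample approach corresponds to taking a single one of these pairs; the jackknife corresponds to setting k equal to the number of observations.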
35


STEPS FOR MODELLING




      APPLICABILITY




               Guisan and Zimmermann (2000)
36


            6. Model applicability
• It refers to the domain over which a validated model can be properly
  used


• Potential uses (Decoursey, 1992):


   – Screening


   – Research


   – Planning, monitoring and assessment
37


         WHAT ABOUT DATA?
• Data is even more important than the model itself.


• Usually from multiple sources: surveys (continuous, stations, vertical
  profiles), remote sensing, circulation models, …


• The scale of the response and the environmental variables might not
  be the same. A common scale unit needs to be defined. Sometimes
  interpolation might be needed, which might introduce additional
  uncertainties


• Simple exploratory statistics and figures can be very useful before
  even starting to think about any model. They also help to spot errors
  in the data.
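
As a small illustration of how exploratory statistics expose data errors, the sketch below summarises a variable and lists the values outside a plausible interval. The plausible range of -2 to 35 deg C and the data, including a 99.9 that mimics a missing-value code, are assumptions made for the example.

```python
from statistics import mean, median


def summarise(values):
    """Basic summary statistics for one variable."""
    return {
        "n": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }


def out_of_range(values, low, high):
    """Indices of the values outside a plausible [low, high] interval."""
    return [i for i, v in enumerate(values) if not low <= v <= high]


# Hypothetical sea surface temperatures in deg C (invented values).
sst = [12.1, 13.4, 12.8, 99.9, 11.9, 13.0]
summary = summarise(sst)
suspect = out_of_range(sst, -2.0, 35.0)

print(summary)
print(suspect)
```

The maximum and the gap between mean and median already hint at a problem before any model is fitted.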

Predictive Habitat Distribution Models, Leire Ibaibarriaga
