Key lecture for the EURO-BASIN Training Workshop on Introduction to Statistical Modelling for Habitat Model Development, 26-28 Oct, AZTI-Tecnalia, Pasaia, Spain (www.euro-basin.eu)
Outline
1 Model validation
2 Performance measures or metrics
Metrics in numeric prediction
Metrics in classification
3 Comparing methodologies and models
4 Examples
5 Weka: open source data mining tool
6 References
Model validation
Introduction
Slides based mainly on Witten and Frank (2005); Pérez et al. (2005); Allen (2009); Fernandes (2011).
Objective: to measure how well a model represents truth.
Truth cannot be accurately measured: we only have observations.
Questions:
How well does the model fit the observations (goodness-of-fit)?
How well does the model forecast new events (generalisation)?
How superior is one model compared to another?
Which is more important, precision or trend?
Answers:
Validation procedures.
Metrics or performance measures.
Statistical tests.
Model prediction (P), observations (O), true state (T)
a) model with no skill
b) ideal model
Reproduced from Stow et al. (2009) and Allen (2009).
Goodness-of-fit vs generalisation
Fitting:
All N cases (N: total number of cases) are used both as training set and as test set.
Chances of over-fitting.
Generalization → train-test split:
The N cases are divided into a training set and a separate test set.
Hold-out (commonly 66%-33% split) (Larson, 1931).
Hold-out depends on how fortunate the train-test split is.
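The hold-out split can be sketched in a few lines of Python. This is only an illustration: the function name `holdout_split`, the seed and the list of case identifiers are assumptions, not part of the original slides.

```python
import random

def holdout_split(cases, train_fraction=0.66, seed=0):
    """Shuffle the cases and split them into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

cases = list(range(100))            # hypothetical case identifiers
train, test = holdout_split(cases)
print(len(train), len(test))        # 66 34
```

Changing `seed` changes which cases land in the test set, which is exactly why a single hold-out estimate depends on how fortunate the split is.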
K-fold cross-validation (CV)
Performance is the average of k models (Lachenbruch and Mickey, 1968; Stone, 1974).
All data is eventually used for testing.
Still sensitive to data split: stratified, repeated (Bouckaert and Frank, 2004).
Reproduced from Pérez et al. (2005).
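A minimal sketch of k-fold cross-validation, assuming a generic `fit` function (builds a model from training cases) and a `score` function (evaluates it on test cases). Both are hypothetical placeholders, and the toy "model" below is just the mean of the training values.

```python
import random

def kfold_indices(n_cases, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every case is tested exactly once."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_idx, test_idx

def cross_validate(cases, fit, score, k=5, seed=0):
    """Average the test score of k models, each fitted on k-1 folds."""
    scores = []
    for train_idx, test_idx in kfold_indices(len(cases), k, seed):
        model = fit([cases[j] for j in train_idx])
        scores.append(score(model, [cases[j] for j in test_idx]))
    return sum(scores) / k

def fit_mean(train):
    return sum(train) / len(train)

def mean_abs_error(model, test):
    return sum(abs(v - model) for v in test) / len(test)

values = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]  # hypothetical data
cv_error = cross_validate(values, fit_mean, mean_abs_error, k=5)
```

Stratification and repetition, mentioned above, would be added on top of this by controlling how `folds` is built and by looping over several seeds.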
Leave-one-out cross-validation (LOOCV)
N: Total number of cases
N models, N-1 cases for training and 1 case for testing (Mosteller and Tukey, 1968).
Suitable for small datasets, more computationally expensive.
Variance of the error is the largest, but less biased.
It can be used for more stable parameters (less variance).
Bootstrapping (0.632 bootstrap)
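A case has a 0.632 probability of being picked for the training set (Efron, 1979).
$\mathrm{error} = 0.632 \cdot e_{\mathrm{test}} + 0.368 \cdot e_{\mathrm{training}}$, where $e_{\mathrm{test}}$ measures generalisation and $e_{\mathrm{training}}$ measures fit.
At least 100 resamplings, some studies suggest 10000.
Reproduced from Pérez et al. (2005).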
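The 0.632 figure and the weighted error can be illustrated with a short Python sketch. The `fit` and `error` callables are hypothetical placeholders, and `e_training` is computed here on the bootstrap sample itself, which is one common reading of the formula rather than a detail confirmed by the slides.

```python
import random

def bootstrap_632(cases, fit, error, n_resamples=100, seed=0):
    """Average 0.632 * e_test + 0.368 * e_training over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(cases)
    estimates = []
    for _ in range(n_resamples):
        picked = [rng.randrange(n) for _ in range(n)]    # sample with replacement
        picked_set = set(picked)
        train = [cases[i] for i in picked]
        test = [cases[i] for i in range(n) if i not in picked_set]
        if not test:             # degenerate resample with nothing left out: skip
            continue
        model = fit(train)
        estimates.append(0.632 * error(model, test) + 0.368 * error(model, train))
    return sum(estimates) / len(estimates)

# Probability that a given case appears at least once in a sample of size n.
n = 1000
p_in_training = 1 - (1 - 1 / n) ** n
print(round(p_in_training, 3))   # 0.632
```

The printed value approaches 1 - 1/e as n grows, which is where the weights 0.632 and 0.368 come from.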
Summarizing
[Figure: two panels, "Accuracy" and "Precision", each comparing the real performance with the estimated performance.]
Increasing data partitions leads to ...
more accurate performance estimation (+).
more variance in the performance estimation, less precise (-).
more computationally expensive (-).
K-fold cross-validation: trade-off (Rodríguez et al., 2010).
Pipeline validation in filter methods
[Diagram: the full dataset is split by 10x5cv (10 repeats of 5-fold cross-validation). In each fold (Fold 1 ... Fold 5) the pipeline Discretize Factors → Factor Selection → Naive Bayes is built on the train set (Train 1 ... Train 5) and evaluated on the matching test set (Test 1 ... Test 5), giving one performance estimation per fold. The fold estimations give the performance estimation of each repeat (Repeat 1 ... Repeat 10), and the 10 repeats average is the whole methodology performance estimation.]
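The key point of the diagram is that every preprocessing step is re-learned inside each training fold. The sketch below illustrates this with a deliberately tiny stand-in pipeline: one numeric factor is discretized at a cut-off learned from the training fold, and a majority-class rule per bin replaces Naive Bayes. All names and the synthetic data are assumptions made for illustration.

```python
import random

def fit_pipeline(train):
    """Learn the cut-off and the per-bin majority class from the TRAIN fold only."""
    xs = sorted(x for x, _ in train)
    cut = xs[len(xs) // 2]                        # discretization cut-off
    labels = [label for _, label in train]
    default = max(set(labels), key=labels.count)  # fallback for an empty bin
    votes = {False: {}, True: {}}
    for x, label in train:
        counts = votes[x > cut]
        counts[label] = counts.get(label, 0) + 1
    rule = {side: (max(c, key=c.get) if c else default) for side, c in votes.items()}
    return cut, rule

def accuracy(model, test):
    cut, rule = model
    return sum(rule[x > cut] == label for x, label in test) / len(test)

def repeated_cv(cases, k=5, repeats=10, seed=0):
    rng = random.Random(seed)
    repeat_scores = []
    for _ in range(repeats):
        idx = list(range(len(cases)))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        fold_scores = []
        for test_idx in folds:
            held_out = set(test_idx)
            train = [cases[j] for j in idx if j not in held_out]
            test = [cases[j] for j in test_idx]
            fold_scores.append(accuracy(fit_pipeline(train), test))
        repeat_scores.append(sum(fold_scores) / k)      # 5 folds average
    return sum(repeat_scores) / repeats                  # 10 repeats average

data_rng = random.Random(1)
cases = [(x, "high" if x > 0.5 else "low")               # synthetic, hypothetical data
         for x in (data_rng.random() for _ in range(60))]
estimate = repeated_cv(cases)
```

Learning the cut-off on the full dataset before splitting would leak test information into the model and make the estimate optimistic.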
Pipeline validation in wrapper methods
[Diagram: model validation uses the same 10x5cv scheme on the full dataset (Train 1 ... Train 5, Test 1 ... Test 5, Repeat 1 ... Repeat 10). Model building inside each train set: the class discretization is chosen by bootstrapping (Bootstrap 1 ... Bootstrap 100, each with Discretize Class, Discretize Predictors, CFS with LOOCV, Naive Bayes and a performance estimation, followed by the class cut-off points evaluation); then: apply selected class discretization, Discretize Predictors, CFS with LOOCV, Naive Bayes. Each test set gives a performance estimation; the 5 folds average is the performance estimation of a repeat, and the 10 repeats average is the whole methodology performance estimation.]
Performance measures
Introduction to metrics
Each metric shows a different property of the model (Holt et al., 2005; Fernandes et al., 2010).
Low vs high:
Lower is better (error)
Higher is better (performance)
Bounds:
Boundless
Between 0 and 1
Between 0 and 100%
Numeric prediction metrics
Where p are the predicted values and a are the actual values.
Mean-squared error: sensitive to outliers → use the mean absolute error instead.
Relative squared error: relative to the mean of the actual values.
Correlation coefficient: bounded between 1 and -1.
Reproduced from Witten and Frank (2005).
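A small sketch of three of these metrics, using the standard textbook definitions (assumed here, since the slide's formula table did not survive extraction). The data are hypothetical and chosen so that a single outlier dominates the squared error.

```python
def mean_squared_error(p, a):
    return sum((pi - ai) ** 2 for pi, ai in zip(p, a)) / len(a)

def mean_absolute_error(p, a):
    return sum(abs(pi - ai) for pi, ai in zip(p, a)) / len(a)

def relative_squared_error(p, a):
    """Squared error relative to always predicting the mean of the actual values."""
    a_mean = sum(a) / len(a)
    return (sum((pi - ai) ** 2 for pi, ai in zip(p, a))
            / sum((ai - a_mean) ** 2 for ai in a))

actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 8.0]    # hypothetical; the last prediction is an outlier

print(mean_squared_error(predicted, actual))       # 4.0
print(mean_absolute_error(predicted, actual))      # 1.0
print(relative_squared_error(predicted, actual))   # 3.2
```

One bad prediction out of four raises the mean-squared error to 4.0 while the mean absolute error is only 1.0. A relative squared error above 1 means the model does worse than predicting the mean.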
Root Mean Squared Error (RMSE)
$\mathrm{RMSE} = \sqrt{\dfrac{\sum (p - a)^2}{n}}$
Goodness of fit between model and observations.
The closer to 0, the better the fit.
If RMSE is greater than the variance of the observations: poor model.
Reproduced from Allen (2009).
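As a hedged illustration, the RMSE and the rule of thumb above can be computed as follows. The data are hypothetical, and the population variance is used for the comparison, which is an assumption.

```python
import math
from statistics import pvariance

def rmse(p, a):
    return math.sqrt(sum((pi - ai) ** 2 for pi, ai in zip(p, a)) / len(a))

actual = [2.0, 4.0, 6.0, 8.0]       # hypothetical observations
predicted = [3.0, 3.0, 7.0, 7.0]    # hypothetical predictions

error = rmse(predicted, actual)
print(error)                         # 1.0
print(error > pvariance(actual))     # False: not flagged as a poor model
```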
Nash-Sutcliffe Model Efficiency (ME)
$ME = 1 - \dfrac{\sum_{n=1}^{N} (a_n - p_n)^2}{\sum_{n=1}^{N} (a_n - \bar{a})^2}$
Based on the ratio of the model error to data variability.
Levels: >0.65 excellent, >0.5 very good, >0.2 good, <0.2 poor (Maréchal, 2004).
Proposed in Nash and Sutcliffe (1970), reproduced from Allen (2009).
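A minimal sketch of the model efficiency, with hypothetical data. The function name `model_efficiency` is an assumption.

```python
def model_efficiency(p, a):
    """Nash-Sutcliffe efficiency: 1 minus model error over data variability."""
    a_mean = sum(a) / len(a)
    model_error = sum((ai - pi) ** 2 for pi, ai in zip(p, a))
    variability = sum((ai - a_mean) ** 2 for ai in a)
    return 1 - model_error / variability

actual = [2.0, 4.0, 6.0, 8.0]       # hypothetical observations
predicted = [3.0, 3.0, 7.0, 7.0]    # hypothetical predictions

me = model_efficiency(predicted, actual)          # 1 - 4/20, about 0.8
me_mean = model_efficiency([5.0] * 4, actual)     # predicting the mean gives 0
```

A value of 1 is a perfect model, 0 is no better than always predicting the mean of the observations, and negative values are worse than that.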
Percentage Model Bias
$P_{bias} = \dfrac{\sum_{n=1}^{N} (a_n - p_n)}{\sum_{n=1}^{N} a_n} \cdot 100$
Sum of model error normalised by the data.
Measure of underestimation or overestimation of observations.
Levels (absolute value): <10 excellent, <20 very good, <40 good, >40 poor (Maréchal, 2004).
Reproduced from Allen (2009).
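A short sketch of the percentage bias with hypothetical data. With the sign convention of the formula (actual minus predicted), a positive value means the model underestimates the observations overall.

```python
def percent_bias(p, a):
    return 100 * sum(ai - pi for pi, ai in zip(p, a)) / sum(a)

actual = [10.0, 20.0, 30.0, 40.0]      # hypothetical observations
predicted = [9.0, 18.0, 27.0, 36.0]    # hypothetical: always 10% too low

print(percent_bias(predicted, actual))   # 10.0
```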
Pearson correlation coefficient (R)
$R = \dfrac{\sum_{n=1}^{N} (a_n - \bar{a})(p_n - \bar{p})}{\sqrt{\sum_{n=1}^{N} (a_n - \bar{a})^2 \sum_{n=1}^{N} (p_n - \bar{p})^2}}$
Quality of fit of a model to observations.
R = 0, no relationship.
R = 1, perfect fit.
Square of the correlation coefficient ($R^2$): percentage of the variability in data accounted for by the model.
Reproduced from Allen (2009).
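The following sketch computes R for hypothetical data in which the predictions follow the trend of the observations perfectly but are strongly biased. It illustrates that R captures trend rather than precision.

```python
import math

def pearson_r(p, a):
    p_mean, a_mean = sum(p) / len(p), sum(a) / len(a)
    cov = sum((ai - a_mean) * (pi - p_mean) for pi, ai in zip(p, a))
    return cov / math.sqrt(sum((ai - a_mean) ** 2 for ai in a)
                           * sum((pi - p_mean) ** 2 for pi in p))

actual = [1.0, 2.0, 3.0, 4.0]                  # hypothetical observations
predicted = [2 * ai + 1 for ai in actual]      # right trend, wrong values

r = pearson_r(predicted, actual)               # 1 up to rounding, despite the bias
```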
Reliability Index (RI)
$RI = \exp \sqrt{\dfrac{1}{N} \sum_{n=1}^{N} \left( \log \dfrac{a_n}{p_n} \right)^2}$
Factor of divergence between predictions and data.
RI = 2 means that predictions diverge from the data, on average, within a multiplicative factor of 2.
The closer RI is to 1, the better.
Reproduced from Allen (2009).
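A sketch of the reliability index with hypothetical data in which every prediction is exactly twice the observation, so the index should be 2. The natural logarithm is assumed, and the formula requires strictly positive values.

```python
import math

def reliability_index(p, a):
    squared_logs = [math.log(ai / pi) ** 2 for pi, ai in zip(p, a)]
    return math.exp(math.sqrt(sum(squared_logs) / len(squared_logs)))

actual = [1.0, 2.0, 4.0]       # hypothetical observations (must be positive)
predicted = [2.0, 4.0, 8.0]    # hypothetical: off by a factor of 2 everywhere

ri = reliability_index(predicted, actual)   # 2 up to rounding
```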
Cost functions
Do all errors have the same weight, cost or implications?
Scaling of differences between p and a.
E.g. RMSE scaled by the variance of data (Holt et al., 2005).
Different cost values depending on the type of error.
Confusion matrix: accuracy and true positive
$\mathrm{Accuracy} = \dfrac{TP + TN}{\#\mathrm{cases}}$
$\text{True Positive Rate} = \dfrac{TP}{TP + FN}$
Higher is better for both.
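As an illustration, both measures can be derived from the four cells of a binary confusion matrix. The labels and predictions below are hypothetical.

```python
def confusion_counts(predicted, actual, positive):
    """Return (TP, TN, FP, FN) for a binary problem."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    tn = sum(p != positive and a != positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def true_positive_rate(tp, tn, fp, fn):
    return tp / (tp + fn)

actual = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]      # hypothetical
predicted = ["yes", "yes", "no", "no", "no", "no", "no", "yes"]   # hypothetical

counts = confusion_counts(predicted, actual, positive="yes")
print(counts)                                  # (2, 4, 1, 1)
print(accuracy(*counts))                       # 0.75
print(round(true_positive_rate(*counts), 3))   # 0.667
```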
Brier Score
(Brier, 1950; van der Gaag et al., 2002; Yeung et al., 2005)
$\text{Brier Score} = \dfrac{1}{\#cases} \sum_{k=1}^{\#cases} \sum_{l=1}^{\#classes} (p_l^k - y_l^k)^2$
$y_l^k = 1$ if class $l$ is the actual class of case $k$; $y_l^k = 0$ otherwise.
Lower is better (contrary to accuracy & true positive rate).
Levels: <0.10 excellent, <0.20 superior, <0.30 adequate, <0.35 acceptable, >0.35 insufficient (Fernandes, 2011).
Example with four cases whose actual class is High:
Case   High (actual)   Medium   Low   Squared error
p^1    0.7             0.2      0.1   (0.7-1)^2 + (0.2-0)^2 + (0.1-0)^2 = 0.14
p^2    0.8             0.1      0.1   (0.8-1)^2 + (0.1-0)^2 + (0.1-0)^2 = 0.06
p^3    0.1             0.5      0.4   (0.1-1)^2 + (0.5-0)^2 + (0.4-0)^2 = 1.22
p^4    0.4             0.5      0.1   (0.4-1)^2 + (0.5-0)^2 + (0.1-0)^2 = 0.62
Brier Score: (0.14 + 0.06 + 1.22 + 0.62) / 4 = 0.51
Normalized Brier Score: 0.51 / 2 = 0.255
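The worked example can be reproduced with a few lines of Python. The function name and the dictionary layout are assumptions made for illustration; the probabilities are the ones from the example.

```python
def brier_score(probabilities, actual_classes):
    """Mean over cases of the squared differences summed over classes."""
    total = 0.0
    for p_case, actual in zip(probabilities, actual_classes):
        for label, p in p_case.items():
            y = 1.0 if label == actual else 0.0
            total += (p - y) ** 2
    return total / len(probabilities)

cases = [
    {"High": 0.7, "Medium": 0.2, "Low": 0.1},
    {"High": 0.8, "Medium": 0.1, "Low": 0.1},
    {"High": 0.1, "Medium": 0.5, "Low": 0.4},
    {"High": 0.4, "Medium": 0.5, "Low": 0.1},
]
actual = ["High", "High", "High", "High"]

score = brier_score(cases, actual)
print(round(score, 2), round(score / 2, 3))   # 0.51 0.255
```

Dividing by 2 normalizes the score to the range 0 to 1, because the worst possible per-case value is 2 (probability 1 on a wrong class).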
Percent Reduction in Error (PRE)
The relevance of a performance gain.
A 2% gain of an already highly accurate classifier (90%) ...
... is more relevant than with low starting accuracy (50%).
$PRE = 100 \cdot \dfrac{EB - EA}{EB}$
EB is the error in the first method (Error Before).
EA is the error in the second method (Error After).
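Applying the formula to the two situations mentioned above (errors in percent, taken as 100 minus accuracy) gives a quick numerical check. The function name is an assumption.

```python
def percent_reduction_in_error(error_before, error_after):
    return 100 * (error_before - error_after) / error_before

# Accuracy 90% -> 92%: error goes from 10 to 8.
print(percent_reduction_in_error(10, 8))    # 20.0
# Accuracy 50% -> 52%: error goes from 50 to 48.
print(percent_reduction_in_error(50, 48))   # 4.0
```

The same 2-point gain removes a fifth of the remaining error in the first case and only a twenty-fifth in the second.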
Accuracy paradox
Mainly with unbalanced datasets (Zhu and Davidson, 2007; Abma, 2009).
Reproduced from Wikipedia (2011).
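A hypothetical numerical illustration of the paradox (the counts below are invented for this sketch and are not the figures from the original slide): on a dataset with 10 positive and 990 negative cases, a classifier that never predicts the positive class beats a useful one on accuracy.

```python
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def true_positive_rate(tp, tn, fp, fn):
    return tp / (tp + fn)

# Hypothetical confusion matrices for 10 positives and 990 negatives.
always_negative = dict(tp=0, tn=990, fp=0, fn=10)
useful = dict(tp=8, tn=960, fp=30, fn=2)

print(accuracy(**always_negative), true_positive_rate(**always_negative))  # 0.99 0.0
print(accuracy(**useful), true_positive_rate(**useful))                    # 0.968 0.8
```

The trivial classifier has the higher accuracy although it never detects a positive case, which is why accuracy alone can mislead on unbalanced data.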
Minimum Description Length (MDL) principle
KISS rule: Keep It Simple ... Occam's Razor:
The simplest explanation is the most likely to be true ...
... and is more easily accepted by others ...
... but it is not necessarily the truth.
The more a sequence of data can be compressed, ...
... the more regularity has been detected in the data:
MDL: Minimum Description Length (Rissanen, 1978).
Trade-off between performance and complexity.
Is MDL false? Domingos (1999); Grünwald et al. (2005).
Trade-off between mechanism and robust parameters.
If two models have the same performance then keep the simplest.
Example complex vs simple
Lift chart, ROC curve, recall-precision curve
Model comparison
Corrected paired t-test
Statistical comparisons of the performance.
Ideal: test over several datasets of size N.
Null hypothesis that the mean difference is zero. Errors:
Type I: probability that the test rejects the null hypothesis incorrectly.
Type II: probability that the null hypothesis is not rejected when there is a difference.
Reality: only one dataset of size N to get all estimates.
Problem: Type I errors exceed the significance level.
Solution: heuristic versions of the t-test (Nadeau and Bengio, 2003; McCluskey and Lalkhen, 2007; Kotsiantis, 2007; Fernandes, 2011).
Comparing MULTIPLE methods over ONE dataset.
Comparing ONE method over MULTIPLE datasets.
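One widely used heuristic is the corrected resampled t-statistic of Nadeau and Bengio (2003), which inflates the variance term by the ratio of test to training set size. The sketch below assumes this is the correction meant on the slide. The performance differences are hypothetical, and only the statistic is computed; it would then be compared with a Student t distribution with J - 1 degrees of freedom.

```python
import math
from statistics import mean, variance

def standard_t(differences):
    """Ordinary paired t statistic over J resampling runs."""
    j = len(differences)
    return mean(differences) / math.sqrt(variance(differences) / j)

def corrected_resampled_t(differences, n_train, n_test):
    """t statistic with the variance term inflated by n_test / n_train."""
    j = len(differences)
    scale = 1 / j + n_test / n_train
    return mean(differences) / math.sqrt(scale * variance(differences))

# Hypothetical accuracy differences between two methods over 10 runs.
diffs = [0.02, 0.01, 0.03, 0.00, 0.02, 0.01, 0.04, 0.02, 0.01, 0.02]

t_std = standard_t(diffs)
t_corr = corrected_resampled_t(diffs, n_train=80, n_test=20)
```

The corrected statistic is always smaller in magnitude than the standard one, so it rejects the null hypothesis less often and keeps Type I errors closer to the nominal significance level.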
Critical difference diagrams
Proposed by Demšar (2006).
Revised Friedman plus Shaffer's static post-hoc test (García and Herrera, 2008).
Comparing MULTIPLE methods over MULTIPLE datasets.
Shows the average rank of each method's superiority across datasets.
No significant difference: line connecting methods.
More datasets: easier to find significant differences.
Taylor diagrams
$E'^2 = \sigma_f^2 + \sigma_r^2 - 2 \sigma_f \sigma_r R$ ; $c^2 = a^2 + b^2 - 2ab \cos \phi$ (law of cosines)
Simultaneously: RMS difference, correlation and std. dev.
R: correlation between p and a; E': RMS difference; $\sigma_f^2$, $\sigma_r^2$: variances of p and a.
Proposed in Taylor (2001), reproduced from Allen (2009).
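The relation behind the diagram can be checked numerically. In this sketch E' is computed as the centred RMS difference (means removed), which is the form for which the identity holds exactly. The data are hypothetical, and population standard deviations are assumed.

```python
import math
from statistics import mean, pstdev

def taylor_statistics(p, a):
    """Return (sigma_f, sigma_r, R, E') for predictions p and observations a."""
    p_mean, a_mean = mean(p), mean(a)
    sigma_f, sigma_r = pstdev(p), pstdev(a)
    n = len(a)
    r = (sum((pi - p_mean) * (ai - a_mean) for pi, ai in zip(p, a))
         / (n * sigma_f * sigma_r))
    e_prime = math.sqrt(
        sum(((pi - p_mean) - (ai - a_mean)) ** 2 for pi, ai in zip(p, a)) / n)
    return sigma_f, sigma_r, r, e_prime

actual = [1.0, 3.0, 2.0, 5.0, 4.0]       # hypothetical observations
predicted = [1.5, 2.5, 2.0, 4.0, 5.0]    # hypothetical predictions

sigma_f, sigma_r, r, e_prime = taylor_statistics(predicted, actual)
lhs = e_prime ** 2
rhs = sigma_f ** 2 + sigma_r ** 2 - 2 * sigma_f * sigma_r * r
```

Because the identity has the same form as the law of cosines, the three statistics can be drawn as the sides and angle of a triangle, which is what allows a single point on the diagram to encode all of them.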
Target diagrams
RMSE in X-axis; Bias in Y-axis.
Std. dev. of p larger than that of a (X>0) or not; Bias positive (Y>0) or not.
Reproduced from Jolliff et al. (2009) and Allen (2009).
Multivariate approaches
Uni-variate and multi-variate metrics summarize model skill.
Multi-variate approaches: simultaneous examination of how several variables vary relative to each other in space and time.
Principal Component Analysis (PCA) (Jolliffe, 2002).
Shows the relationship between several variables in 2D space.
Multi Dimensional Scaling (MDS) (Borg and Groenen, 2005).
Explores similarities or dissimilarities in data.
Self Organizing Maps (SOM) (Kohonen and Maps, 2001).
Produce a low-dimensional discretized representation of the observations.
Examples
Zooplankton biomass models
Several model fits with squared error.
Reproduced from Irigoien et al. (2009).
An example of anchovy recruitment
Performance reported depending on validation scheme.
Reproduced from Fernandes et al. (2010).
Phytoplankton classification
Without (Table III) and with (Table II) statistical differences (corrected paired t-test).
Reproduced from Zarauz et al. (2009) and Zarauz et al. (2008).
Zooplankton classification
Reproduced from Fernandes et al. (2009).
Weka: open source data mining tool
Weka explorer
Weka experimenter
Weka knowledge flow
References
Abma, B. (2009). Evaluation of requirements management tools with support for traceability-based change impact analysis. PhD thesis, University of Twente, Enschede, The Netherlands.
Allen, J. (2009). D2.7 user guide and report outlining validation methodology. Deliverable in project Marine Ecosystem Evolution in a Changing Environment (MEECE).
Borg, I. and Groenen, P. (2005). Modern multidimensional scaling: Theory and applications. Springer Verlag.
Bouckaert, R. R. and Frank, E. (2004). Evaluating the replicability of significance tests for comparing learning algorithms. Lect. Notes Artif. Int., pages 3–12.
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Month. Weather Rev., 78(1):1–3.
Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res., 7:1–30.
Domingos, P. (1999). The role of Occam's razor in knowledge discovery. Data Min. Knowl. Disc., 3(4):409–425.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Stat., 7(1):1–26.
Fernandes, J. (2011). Data analysis advances in marine science for fisheries management: Supervised classification applications. PhD thesis, University of the Basque Country, San Sebastian, Guipuzkoa, Spain.
Fernandes, J. A., Irigoien, X., Boyra, G., Lozano, J. A., and Inza, I. (2009). Optimizing the number of classes in automated zooplankton classification. J. Plankton Res., 31(1):19–29.
Fernandes, J. A., Irigoien, X., Goikoetxea, N., Lozano, J. A., Inza, I., Pérez, A., and Bode, A. (2010). Fish recruitment prediction, using robust supervised classification methods. Ecol. Model., 221(2):338–352.
Francis, R. I. C. (2006). Measuring the strength of environment-recruitment relationships: the importance of including predictor screening within cross-validations. ICES J. Mar. Sci., 63(4):594.
García, S. and Herrera, F. (2008). An extension on 'statistical comparisons of classifiers over multiple data sets' for all pairwise comparisons. J. Mach. Learn. Res., 9:2677–2694.
Grünwald, P., Myung, I., and Pitt, M. (2005). Advances in minimum description length: Theory and applications. The MIT Press.
Holt, J., Allen, J., Proctor, R., and Gilbert, F. (2005). Error quantification of a high-resolution coupled hydrodynamic-ecosystem coastal-ocean model: Part 1 model overview and assessment of the hydrodynamics. Journal of Marine Systems, 57(1-2):167–188.
Irigoien, X., Fernandes, J., Grosjean, P., Denis, K., Albaina, A., and Santos, M. (2009). Spring zooplankton distribution in the Bay of Biscay from 1998 to 2006 in relation with anchovy recruitment. J. Plankton Res., 31(1):1–17.
Jolliff, J., Kindle, J., Shulman, I., Penta, B., Friedrichs, M., Helber, R., and Arnone, R. (2009). Summary diagrams for coupled hydrodynamic-ecosystem model skill assessment. Journal of Marine Systems, 76(1-2):64–82.
Jolliffe, I. (2002). Principal component analysis. Encyclopedia of Statistics in Behavioral Science.
Kohonen, T. and Maps, S. (2001). Springer series in information sciences. New York, New York.
Kotsiantis, S. (2007). Supervised Machine Learning: A Review of Classification Techniques. Inform., 31:249–268.
Lachenbruch, P. and Mickey, M. (1968). Estimation of error rates in discriminant analysis. Technometrics, pages 1–11.
Larson, S. C. (1931). The shrinkage of the coefficient of multiple correlation. J. Educ. Psychol., 22(1):45–55.
Maréchal, D. (2004). A soil-based approach to rainfall-runoff modelling in ungauged catchments for England and Wales. PhD thesis, Cranfield University, Cranfield, UK.
McCluskey, A. and Lalkhen, A. G. (2007). Statistics IV: Interpreting the results of statistical tests. Continuing Education in Anaesthesia, Critical Care & Pain, 7(6):208–212.
Mosteller, F. and Tukey, J. F. (1968). Data Analysis, Including Statistics. In G. Lindzey and E. Aronson, editors, Handbook of Social Psychology, Vol. II. Addison-Wesley, Reading, MA, USA.
Nadeau, C. and Bengio, Y. (2003). Inference for the generalization error. Mach. Learn., 52(3):239–281.
Nash, J. and Sutcliffe, J. (1970). River flow forecasting through conceptual models part I: a discussion of principles. Journal of Hydrology, 10(3):282–290.
Pérez, A., Larrañaga, P., and I., I. (2005). Estimar, descomponer y comparar el error de mala clasificación. In Primer Congreso Español de Informática.
Rissanen, J. (1978). Modeling by the shortest data description. Automatica, 14:465–471.
Rodríguez, J. D., Pérez, A., and Lozano, J. A. (2010). Sensitivity analysis of k-fold cross-validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell., 32(3):569–575.
Schirripa, M. J. and Colbert, J. J. (2006). Interannual changes in sablefish (Anoplopoma fimbria) recruitment in relation to oceanographic conditions within the California Current System. Fish. Oceanogr., 15(1):25–36.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. J. Roy. Statistical Society, Series B, 36.
Stow, C., Jolliff, J., McGillicuddy Jr, D., Doney, S., Allen, J., Friedrichs, M., and Rose, K. (2009). Skill assessment for coupled biological/physical models of marine systems. Journal of Marine Systems, 76(1-2):4–15.
Taylor, K. (2001). Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res., 106(D7):7183–7192.
van der Gaag, L. C., Renooij, S., Witteman, C. L. M., Aleman, B. M. P., and Taal, B. G. (2002). Probabilities for a probabilistic network: a case study in oesophageal cancer. Artif. Intell. Med., 25(2):123–148.
Wikipedia (2011). Accuracy paradox. [Online; accessed 15-September-2011].
Witten, I. H. and Frank, E. (2005). Data mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA, USA.
Yeung, K. Y., Bumgarner, R. E., and Raftery, A. E. (2005). Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics, 21(10):2394–2402.
Zarauz, L., Irigoien, X., and Fernandes, J. (2008). Modelling the influence of abiotic and biotic factors on plankton distribution in the Bay of Biscay, during three consecutive years (2004-06). J. Plankton Res., 30(8):857.
Zarauz, L., Irigoien, X., and Fernandes, J. (2009). Changes in plankton size structure and composition, during the generation of a phytoplankton bloom, in the central Cantabrian sea. J. Plankton Res., 31(2):193–207.
Zhu, X. and Davidson, I. (2007). Knowledge discovery and data mining: challenges and realities. IGI Global.