Model Validation, performance measures, models comparison and Weka (open source software for data mining) by JA Fernandes

EURO-BASIN, www.euro-basin.eu
EURO-BASIN Training Workshop on Introduction to Statistical Modelling Tools for Habitat Models Development, 26-28th Oct 2011

Outline

1 Model validation

2 Performance measures or metrics
Metrics in numeric prediction
Metrics in classification

3 Comparing methodologies and models

4 Examples

5 Weka: open source data mining tool

6 References

Model validation


Introduction

Slides based mainly on Witten and Frank (2005); Pérez et al. (2005); Allen (2009); Fernandes (2011).
Objective: to measure how well a model represents the truth.
The truth cannot be measured accurately: only observations are available.

Questions:
How well does the model fit the observations (goodness-of-fit)?
How well does the model forecast new events (generalisation)?
How superior is one model compared to another?
Which is more important, precision or trend?

Answers:
Validation procedures.
Metrics or performance measures.
Statistical tests.


Model prediction (P), observations (O), true state (T)

Figure: (a) a model with no skill; (b) an ideal model.
Reproduced from Stow et al. (2009) and Allen (2009).


Goodness-of-fit vs generalisation

Fitting:
N: total number of cases.
The training-set and the test-set are the same N cases.
Chances of over-fitting.

Generalization → train-test split:
The N cases are divided into a training-set and a separate test-set.
Hold-out (commonly 66%-33% split) (Larson, 1931).
Hold-out depends on how fortunate the train-test split is.
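
A minimal sketch of a hold-out split, assuming the cases fit in a Python list. The function name `hold_out_split` and its parameters are hypothetical, chosen only for illustration.

```python
import random

def hold_out_split(cases, train_fraction=0.66, seed=0):
    """Shuffle the cases and split them into a training set and a test set."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# 100 cases with the common 66%-33% split.
train, test = hold_out_split(range(100))
print(len(train), len(test))  # 66 34
```

Changing `seed` gives a different split, which is why a single hold-out estimate depends on how fortunate the split is.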


K-fold cross-validation (CV)

Performance is the average of k models (Lachenbruch and Mickey, 1968; Stone, 1974).
All data is eventually used for testing.
Still sensitive to the data split: stratified, repeated (Bouckaert and Frank, 2004).
Reproduced from Pérez et al. (2005).
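
The k-fold procedure can be sketched as follows. This is an illustrative, non-stratified version; `k_fold_indices`, `cross_validate` and the toy `fit` and `score` functions are hypothetical names, not part of any library.

```python
import random

def k_fold_indices(n_cases, k=5, seed=0):
    """Return k (train_idx, test_idx) pairs; every case is tested exactly once."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
            for i in range(k)]

def cross_validate(cases, fit, score, k=5, seed=0):
    """Average the score of k models, each tested on its held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(cases), k, seed):
        model = fit([cases[i] for i in train_idx])
        scores.append(score(model, [cases[i] for i in test_idx]))
    return sum(scores) / k

# Toy usage: the "model" is just the mean of the training targets,
# scored with the mean absolute error on the held-out fold.
cases = [(x, 2.0 * x) for x in range(20)]
fit = lambda train: sum(y for _, y in train) / len(train)
score = lambda model, test: sum(abs(y - model) for _, y in test) / len(test)
print(cross_validate(cases, fit, score, k=5))
```

Setting `k` to the number of cases turns the same code into leave-one-out cross-validation.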


Leave-one-out cross-validation (LOOCV)

N models, N-1 cases for training and 1 case for testing (Mosteller and Tukey, 1968).
Suitable for small datasets, more computationally expensive.
Variance of the error is the largest, but less biased.
It can be used for more stable parameters (less variance).


Bootstrapping (0.632 bootstrap)

A case has a 0.632 probability of being picked for the training-set (Efron, 1979).
$error = 0.632 \cdot e_{test} + 0.368 \cdot e_{training}$, where $e_{test}$ measures generalisation and $e_{training}$ measures fit.
At least 100 resamplings, some studies suggest 10000.
Reproduced from Pérez et al. (2005).
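
A minimal sketch of the 0.632 bootstrap estimate, assuming the training error is measured on the bootstrap sample itself. The names `bootstrap_632`, `fit` and `error` are hypothetical and the toy model is only a placeholder.

```python
import random

def bootstrap_632(cases, fit, error, n_resamples=100, seed=0):
    """Average 0.632 * e_test + 0.368 * e_training over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(cases)
    estimates = []
    for _ in range(n_resamples):
        picked = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        picked_set = set(picked)
        train = [cases[i] for i in picked]
        test = [cases[i] for i in range(n) if i not in picked_set]
        if not test:  # every case was picked: no test cases in this resample
            continue
        model = fit(train)
        estimates.append(0.632 * error(model, test) + 0.368 * error(model, train))
    return sum(estimates) / len(estimates)

# Probability that a case is picked at least once in n draws: 1 - (1 - 1/n)^n.
p_in_train = 1 - (1 - 1 / 1000) ** 1000
print(round(p_in_train, 3))  # 0.632

# Toy usage: mean predictor scored with the mean absolute error.
cases = [(x, 2.0 * x) for x in range(20)]
fit = lambda train: sum(y for _, y in train) / len(train)
error = lambda model, data: sum(abs(y - model) for _, y in data) / len(data)
print(bootstrap_632(cases, fit, error))
```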


Summarizing

Figure: accuracy and precision of the estimated performance with respect to the real performance.

Increasing data partitions leads to ...
more accurate performance estimation (+).
more variance in the performance estimation, less precise (-).
more computationally expensive (-).
K-fold cross-validation: trade-off (Rodríguez et al., 2010).


Pipeline validation in filter methods

Figure: the whole pipeline (discretize factors, factor selection, Naive Bayes) is validated with 10x5cv on the full dataset. In each repeat (Repeat 1 to Repeat 10), the pipeline is rebuilt on each training fold (Train 1 to Train 5) and evaluated on the matching test fold (Test 1 to Test 5), giving a performance estimation per fold. The whole-methodology performance estimation is the 10 repeats average.
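
A sketch of validating a whole pipeline with 10x5cv, where every step is refitted on the training fold only. The toy pipeline below (a median cut-off followed by a majority vote per bin) is a hypothetical stand-in for the discretization, factor selection and Naive Bayes steps; it is not the method used in the slides.

```python
import random
from statistics import median

def fit_pipeline(train):
    """Toy pipeline fitted on training cases only."""
    cut = median(x for x, _ in train)       # cut-off learned from the training fold
    votes = {False: [0, 0], True: [0, 0]}   # class counts per discretized bin
    for x, y in train:
        votes[x > cut][y] += 1
    rule = {b: int(v[1] > v[0]) for b, v in votes.items()}  # majority class per bin
    return cut, rule

def accuracy(model, test):
    cut, rule = model
    return sum(rule[x > cut] == y for x, y in test) / len(test)

def repeated_cv(cases, repeats=10, k=5, seed=0):
    """Average over the repeats of the k-fold average accuracy."""
    rng = random.Random(seed)
    repeat_means = []
    for _ in range(repeats):
        idx = list(range(len(cases)))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        fold_scores = []
        for i in range(k):
            held_out = set(folds[i])
            train = [cases[j] for j in idx if j not in held_out]
            test = [cases[j] for j in folds[i]]
            # The pipeline is refitted from scratch on each training fold.
            fold_scores.append(accuracy(fit_pipeline(train), test))
        repeat_means.append(sum(fold_scores) / k)
    return sum(repeat_means) / repeats

cases = [(x, int(x >= 10)) for x in range(20)]  # toy data: (predictor, class)
estimate = repeated_cv(cases)
print(estimate)
```

Fitting the cut-off on the full dataset before splitting would leak test information into training, which is the situation this validation scheme is meant to avoid.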


Pipeline validation in wrapper methods

Figure: two-stage validation. Model validation: in each of 100 bootstrap resamples (Bootstrap 1 to Bootstrap 100), the class and the predictors are discretized and CFS with LOOCV is run on Train for model building, and Naive Bayes performance is estimated on Test; this gives the class cut-off points evaluation. The selected class discretization is then applied and the full dataset is validated with 10x5cv: in each fold, the predictors are discretized and CFS with LOOCV is run on Train, and Naive Bayes performance is estimated on Test. Each repeat gives a 5 folds average, and the whole-methodology performance estimation is the 10 repeats average.

Performance measures


Introduction to metrics

Each metric shows a different property of the model (Holt et al., 2005; Fernandes et al., 2010).
Low vs high:
Lower is better (error)
Higher is better (performance)
Bounds:
Boundless
Between 0 and 1
Between 0 and 100%




Numeric prediction metrics

Where p are the predicted values and a are the actual values.
Mean-squared error: sensitive to outliers → mean absolute error.
Relative squared error: relative to the mean of actual values.
Correlation coefficient: bounded between 1 and -1.
Reproduced from Witten and Frank (2005).


Root Mean Squared Error (RMSE)

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{n=1}^{N} (p_n - a_n)^2}{N}}$$

Goodness of fit between model and observations.
The closer to 0, the better the fit.
If RMSE is greater than the variance of the observations: poor model.
Reproduced from Allen (2009).
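
A short sketch of the RMSE computation, with predictions p and actual values a given as plain lists (the function name `rmse` is only illustrative):

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error between predictions p and actual values a."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

value = rmse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])
print(round(value, 3))  # sqrt((1 + 0 + 4) / 3) = 1.291
```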


Nash-Sutcliffe Model Efficiency (ME)

$$\mathrm{ME} = 1 - \frac{\sum_{n=1}^{N} (a_n - p_n)^2}{\sum_{n=1}^{N} (a_n - \bar{a})^2}$$

Ratio of the model error to data variability.
Levels: >0.65 excellent, >0.5 very good, >0.2 good, <0.2 poor (Maréchal, 2004).
Proposed in Nash and Sutcliffe (1970), reproduced from Allen (2009).
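
As an illustration, the model efficiency can be computed as below (hypothetical function name). A perfect model gives 1, and a model that always predicts the mean of the observations gives 0.

```python
def model_efficiency(predicted, actual):
    """Nash-Sutcliffe efficiency: 1 - model error / data variability."""
    mean_a = sum(actual) / len(actual)
    error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    variability = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - error / variability

actual = [1.0, 2.0, 3.0, 4.0]
print(model_efficiency(actual, actual))     # 1.0 (perfect model)
print(model_efficiency([2.5] * 4, actual))  # 0.0 (no better than the mean)
```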


Percentage Model Bias

$$P_{bias} = \frac{\sum_{n=1}^{N} (a_n - p_n)}{\sum_{n=1}^{N} a_n} \cdot 100$$

Sum of model error normalised by the data.
Measure of underestimation or overestimation of observations.
Levels: <10 excellent, <20 very good, <40 good, >40 poor (Maréchal, 2004).
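
A small sketch of the percentage bias (illustrative function name). With the sign convention a - p, a positive value means the model underestimates the observations.

```python
def percentage_bias(predicted, actual):
    """Sum of (actual - predicted), normalised by the sum of the actual values."""
    return 100.0 * sum(a - p for a, p in zip(actual, predicted)) / sum(actual)

pbias = percentage_bias([9.0, 18.0, 27.0], [10.0, 20.0, 30.0])
print(pbias)  # 10.0: predictions are 10% below the observations
```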


Pearson correlation coefficient (R)

$$R = \frac{\sum_{n=1}^{N} (a_n - \bar{a})(p_n - \bar{p})}{\sqrt{\sum_{n=1}^{N} (a_n - \bar{a})^2 \sum_{n=1}^{N} (p_n - \bar{p})^2}}$$

Quality of fit of a model to observations.
R = 0, no relationship.
R = 1, perfect fit.
Square of the correlation coefficient ($R^2$): percentage of the variability in the data accounted for by the model.
Reproduced from Allen (2009).
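
The correlation can be sketched as follows (illustrative function name). The toy example uses predictions that follow the trend of the observations exactly but are biased, so R is 1 even though the predictions are not precise.

```python
from math import sqrt

def pearson_r(predicted, actual):
    """Pearson correlation between predictions p and actual values a."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)

actual = [1.0, 2.0, 3.0, 4.0]
biased = [2.0 * a + 1.0 for a in actual]  # right trend, wrong values
r = pearson_r(biased, actual)
print(round(r, 6))  # 1.0
```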


Reliability Index (RI)

$$\mathrm{RI} = \exp \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( \log \frac{a_n}{p_n} \right)^2}$$

Factor of divergence between predictions and data.
RI = 2 means that predictions diverge from the data, on average, by a multiplicative factor of 2.
The closer RI is to 1, the better.
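
A sketch of the reliability index, assuming strictly positive predictions and observations so that the logarithm is defined (illustrative function name):

```python
from math import exp, log, sqrt

def reliability_index(predicted, actual):
    """exp of the root mean squared log ratio between actual and predicted."""
    n = len(actual)
    return exp(sqrt(sum(log(a / p) ** 2 for a, p in zip(actual, predicted)) / n))

# Every prediction is twice the observation, so RI is 2 (up to rounding).
ri = reliability_index([2.0, 4.0, 8.0], [1.0, 2.0, 4.0])
print(round(ri, 6))  # 2.0
```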


Cost functions
Do all errors have the same weight, cost or implications?
Scaling of differences between p and a.
E.g. RMSE scaled by the variance of data (Holt et al., 2005).
Different cost values depending on the type of error.


Confusion matrix: accuracy and true positive

$$\mathrm{Accuracy} = \frac{TP + TN}{\#cases}$$

$$\mathrm{True\ Positive\ Rate} = \frac{TP}{TP + FN}$$

Higher is better for both.
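
Both measures follow directly from the four cells of a two-class confusion matrix. A minimal sketch with made-up counts (function names are illustrative):

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def true_positive_rate(tp, fn):
    """Fraction of the actual positives that are classified as positive."""
    return tp / (tp + fn)

# Hypothetical counts: 100 cases, 50 of them actual positives.
print(accuracy(tp=40, tn=45, fp=5, fn=10))  # 0.85
print(true_positive_rate(tp=40, fn=10))     # 0.8
```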


Brier Score
(Brier, 1950; van der Gaag et al., 2002; Yeung et al., 2005)

$$\mathrm{Brier\ Score} = \frac{1}{\#cases} \sum_{k=1}^{\#cases} \sum_{l=1}^{\#classes} (p_l^k - y_l^k)^2$$

$y_l^k = 1$ if class l is the actual class of case k; $y_l^k = 0$ otherwise.
Lower is better (contrary to accuracy & true positive rate).
Levels: <0.10 excellent, <0.20 superior, <0.30 adequate, <0.35 acceptable, >0.35 insufficient (Fernandes, 2011).

Example with three classes, where the actual class of every case is High:

Case   High   Medium   Low    Squared error
p1     0.7    0.2      0.1    (0.7-1)^2 + (0.2-0)^2 + (0.1-0)^2 = 0.14
p2     0.8    0.1      0.1    (0.8-1)^2 + (0.1-0)^2 + (0.1-0)^2 = 0.06
p3     0.1    0.5      0.4    (0.1-1)^2 + (0.5-0)^2 + (0.4-0)^2 = 1.22
p4     0.4    0.5      0.1    (0.4-1)^2 + (0.5-0)^2 + (0.1-0)^2 = 0.62

Brier Score: (0.14 + 0.06 + 1.22 + 0.62) / 4 = 0.51
Normalized Brier Score: 0.51 / 2 = 0.255
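
The worked example can be checked with a few lines of code. This is a sketch under the assumption that each prediction is a dictionary of class probabilities; the function name `brier_score` is illustrative.

```python
def brier_score(probabilities, actual_classes):
    """Mean over cases of the squared differences between p and the 0/1 truth."""
    total = 0.0
    for probs, actual in zip(probabilities, actual_classes):
        total += sum((p - (1.0 if cls == actual else 0.0)) ** 2
                     for cls, p in probs.items())
    return total / len(probabilities)

predictions = [
    {"High": 0.7, "Medium": 0.2, "Low": 0.1},
    {"High": 0.8, "Medium": 0.1, "Low": 0.1},
    {"High": 0.1, "Medium": 0.5, "Low": 0.4},
    {"High": 0.4, "Medium": 0.5, "Low": 0.1},
]
actual = ["High"] * 4  # the actual class is High in all four cases

bs = brier_score(predictions, actual)
print(round(bs, 2), round(bs / 2, 3))  # 0.51 0.255
```

Dividing by 2 bounds the score between 0 and 1, since the worst possible squared error for one case is 2.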


Percent Reduction in Error (PRE)

The relevance of a performance gain.
A 2% gain of an already highly accurate classifier (90%) ...
... is more relevant than with a low starting accuracy (50%).

$$\mathrm{PRE} = 100 \cdot \frac{EB - EA}{EB}$$

EB is the error in the first method (Error Before).
EA is the error in the second method (Error After).
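
The 90% versus 50% comparison can be made concrete with a small sketch, where errors are expressed as 100 minus the accuracy (the function name is illustrative):

```python
def percent_reduction_in_error(error_before, error_after):
    """PRE = 100 * (EB - EA) / EB."""
    return 100.0 * (error_before - error_after) / error_before

print(percent_reduction_in_error(10.0, 8.0))   # accuracy 90% -> 92%: PRE = 20.0
print(percent_reduction_in_error(50.0, 48.0))  # accuracy 50% -> 52%: PRE = 4.0
```

The same 2% gain removes a fifth of the remaining error in the first case, but only a twenty-fifth in the second.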


Accuracy paradox

Mainly with unbalanced datasets (Zhu and Davidson, 2007; Abma, 2009).
Reproduced from Wikipedia (2011).


Minimum Description Length (MDL) principle

Kiss rule: Keep It Simple ... Occam's Razor:
The simplest explanation is the most likely to be true ...
... and is more easily accepted by others ...
... but it is not necessarily the truth.

The more a sequence of data can be compressed, ...
... the more regularity has been detected in the data:
MDL: Minimum Description Length (Rissanen, 1978).

Trade-off between performance and complexity.
Is MDL false? Domingos (1999); Grünwald et al. (2005).
Trade-off between mechanism and robust parameters.
If two models have the same performance, then keep the simplest.


Example complex vs simple


Lift chart, ROC curve, recall-precision curve

Model comparison


Corrected paired t-test
Statistical comparisons of the performance.
Ideal: test over several datasets of size N.

Null hypothesis that the mean difference is zero. Errors:
Type I: prob. that the test rejects the null hypothesis incorrectly.
Type II: prob. that the null hypothesis is not rejected when there is a difference.
Reality: only one dataset of size N to get all estimates.
Problem: Type I errors exceed the significance level.
Solution: heuristic versions of the t-test.

(Nadeau and Bengio, 2003; McCluskey and Lalkhen, 2007; Kotsiantis, 2007; Fernandes, 2011)
Comparing MULTIPLE methods over ONE dataset.
Comparing ONE method over MULTIPLE datasets.
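
One common heuristic version is the corrected resampled t-test of Nadeau and Bengio (2003), which inflates the variance term by the test/train size ratio. The sketch below computes only the t statistic (to be compared against a t distribution with J-1 degrees of freedom); the function names and the performance differences are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def standard_t(differences):
    """Ordinary paired t statistic over J performance differences."""
    j = len(differences)
    return mean(differences) / sqrt(variance(differences) / j)

def corrected_resampled_t(differences, n_train, n_test):
    """Corrected statistic: variance scaled by (1/J + n_test/n_train)."""
    j = len(differences)
    correction = 1.0 / j + n_test / n_train
    return mean(differences) / sqrt(correction * variance(differences))

# Hypothetical accuracy differences between two methods over 10 resamplings,
# each with 80 training cases and 20 test cases.
diffs = [0.02, 0.03, 0.01, 0.04, 0.02, 0.03, 0.00, 0.05, 0.02, 0.03]
t_standard = standard_t(diffs)
t_corrected = corrected_resampled_t(diffs, n_train=80, n_test=20)
print(t_standard, t_corrected)
```

The corrected statistic is smaller in magnitude than the standard one, so the null hypothesis is rejected less readily and Type I errors are kept closer to the significance level.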


Critical difference diagrams

Proposed by Demšar (2006).
Revised Friedman plus Shaffer's static post-hoc test (García and Herrera, 2008).
Comparing MULTIPLE methods over MULTIPLE datasets.
Shows the average rank of each method's superiority over the datasets.
No significant difference: line connecting methods.
More datasets: easier to find significant differences.


Taylor diagrams

$$E'^2 = \sigma_f^2 + \sigma_r^2 - 2 \sigma_f \sigma_r R \quad ; \quad c^2 = a^2 + b^2 - 2ab \cos \phi$$

Simultaneously: RMS difference, correlation and std. dev.
R: correlation between p and a; E': RMS difference; $\sigma_f^2$, $\sigma_r^2$: variances of p and a.
Proposed in Taylor (2001), reproduced from Allen (2009).
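
The relation behind the Taylor diagram can be checked numerically. The sketch below assumes population standard deviations and the centred RMS difference (means removed before differencing); the function name and the data are illustrative.

```python
from math import sqrt
from statistics import mean, pstdev

def taylor_terms(predicted, actual):
    """Return (sigma_f, sigma_r, R, E') for predictions f and reference r."""
    mf, mr = mean(predicted), mean(actual)
    sf, sr = pstdev(predicted), pstdev(actual)
    n = len(actual)
    r = sum((f - mf) * (a - mr) for f, a in zip(predicted, actual)) / (n * sf * sr)
    e = sqrt(sum(((f - mf) - (a - mr)) ** 2
                 for f, a in zip(predicted, actual)) / n)
    return sf, sr, r, e

sf, sr, r, e = taylor_terms([1.0, 3.0, 2.0, 5.0], [1.5, 2.5, 2.0, 4.0])
# Law-of-cosines form: E'^2 = sf^2 + sr^2 - 2 * sf * sr * R
print(abs(e ** 2 - (sf ** 2 + sr ** 2 - 2 * sf * sr * r)) < 1e-12)  # True
```

Because the three quantities are tied by this identity, a single point on the diagram can show all of them at once.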


Target diagrams

RMSE in X-axis; Bias in Y-axis.
Std. dev. of p larger than that of a (X>0) or not; Bias positive (Y>0) or not.

Reproduced from Jolliff et al. (2009) and Allen (2009).


Multivariate approaches

Uni-variate and multi-variate metrics summarize model skill.
Multi-variate approaches: simultaneous examination of how several variables vary relative to each other in space and time.

Principal Component Analysis (PCA) (Jolliffe, 2002).
Shows the relationship between several variables in 2D space.

Multidimensional Scaling (MDS) (Borg and Groenen, 2005).
Explores similarities or dissimilarities in data.

Self-Organizing Maps (SOM) (Kohonen, 2001).
Produce a low-dimensional discretized representation of the observations.

Examples

Zooplankton biomass models

Several model fits with squared error.
Reproduced from Irigoien et al. (2009).

An example of anchovy recruitment

Performance reported depending on validation scheme.
Reproduced from Fernandes et al. (2010).

Phytoplankton classification

Without (Table III) and with (Table II) statistical differences (corrected paired t-test).
Reproduced from Zarauz et al. (2009) and Zarauz et al. (2008).

Zooplankton classification

Reproduced from Fernandes et al. (2009).

Weka

Weka explorer

Weka experimenter

Weka knowledge flow

References

Abma, B. (2009). Evaluation of requirements management tools with support for traceability-based change impact analysis. PhD thesis, University of Twente, Enschede, The Netherlands.
Allen, J. (2009). D2.7 user guide and report outlining validation methodology. Deliverable in project Marine Ecosystem Evolution in a Changing Environment (MEECE).
Borg, I. and Groenen, P. (2005). Modern multidimensional scaling: Theory and applications. Springer Verlag.
Bouckaert, R. R. and Frank, E. (2004). Evaluating the replicability of significance tests for comparing learning algorithms. Lect. Notes Artif. Int., pages 3–12.
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Month. Weather Rev., 78(1):1–3.
Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res., 7:1–30.
Domingos, P. (1999). The role of Occam's razor in knowledge discovery. Data Min. Knowl. Disc., 3(4):409–425.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Stat., 7(1):1–26.
Fernandes, J. (2011). Data analysis advances in marine science for fisheries management: Supervised classification applications. PhD thesis, University of the Basque Country, San Sebastian, Guipuzkoa, Spain.
Fernandes, J. A., Irigoien, X., Boyra, G., Lozano, J. A., and Inza, I. (2009). Optimizing the number of classes in automated zooplankton classification. J. Plankton Res., 31(1):19–29.
Fernandes, J. A., Irigoien, X., Goikoetxea, N., Lozano, J. A., Inza, I., Pérez, A., and Bode, A. (2010). Fish recruitment prediction, using robust supervised classification methods. Ecol. Model., 221(2):338–352.
Francis, R. I. C. (2006). Measuring the strength of environment-recruitment relationships: the importance of including predictor screening within cross-validations. ICES J. Mar. Sci., 63(4):594.
García, S. and Herrera, F. (2008). An extension on 'statistical comparisons of classifiers over multiple data sets' for all pairwise comparisons. J. Mach. Learn. Res., 9:2677–2694.

Grünwald, P., Myung, I., and Pitt, M. (2005). Advances in minimum description length: Theory and applications. The MIT Press.
Holt, J., Allen, J., Proctor, R., and Gilbert, F. (2005). Error quantification of a high-resolution coupled hydrodynamic-ecosystem coastal-ocean model: Part 1 model overview and assessment of the hydrodynamics. Journal of Marine Systems, 57(1-2):167–188.
Irigoien, X., Fernandes, J., Grosjean, P., Denis, K., Albaina, A., and Santos, M. (2009). Spring zooplankton distribution in the Bay of Biscay from 1998 to 2006 in relation with anchovy recruitment. J. Plankton Res., 31(1):1–17.
Jolliff, J., Kindle, J., Shulman, I., Penta, B., Friedrichs, M., Helber, R., and Arnone, R. (2009). Summary diagrams for coupled hydrodynamic-ecosystem model skill assessment. Journal of Marine Systems, 76(1-2):64–82.
Jolliffe, I. (2002). Principal component analysis. Encyclopedia of Statistics in Behavioral Science.
Kohonen, T. (2001). Self-organizing maps. Springer Series in Information Sciences. Springer, New York.
Kotsiantis, S. (2007). Supervised machine learning: A review of classification techniques. Inform., 31:249–268.
Lachenbruch, P. and Mickey, M. (1968). Estimation of error rates in discriminant analysis. Technometrics, pages 1–11.
Larson, S. C. (1931). The shrinkage of the coefficient of multiple correlation. J. Educ. Psychol., 22(1):45–55.
Maréchal, D. (2004). A soil-based approach to rainfall-runoff modelling in ungauged catchments for England and Wales. PhD thesis, Cranfield University, Cranfield, UK.
McCluskey, A. and Lalkhen, A. G. (2007). Statistics IV: Interpreting the results of statistical tests. Continuing Education in Anaesthesia, Critical Care & Pain, 7(6):208–212.
Mosteller, F. and Tukey, J. F. (1968). Data Analysis, Including Statistics. In G. Lindzey and E. Aronson, editors, Handbook of Social Psychology, Vol. II. Addison-Wesley, Reading, MA, USA.
Nadeau, C. and Bengio, Y. (2003). Inference for the generalization error. Mach. Learn., 52(3):239–281.

Nash, J. and Sutcliffe, J. (1970). River flow forecasting through conceptual models part I: a discussion of principles. Journal of Hydrology, 10(3):282–290.
Pérez, A., Larrañaga, P., and I., I. (2005). Estimar, descomponer y comparar el error de mala clasificación. In Primer Congreso Español de Informática.
Rissanen, J. (1978). Modeling by the shortest data description. Automatica, 14:465–471.
Rodríguez, J. D., Pérez, A., and Lozano, J. A. (2010). Sensitivity analysis of k-fold cross-validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell., 32(3):569–575.
Schirripa, M. J. and Colbert, J. J. (2006). Interannual changes in sablefish (Anoplopoma fimbria) recruitment in relation to oceanographic conditions within the California Current System. Fish. Oceanogr., 15(1):25–36.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. J. Roy. Statistical Society, Series B, 36.
Stow, C., Jolliff, J., McGillicuddy Jr, D., Doney, S., Allen, J., Friedrichs, M., and Rose, K. (2009). Skill assessment for coupled biological/physical models of marine systems. Journal of Marine Systems, 76(1-2):4–15.
Taylor, K. (2001). Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res., 106(D7):7183–7192.
van der Gaag, L. C., Renooij, S., Witteman, C. L. M., Aleman, B. M. P., and Taal, B. G. (2002). Probabilities for a probabilistic network: a case study in oesophageal cancer. Artif. Intell. Med., 25(2):123–148.
Wikipedia (2011). Accuracy paradox. [Online; accessed 15-September-2011].
Witten, I. H. and Frank, E. (2005). Data mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA, USA.
Yeung, K. Y., Bumgarner, R. E., and Raftery, A. E. (2005). Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics, 21(10):2394–2402.

Zarauz, L., Irigoien, X., and Fernandes, J. (2008). Modelling the influence of abiotic and biotic factors on plankton distribution in the Bay of Biscay, during three consecutive years (2004-06). J. Plankton Res., 30(8):857.
Zarauz, L., Irigoien, X., and Fernandes, J. (2009). Changes in plankton size structure and composition, during the generation of a phytoplankton bloom, in the central Cantabrian sea. J. Plankton Res., 31(2):193–207.
Zhu, X. and Davidson, I. (2007). Knowledge discovery and data mining: challenges and realities. IGI Global.
