A phylogenetic model of language diversification
1. A Phylogenetic Model of Language Diversification
Robin J. Ryder (1) and Geoff K. Nicholls (2)
(1) CEREMADE, Université Paris-Dauphine
(2) Department of Statistics, University of Oxford
UCLA, March 2013
www.slideshare.net/robinryder
2. Gray and Atkinson’s tree(s)
R. Ryder & G. Nicholls (Dauphine & Oxford) Language phylogenies UCLA 2013 2 / 81
3. Caveats
I am not a linguist
Statistics: additional insight alongside the comparative method
I use the word "evolution" in a broad sense
"All models are wrong, but some are useful"
4. Advantages of statistical methods
Analyse (very) large datasets
Test multiple hypotheses
Cross-validation
Estimate uncertainty
5. Questions to answer
Topology of the tree
Age of ancestor nodes
Age of root: 6000–6500 BP or 8000–9500 BP (BP = Before Present)?
6000 BP: Kurgan horsemen; 8000 BP: Anatolian farmers
6. Statistical method in a nutshell
1 Collect data
2 Design model
3 Perform inference (MCMC, ...)
4 Check convergence
5 In-model validation (is our inference method able to answer
questions from our model?)
6 Model mis-specification analysis (do we need a more complex
model?)
7 Conclude
7. Outline
1 Data
2 Model
3 Inference
4 In-model validation
5 Model mis-specification
6 Results
7 Semitic lexical data
8 Bergsland and Vogt
9 Punctuational bursts
8. Morris Swadesh and glottochronology
200/100 word list
Compares 2 languages (c = fraction of shared cognates)
Assumes that r, the fraction of cognates retained after 1000 years, is constant across all languages (r = 86%)
Infers the age t (in millennia) of the Most Recent Common Ancestor:
$\hat{t} = \frac{\ln c}{2 \ln r}$
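The formula above can be tried numerically. The following is a minimal sketch: the function name and the example value of c are illustrative assumptions, and only the default r = 0.86 comes from the slide.

```python
import math

def glottochronology_age(c, r=0.86):
    """Swadesh-style age estimate, in millennia, for two languages.

    c: fraction of shared cognates between the two languages (0 < c <= 1).
    r: fraction of cognates retained after 1000 years (0 < r < 1).
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie in (0, 1)")
    # Each of the two lineages retains a fraction r**t, so c = r**(2 t).
    return math.log(c) / (2.0 * math.log(r))

# Hypothetical example: two languages sharing 74% of their cognates.
example_age = glottochronology_age(0.74)
```

The result is a single point estimate with no measure of uncertainty, which is one of the criticisms raised later in the talk.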
9. [Slide: words of the Swadesh list, in alphabetical order]
all, and, animal, ashes, at, back, bad, bark, because, belly, big, bird, bite, black, blood, blow, bone, breast, breathe, burn, child, claw, cloud, cold, come,
dog, drink, dry, dull, dust, ear, earth, eat, egg, eye, fall, far, fat, father, fear, feather, few, fight, fire, fish, five, float, flow, flower, fly, fog,
grass, green, guts, hair, hand, he, head, hear, heart, heavy, here, hit, hold, horn, how, hunt, husband, I, ice, if, in, kill, knee, know, lake, laugh,
long, louse, man, many, meat, moon, mother, mountain, mouth, name, narrow, near, neck, new, night, nose, not, old, one, other, person, play, pull, push,
river, road, root, rope, rotten, round, rub, salt, sand, say, scratch, sea, see, seed, sew, sharp, short, sing, sit, skin, sky, sleep, small, smell,
split, squeeze, stab, stand, star, stick, stone, straight, suck, sun, swell, swim, tail, ten, that, there, they, thick, thin, think, this, thou, three, throw, tie,
walk, warm, wash, water, we, wet, what, when, where, white, who, wide, wife, wind, wing, wipe, with, woman, woods
10. Bergsland and Vogt (1962)
Found different rates for different pairs of languages: Old Norse
and Icelandic, Georgian and Mingrelian, Armenian and Old
Armenian
Discredited Glottochronology
Sankoff (1973): sample selection bias, no estimation of
uncertainty
Fair criticism
Bad observation protocol from Swadesh
Does not apply (so much) to modern methods
12. Core vocabulary
100 or 200 words, present in almost all languages: bird, hand, to
eat, red...
Borrowing can occur (evolution not along a tree), but:
“Easy” to detect
Rare
Does not bias the results
13. Binary data: he dies, three, all
                      he dies            three     all
Old English           stierfþ            þrīe      ealle
Old High German       stirbit, touwit    drī       alle
Avestan               miriiete           þrāiiō    vīspe
Old Church Slavonic   umĭretŭ            trĭje     vĭsi
Latin                 moritur            trēs      omnēs
Oscan                 ?                  trís      súllus

Cognacy classes (traits) for the meaning he dies:
1 {stierfþ, stirbit}
2 {touwit}
3 {miriiete, umĭretŭ, moritur}
Cognacy classes for the meaning three:
1 {þrīe, drī, þrāiiō, trĭje, trēs, trís}
Cognacy classes for the meaning all:
1 {ealle, alle}
2 {vīspe, vĭsi}
3 {omnēs}
4 {súllus}

Binary coding (one column per cognacy class; ? = missing):
                      he dies   three   all
Old English           1 0 0     1       1 0 0 0
Old High German       1 1 0     1       1 0 0 0
Avestan               0 0 1     1       0 1 0 0
Old Church Slavonic   0 0 1     1       0 1 0 0
Latin                 0 0 1     1       0 0 1 0
Oscan                 ? ? ?     1       0 0 0 1
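The coding of cognacy classes into binary traits can be sketched as follows. The data structure and function name are illustrative assumptions, not TraitLab's input format; the class memberships are those listed for the three meanings he dies, three and all.

```python
LANGUAGES = ["Old English", "Old High German", "Avestan",
             "Old Church Slavonic", "Latin", "Oscan"]

# For each meaning: language -> set of cognacy classes it displays,
# or None when the word is unknown (Oscan, "he dies").
COGNACY = {
    "he dies": {"Old English": {1}, "Old High German": {1, 2},
                "Avestan": {3}, "Old Church Slavonic": {3},
                "Latin": {3}, "Oscan": None},
    "three": {lang: {1} for lang in LANGUAGES},
    "all": {"Old English": {1}, "Old High German": {1},
            "Avestan": {2}, "Old Church Slavonic": {2},
            "Latin": {3}, "Oscan": {4}},
}

def to_binary_rows(cognacy, languages):
    """Return language -> list of '0'/'1'/'?', one entry per cognacy class."""
    rows = {lang: [] for lang in languages}
    for classes_by_lang in cognacy.values():
        known = [c for c in classes_by_lang.values() if c is not None]
        class_ids = sorted(set().union(*known))
        for lang in languages:
            classes = classes_by_lang[lang]
            for cid in class_ids:
                if classes is None:
                    rows[lang].append("?")   # whole meaning is missing
                else:
                    rows[lang].append("1" if cid in classes else "0")
    return rows

rows = to_binary_rows(COGNACY, LANGUAGES)
```

A language with two words for one meaning (Old High German stirbit, touwit) simply gets a 1 in two columns, which is how polymorphism is represented in this coding.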
18. Observation process
Old English 1 0 0 1 1 0 0 0
Old High German 1 1 0 1 1 0 0 0
Avestan 0 0 1 1 0 1 0 0
Old Church Slavonic 0 0 1 1 0 1 0 0
Latin 0 0 1 1 0 0 1 0
Oscan ? ? ? 1 0 0 0 1
20. Observation process
Traits displayed by a single language (columns 2, 7 and 8 of the full matrix) are not recorded:
Old English 1 0 1 1 0
Old High German 1 0 1 1 0
Avestan 0 1 1 0 1
Old Church Slavonic 0 1 1 0 1
Latin 0 1 1 0 0
Oscan ? ? 1 0 0
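The reduction from the 8-column matrix to the 5-column matrix is consistent with dropping every trait seen in fewer than two languages. A minimal sketch under that assumption (the rule is inferred from the two matrices, and the function name is hypothetical):

```python
FULL = {
    "Old English":         list("10011000"),
    "Old High German":     list("11011000"),
    "Avestan":             list("00110100"),
    "Old Church Slavonic": list("00110100"),
    "Latin":               list("00110010"),
    "Oscan":               list("???10001"),
}

def drop_singleton_traits(rows):
    """Keep only the columns in which at least two languages show a '1'."""
    n_cols = len(next(iter(rows.values())))
    keep = [j for j in range(n_cols)
            if sum(row[j] == "1" for row in rows.values()) >= 2]
    return {lang: [row[j] for j in keep] for lang, row in rows.items()}

observed = drop_singleton_traits(FULL)
```

Because the recorded data are a filtered version of the underlying traits, the likelihood has to account for this filtering, which is why the observation process is modelled explicitly.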
21. Constraints
Constraints on the tree topology
30 constraints on the age of some nodes or ancient languages
These constraints are used to estimate the evolution rates and the
age.
22. Constraints
24. Model (1): birth-death process
Traits are born at rate λ
Traits die at rate µ
λ and µ are constant
1 1 0 0 0 0 0 0 0
2 1 0 1 0 0 0 0 0
3 1 0 0 0 0 0 0 1
4 0 0 0 0 1 0 0 0
5 0 0 0 0 1 0 0 0
6 1 1 0 0 0 1 1 0
7 1 1 0 0 0 1 0 0
8 1 0 0 0 0 0 0 0
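The birth-death process on a single lineage can be simulated in a few lines. This is a minimal sketch, not TraitLab code: the function name, the integer trait labels and the rate values in the example are assumptions made for illustration.

```python
import random

def simulate_traits(lam, mu, duration, initial_traits=(), rng=None):
    """Return the set of traits alive after `duration` time units.

    New traits appear at constant rate `lam`; each existing trait dies
    independently at rate `mu`. Traits are labelled by integers.
    """
    rng = rng or random.Random()
    alive = set(initial_traits)
    next_id = max(alive, default=-1) + 1   # label for the next new trait
    t = 0.0
    while True:
        total_rate = lam + mu * len(alive)
        if total_rate == 0.0:
            return alive                   # nothing can happen any more
        t += rng.expovariate(total_rate)   # waiting time to the next event
        if t > duration:
            return alive
        if rng.random() < lam / total_rate:
            alive.add(next_id)             # birth of a new trait
            next_id += 1
        else:
            alive.remove(rng.choice(sorted(alive)))   # death of a random trait

# Hypothetical rates per year, run for 1000 years.
example = simulate_traits(lam=0.02, mu=2e-4, duration=1000.0,
                          rng=random.Random(0))
```

Running the same simulation along every branch of a tree, with children inheriting the parent's trait set at each split, would give leaf data of the kind shown in the matrix above.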
25. Model (2): catastrophic rate heterogeneity
Catastrophes occur at rate ρ
At a catastrophe, each trait dies with probability κ and Poiss(ν) traits are born.
λ/µ = ν/κ: the number of traits is constant on average.
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
2 1 0 1 0 0 0 0 0 0 0 0 0 0 1
3 0 0 0 0 0 0 0 0 0 1 1 0 0 0
4 0 0 0 0 1 0 0 0 0 0 0 0 0 0
5 0 0 0 0 1 0 0 0 0 0 0 0 0 0
6 1 0 0 0 0 1 1 0 0 0 0 0 1 0
7 1 0 0 0 0 1 0 0 0 0 0 0 1 0
8 1 0 0 0 0 0 0 0 0 0 0 0 1 0
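A short derivation shows why the condition λ/µ = ν/κ keeps the expected number of traits unchanged. It assumes that, just before the catastrophe, the expected number of traits N⁻ is at the equilibrium value λ/µ of the birth-death process (the value at which births λ balance deaths µ·E[N]); the symbols N⁻ and N⁺ are introduced here for illustration.

```latex
\begin{align*}
\mathbb{E}[N^{+}] &= (1-\kappa)\,\mathbb{E}[N^{-}] + \nu
                   = (1-\kappa)\,\frac{\lambda}{\mu} + \nu, \\
\mathbb{E}[N^{+}] = \frac{\lambda}{\mu}
  &\iff \nu = \kappa\,\frac{\lambda}{\mu}
   \iff \frac{\lambda}{\mu} = \frac{\nu}{\kappa}.
\end{align*}
```

The first term counts the traits that survive the catastrophe and the second the Poiss(ν) traits born at it.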
26. Model (3): missing data
Observation process: each point goes missing with probability ξ_i
Some traits are not observed and are thinned out of the data
1 1000?00000?000
2 ?01000?000000?
3 0?00?000011000
4 0000?0?0000?00
5 00?01?00000000
6 10000??0?000?0
7 ?0000?0?000010
8 10000000000010
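The missing-data step can be sketched as an independent mask applied to each entry. The sketch assumes that the index i in ξ_i refers to the language (row), and the function name is hypothetical.

```python
import random

def mask_missing(rows, xi, rng=None):
    """Replace each entry of row i by '?' with probability xi[i].

    rows: list of lists of 0/1 values, one list per language.
    xi:   list of missing-data probabilities, one per language (assumed).
    """
    rng = rng or random.Random()
    return [["?" if rng.random() < xi_i else value for value in row]
            for row, xi_i in zip(rows, xi)]

# Hypothetical example: the second language is poorly attested.
data = [[1, 0, 0, 1], [1, 1, 0, 1]]
masked = mask_missing(data, xi=[0.05, 0.5], rng=random.Random(0))
```

Poorly attested ancient languages would get a large ξ_i and well-documented modern languages a ξ_i close to 0.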
33. TraitLab software
Bayesian inference
Markov Chain Monte Carlo
(Almost) uniform prior over the age of the root
34. Why be Bayesian?
In the settings described in this talk, it usually makes sense to use
Bayesian inference, because:
The models are complex
Estimating uncertainty is paramount
The output of one model is used as the input of another
We are interested in complex functions of our parameters
36. Frequentist statistics
Statistical inference deals with estimating an unknown parameter
θ given some data D.
In the frequentist view of statistics, θ has a true fixed
(deterministic) value.
Uncertainty is measured by confidence intervals, which are not
intuitive to interpret: if I get a 95% CI of [80 ; 120] (i.e. 100 ± 20)
for θ, I cannot say that there is a 95% probability that θ belongs to
the interval [80 ; 120].
Frequentist statistics often use the maximum likelihood estimator:
for which value of θ would the data be most likely (under our
model)?
$L(\theta \mid D) = P[D \mid \theta]$
$\hat{\theta} = \arg\max_{\theta} L(\theta \mid D)$
37. Bayesian statistics
In the Bayesian framework, the parameter θ is seen as inherently
random: it has a distribution.
Before I see any data, I have a prior distribution π(θ) on θ, usually
uninformative.
Once I take the data into account, I get a posterior distribution,
which is hopefully more informative.
π(θ|D) ∝ π(θ)L(θ|D)
Different people have different priors, hence different posteriors.
But with enough data, the choice of prior matters little.
We are now allowed to make probability statements about θ, such
as "there is a 95% probability that θ belongs to the interval
[78 ; 119]" (credible interval)
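A toy example may help make the posterior and the credible interval concrete. The sketch below is unrelated to the tree model: it assumes a binomial experiment (k successes out of n trials) with a uniform prior on θ, and evaluates π(θ|D) ∝ π(θ)L(θ|D) on a grid. All names are illustrative.

```python
def grid_posterior(k, n, grid_size=1001):
    """Posterior of a success probability theta on a regular grid over [0, 1]."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size                                # uniform prior
    likelihood = [t ** k * (1.0 - t) ** (n - k) for t in thetas]
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    return thetas, [u / total for u in unnormalised]

def credible_interval(thetas, posterior, level=0.95):
    """Equal-tailed credible interval read off the cumulative posterior."""
    tail = (1.0 - level) / 2.0
    cumulative, lower, upper = 0.0, None, None
    for theta, mass in zip(thetas, posterior):
        cumulative += mass
        if lower is None and cumulative >= tail:
            lower = theta
        if upper is None and cumulative >= 1.0 - tail:
            upper = theta
    return lower, upper

thetas, posterior = grid_posterior(k=7, n=10)
interval = credible_interval(thetas, posterior)
```

With a uniform prior the posterior mode coincides with the maximum likelihood estimate k/n, while the interval is a direct probability statement about θ.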
38. Advantages and drawbacks of Bayesian statistics
More intuitive interpretation of the results
Easier to think about uncertainty
In a hierarchical setting, it becomes easier to take into account all
the sources of variability
Prior specification: need to check that changing your prior does
not change your result
Computationally intensive
39. Prior and inference
Parameter                  Prior       Note on prior                                           Method
Tree g                     f_G         marginally uniform on root age, uniform on topologies   MCMC
Death rate µ               1/µ         improper; invariant under scale change                  MCMC
Birth rate λ               1/λ         improper; invariant under scale change                  integration
Birth time Z               PPP         Poisson process + observation process                   integration (pruning)
Catastrophe time k         PPP         Total per edge                                          MCMC
Catastrophe rate ρ         f_R, Γ      95% CI: 1/tree – 1/edge                                 MCMC
Catastrophe death rate κ   U(0, 1)                                                             MCMC
Missing data rate ξ        U(0, 1)^L                                                           MCMC
42. MCMC
Fit the model to the data
Trees that make the data likely
Obtain a sample of trees and dates
Samples weighted by quality of fit to data
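The idea of sampling in proportion to quality of fit can be illustrated with a random-walk Metropolis sampler on a single real parameter. This is only a toy sketch: the sampler used for trees needs moves over topologies and node ages, which are not shown here, and every name below is an assumption.

```python
import math
import random

def metropolis(log_post, x0, n_steps, step=1.0, rng=None):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_post: function returning the log of the unnormalised posterior.
    x0:       starting point, with finite log_post(x0).
    """
    rng = rng or random.Random()
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        lp_proposal = log_post(proposal)
        # Accept with probability min(1, post(proposal) / post(x)).
        if math.log(1.0 - rng.random()) < lp_proposal - lp:
            x, lp = proposal, lp_proposal
        samples.append(x)          # a rejected move repeats the current state
    return samples

# Toy target: standard normal log-density, up to an additive constant.
samples = metropolis(lambda x: -0.5 * x * x, x0=3.0, n_steps=20000,
                     rng=random.Random(42))
```

States that fit better are visited more often, so averages over the samples approximate posterior expectations once the chain has converged.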
44. Tests on synthetic data
Figure: True tree, 40 words/language
Figure: Consensus tree
45. Tests on synthetic data (2)
Figure: Death rate (µ)
47. Initial model: no catastrophes
Traits are born at rate λ
Traits die at rate µ
λ and µ are constant
1 1 0 0 0 0 0 0 0
2 1 0 1 0 0 0 0 0
3 1 0 0 0 0 0 0 1
4 0 0 0 0 1 0 0 0
5 0 0 0 0 1 0 0 0
6 1 1 0 0 0 1 1 0
7 1 1 0 0 0 1 0 0
8 1 0 0 0 0 0 0 0
49. Influence of borrowing (1)
Figure: True tree, 40 words/language, 10% borrowing
Figure: Consensus tree
50. Influence of borrowing (2)
Figure: True tree, 40 words/language, 50% borrowing
Figure: Consensus tree
51. Influence of borrowing (3)
The topology is reconstructed well
Dates are under-estimated
Figure: Root age
Figure: Death rate (µ)
53. Mis-specifications
Mis-specification                                Check
Heterogeneity between traits                     Analyse subset of data + simulated data
Heterogeneity in time/space (non catastrophic)   Simulated data analysis with edge rate from a Γ distribution
Borrowing                                        Simulated data analysis + check level of borrowing
Data missing in blocks                           Simulated data analysis
Non-empty meaning categories                     Simulated data analysis
55. Data
Indo-European languages
Core vocabulary (Swadesh 100 or 207)
Two (almost) independent data sets
Dyen et al. (1997): 87 languages, mostly modern
Ringe et al. (2002): 24 languages, mostly ancient
56. Cross-validation
Predict age of nodes for which we have a constraint: would we reject the truth?
Γ: space of trees which respect all constraints
Γ^{-c}: the same space with constraint c removed, c = 1, ..., 30
M0: g ∈ Γ; M1: g ∈ Γ^{-c}. Bayes factor:
$B^{(c)} = \frac{P[g \in \Gamma \mid D,\, g \in \Gamma^{-c}]}{P[g \in \Gamma \mid g \in \Gamma^{-c}]}$
Constraint c conflicts with the model if $2 \log B^{(c)} < -5$.
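The Bayes factor for a removed constraint can be approximated from samples by replacing each probability with a sample fraction. The sketch below assumes that constraint c is an age interval on one node, that natural logarithms are used, and that two sets of sampled ages for that node are available, both drawn with constraint c removed: one from the posterior and one from the prior. The function and argument names are hypothetical.

```python
import math

def two_log_bayes_factor(posterior_ages, prior_ages, lower, upper):
    """Estimate 2 log B for a constraint `lower <= age <= upper`.

    posterior_ages: node ages sampled given the data, constraint removed.
    prior_ages:     node ages sampled from the prior, constraint removed.
    """
    def fraction_satisfying(ages):
        return sum(lower <= age <= upper for age in ages) / len(ages)

    post = fraction_satisfying(posterior_ages)
    prior = fraction_satisfying(prior_ages)
    if prior == 0.0:
        raise ValueError("no prior sample satisfies the constraint")
    if post == 0.0:
        return -math.inf               # the data never reproduce the constraint
    return 2.0 * math.log(post / prior)

# Hypothetical samples (years BP) for a node constrained to [50, 450].
value = two_log_bayes_factor([100, 200, 300, 400], [100, 200, 900, 1000],
                             lower=50, upper=450)
conflicts = value < -5
```

A value below -5 would flag a constraint that the model fails to predict when it is withheld; values near 0 or above indicate no conflict.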
57. Cross validation
[Figure: cross-validation results for the 30 constraints (HI TA TB LU LY OI UM OS LA GK AR GO ON OE OG OS PR AV PE VE CE IT GE WG NW BS BA IR II TG); one axis with ticks from −100 to 100, one axis from 0 to 8000]
68. Consensus tree: modern languages (Dyen data)
69. Consensus tree: ancient languages (Ringe data)
[Figure: consensus tree with leaves oldhighgerman, oldenglish, oldnorse, gothic, oscan, umbrian, latin, welsh, oldirish, oldpersian, avestan, vedic, lithuanian, latvian, oldprussian, oldcslavonic, greek, armenian, lycian, luvian, hittite, tocharian_b, tocharian_a, albanian; node labels 66, 85, 58, 78, 62; time axis from 8000 to 0]
70. Root age
71. Conclusions
Strong support for Anatolian farming hypothesis: root around 8000
BP
Statistics reconstruct known linguistic facts and answer
unresolved questions
TraitLab: it’s free! (Though Matlab is not...)
73. Semitic lexical data
Data: Kitchen et al. (2009)
25 languages, 96 meanings, 674 cognacy classes
Questions of interest: root age (constraint known), topology,
outgroup
74. Model validation
Thin bar: constraint. Thick bar: 95% posterior HPD. (Red bar: 95%
prior HPD)
76. Conclusions
Root age 95% HPD: 4400 – 5100 BP
Akkadian outgroup: 67% (Syrian homeland?)
Zero catastrophes: 33%
78. Back to Bergsland and Vogt
Norse family, 8 languages.
Selection bias
Claim that the rate of change is significantly different for these
data.
B&V included words used only in literary Icelandic, which we
exclude
We can handle polymorphism
Do not include catastrophes
79. Known history
[Figure: known history of Gjestal, Sandnes, Riksmal and Icelandic, on a time axis marked X, XI, XII, XIII]
80. Tests
Two possible ways to test whether the same model parameters apply
to this example and to Indo-European:
1 Assume parameters are the same as for the general
Indo-European tree, and estimate ancestral ages.
2 Use Norse constraints to estimate parameters, and compare to
parameter estimates from general Indo-European tree
81. Results
If we use parameter values from another analysis, we can try to
estimate the age of 13th century Norse.
True constraint: 660–760 BP. Our HPD: 615 – 872 BP.
If we analyse the Norse data on its own, we estimate parameters.
Value of µ for Norse: (2.47 ± 0.4) · 10^-4
Value of µ for IE: (1.86 ± 0.39) · 10^-4 (Dyen), (2.37 ± 0.21) · 10^-4 (Ringe)
82. But...
We can also try to estimate the age of Icelandic (which is 0 BP)
Find 439–560 BP, far from the true value
B&V were right: there was significantly less change on the branch
leading to Icelandic than average
However, we are still able to estimate internal node ages.
84. Georgian
Second data set: Georgian and Mingrelian
Age of ancestor: last millennium BC
Code data given by B&V, discarding borrowed items
Use rate estimate from Ringe et al. analysis
95% HPD: 2065 – 3170 BP
85. B&V: conclusions
Third data set (Armenian) not clear enough to be recoded.
There is variation in the number of changes on an edge
Nonetheless, we are still able to estimate ancestral language age
Variation in borrowing rates
B&V: "we cannot estimate dates, and it follows that we cannot
estimate the topology either".
We can estimate dates, and even if we couldn’t, we might still be
able to estimate the topology
87. Atkinson et al. (2008)
Hypothesis: when a language is founded by a migration, the
founder effect leads to fast change over a short period of time.
There is a catastrophe at each branching event.
Indirect estimation: correlation between number of changes
between root and leaf, and number of branching events along the
same path
Atkinson: 21% of changes in the history of IE are due to
punctuational bursts
88. Atkinson et al. (2008)
89. Direct analysis
We force a catastrophe on each edge.
Infer size of catastrophes.
Find κ very close to 0.
Less than 1% of change can be attributed to punctuational bursts.
Reason for discrepancy unclear.
90. Conclusions
Strong support for age of PIE around 8000 BP
Statistical methods can help answer questions which traditional
methods cannot
Many more questions and models to come
TraitLab: it’s free! (although Matlab is not...)
91. Questions
otázky kesses
spørgsmåler cwestiwnau
pytania preguntes
preguntas vrae
kláusimai Fragen
voprosy quaestiones
întrebări questions
vragen ερωτήσεις
zapitanni spurningar
domande spørsmåler
questões frågor
vprašanja
92. References
R. J. Ryder & G. K. Nicholls, Missing data in a stochastic Dollo
model for cognate data, and its application to the dating of
Proto-Indo-European (2011), JRSS C
G. K. Nicholls, Horses or farmers? The tower of Babel and
confidence in trees (2008), Significance (popular science)
G. K. Nicholls & R. J. Ryder, Phylogenetic models for Semitic
vocabulary (2011), IWSM
R. J. Ryder, Phylogenetic Models of Language Diversification
(2010), DPhil. thesis, University of Oxford