A copula model to analyze minimum admission scores
Mariela Fernández¹ and Verónica A. González-López²
Institute of Mathematics, Statistics and Computing Science
University of Campinas
11th ICNAAM, 21-27 September 2013, Rhodes, Greece
¹ FAPESP Post-doctoral Grant 2011/18285-6.
² (a) USP project “Mathematics, computation, language and the brain”; (b)
FAPESP’s project “Portuguese in time and space: linguistic contact, grammars in
competition and parametric change”, 2012/06078-9; (c) FAPESP’s project “Research,
Innovation and Dissemination Center for Neuromathematics - NeuroMat”,
2013/07699-0.
4. Motivation Copula theory Application Conclusions References
Problem
How to set minimum admission scores in an efficient way.
Solution
Use the conditional expectations
E[Language | Mathematics ≥ m0]
and
E[Mathematics | Language ≥ l0],
where m0 is a minimum Mathematics score and l0 is a minimum Language
score.
mariela, veronica@ime.unicamp.br Copula model; Minimum admission scores 11th
ICNAAM 3 / 15
E[Mathematics|Language ≥ l0]
We need to know:
The marginal distributions F(x) and G(y) and the joint distribution H(x, y),
where X := Language score and Y := Mathematics score. Recall that
$$Y \mid X \ge x_0 \sim G_{X \ge x_0}(y) = P(Y \le y \mid X \ge x_0) = \frac{G(y) - H(x_0, y)}{1 - F(x_0)}.$$
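The identity above follows in one step. A short sketch, assuming F is continuous at x_0 so that P(X < x_0) = F(x_0):

```latex
\begin{aligned}
P(Y \le y \mid X \ge x_0)
  &= \frac{P(Y \le y) - P(X < x_0,\; Y \le y)}{P(X \ge x_0)} \\
  &= \frac{G(y) - H(x_0, y)}{1 - F(x_0)} .
\end{aligned}
```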
In practice it is more useful to work with the marginal quantiles than
with the raw scores, since the quantile is the “manager’s control variable”:
e.g. F(x0) = 0.25 means that we will admit 75% of the candidates.
Taking U = scaled ranks of X and V = scaled ranks of Y, we
look for the joint density of (U, V).
Cumulative conditional expectation in a copula framework
Definition
A bivariate copula is a bivariate joint distribution function with uniform margins,
denoted by C(u, v) for (u, v) ∈ [0, 1] × [0, 1].
Sklar’s Theorem
Let H be a joint distribution function with margins F and G. Then there
exists a copula C such that
H(x, y) = C(F(x), G(y)). (1)
If F and G are continuous, then C is unique; otherwise, C is uniquely
determined on Ran F × Ran G. Conversely, if C is a copula and F and G
are distribution functions, then the function H defined by (1) is a joint
distribution function with margins F and G.
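The converse part of the theorem can be illustrated numerically. The sketch below is only an illustration: the Farlie-Gumbel-Morgenstern copula and the exponential margins (rates 1 and 2) are assumptions chosen for convenience, not choices made in the talk.

```python
import math

def fgm_copula(u, v, alpha=0.5):
    """Farlie-Gumbel-Morgenstern copula, alpha in [-1, 1] (illustrative choice)."""
    return u * v + alpha * u * v * (1.0 - u) * (1.0 - v)

def exp_cdf(x, rate):
    """Exponential CDF, used here only as an example of a continuous margin."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def joint_cdf(x, y):
    """H(x, y) = C(F(x), G(y)) with F = Exp(1) and G = Exp(2) (assumed margins)."""
    return fgm_copula(exp_cdf(x, 1.0), exp_cdf(y, 2.0))

# Letting one argument grow large recovers the other margin.
print(joint_cdf(0.7, 50.0), exp_cdf(0.7, 1.0))
print(joint_cdf(50.0, 0.3), exp_cdf(0.3, 2.0))
```

Each printed pair should agree up to floating-point rounding, which is the statement that H has margins F and G.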
Some common bivariate copulas
Product: $C(u, v) = uv$.
Farlie-Gumbel-Morgenstern: $C(u, v) = uv + \alpha uv(1 - u)(1 - v)$, for $\alpha \in [-1, 1]$.
Clayton: $C(u, v) = \max\{0, (u^{-\alpha} + v^{-\alpha} - 1)^{-1/\alpha}\}$, for $\alpha \in (0, \infty)$.
Gumbel: $C(u, v) = \exp\{-[(-\ln u)^{\alpha} + (-\ln v)^{\alpha}]^{1/\alpha}\}$, for $\alpha \in [1, \infty)$.
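A minimal sketch of the four families as plain functions, with a check of the uniform-margin property C(t, 1) = C(1, t) = t. The parameter values (0.5, 2.0, 1.5) and the helper name `check_margins` are illustrative assumptions.

```python
import math

def product(u, v):
    return u * v

def fgm(u, v, alpha):
    return u * v + alpha * u * v * (1 - u) * (1 - v)

def clayton(u, v, alpha):
    # alpha > 0; the copula is 0 whenever one argument is 0
    if u == 0 or v == 0:
        return 0.0
    return max(0.0, (u ** -alpha + v ** -alpha - 1) ** (-1 / alpha))

def gumbel(u, v, alpha):
    # alpha >= 1; the copula is 0 whenever one argument is 0
    if u == 0 or v == 0:
        return 0.0
    s = (-math.log(u)) ** alpha + (-math.log(v)) ** alpha
    return math.exp(-s ** (1 / alpha))

families = {
    "product": product,
    "fgm": lambda u, v: fgm(u, v, 0.5),
    "clayton": lambda u, v: clayton(u, v, 2.0),
    "gumbel": lambda u, v: gumbel(u, v, 1.5),
}

def check_margins(copula, t=0.3):
    """Uniform margins mean C(t, 1) = t and C(1, t) = t."""
    return abs(copula(t, 1.0) - t) < 1e-12 and abs(copula(1.0, t) - t) < 1e-12

for name, c in families.items():
    print(name, check_margins(c))
```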
Some applications
Actuarial science, e.g. Frees et al. (1996) and Frees and Wang (2005).
Finance and risk management, e.g. Cherubini et al. (2004) and
Embrechts et al. (2003).
Hydrology, e.g. Genest and Favre (2007).
Deforestation (spatio-temporal dependence), e.g. Gräler et al. (2010).
Linguistics, e.g. García et al. (2012).
Copula model selection according to the problem’s characteristics: simple
cross sections (i.e. a simple expression for the intersection of the copula with
the plane u = u0, C(u0, v)), since we need
$$P[V \le v \mid U \ge u_0] = \frac{v - C(u_0, v)}{1 - u_0}$$
to compute $E[V \mid U \ge u_0]$.
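For a copula without a convenient cross section, the expectation can still be approximated numerically. The sketch below assumes the standard identity E[V | U ≥ u0] = ∫ (1 − P[V ≤ v | U ≥ u0]) dv over [0, 1] and uses a midpoint rule; the function names and the FGM parameter 0.8 are illustrative.

```python
def cond_cdf(copula, v, u0):
    """P[V <= v | U >= u0] = (v - C(u0, v)) / (1 - u0), for 0 <= u0 < 1."""
    return (v - copula(u0, v)) / (1.0 - u0)

def cond_expectation(copula, u0, n=20_000):
    """E[V | U >= u0] as the integral over [0, 1] of 1 - P[V <= v | U >= u0].

    Midpoint rule with n cells; n is an arbitrary illustrative choice.
    """
    h = 1.0 / n
    return sum(1.0 - cond_cdf(copula, (k + 0.5) * h, u0) for k in range(n)) * h

def fgm(u, v, alpha=0.8):
    """Farlie-Gumbel-Morgenstern copula with an illustrative alpha."""
    return u * v + alpha * u * v * (1 - u) * (1 - v)

print(cond_expectation(lambda u, v: u * v, 0.25))  # independence: close to 0.5
print(cond_expectation(fgm, 0.25))
```

For the FGM family the integral can be done by hand and gives 1/2 + αu0/6, which makes a convenient check of the numerical routine.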
Farlie-Gumbel-Morgenstern: $C(u, v) = uv + \alpha uv(1 - u)(1 - v)$, for
$\alpha \in [-1, 1]$. Quadratic cross sections in both variables, weak dependence,
and an exchangeable copula, i.e. C(u, v) = C(v, u).
Asymmetric Cubic Section (ACS) copula, introduced by Nelsen et al. (1997):
$$C(u, v) = uv + uv(1 - u)(1 - v)\,[(a - b)\,v(1 - u) + b],$$
where $|b| \le 1$, $\frac{b - 3 - \sqrt{9 + 6b - 3b^2}}{2} \le a \le 1$ and $a \ne b$.
Cubic cross sections in both variables, weak dependence, and a non-exchangeable copula.
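A small sketch of the ACS copula and its parameter region, showing the lack of exchangeability at one point. The values a = -1.0, b = 0.5 and the helper names are illustrative assumptions, not estimates from the talk.

```python
import math

def lower_bound_a(b):
    """Lower limit for a given b: (b - 3 - sqrt(9 + 6b - 3b^2)) / 2."""
    return (b - 3.0 - math.sqrt(9.0 + 6.0 * b - 3.0 * b * b)) / 2.0

def is_admissible(a, b):
    """Parameter region of the ACS family (a = b would reduce to FGM)."""
    return abs(b) <= 1.0 and lower_bound_a(b) <= a <= 1.0 and a != b

def acs_copula(u, v, a, b):
    """C(u, v) = uv + uv(1-u)(1-v)[(a-b) v (1-u) + b]."""
    return u * v + u * v * (1 - u) * (1 - v) * ((a - b) * v * (1 - u) + b)

a, b = -1.0, 0.5   # illustrative values only
print(is_admissible(a, b))
# Swapping the arguments changes the value: the copula is non-exchangeable.
print(acs_copula(0.2, 0.7, a, b), acs_copula(0.7, 0.2, a, b))
```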
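Copula parameter estimation
Bayesian approach through a uniform conjugate prior:
$$\hat a = K^{-1} \int_{-1}^{1} \int_{R(b)}^{1} a\, \pi(a, b \mid \mathbf{u})\, da\, db, \qquad
\hat b = K^{-1} \int_{-1}^{1} \int_{R(b)}^{1} b\, \pi(a, b \mid \mathbf{u})\, da\, db,$$
where $\pi(a, b \mid \mathbf{u})$ is the posterior distribution in (a, b), $\mathbf{u}$ is the sample data,
$$K = \int_{-1}^{1} \int_{R(b)}^{1} \pi(a, b \mid \mathbf{u})\, da\, db \quad\text{and}\quad R(b) = \frac{b - 3 - \sqrt{9 + 6b - 3b^2}}{2}.$$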
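One way to approximate these integrals is a grid over the admissible region. The sketch below is an assumption about the numerics, not the authors' procedure: it takes a flat prior on the region, uses the ACS density obtained by differentiating C(u, v) with respect to both arguments, and replaces the integrals by sums on a regular grid. The data are synthetic.

```python
import numpy as np

def acs_density(u, v, a, b):
    """Mixed partial derivative of the ACS copula C(u, v)."""
    return (1.0 + b * (1 - 2 * u) * (1 - 2 * v)
            + (a - b) * (1 - u) * (1 - 3 * u) * v * (2 - 3 * v))

def lower_bound_a(b):
    """R(b) = (b - 3 - sqrt(9 + 6b - 3b^2)) / 2."""
    return (b - 3.0 - np.sqrt(9.0 + 6.0 * b - 3.0 * b ** 2)) / 2.0

def posterior_means(u, v, n_grid=101):
    """Grid approximation of (a_hat, b_hat) under a flat prior (assumed)."""
    b_vals = np.linspace(-1.0, 1.0, n_grid)
    a_vals = np.linspace(-3.0, 1.0, 2 * n_grid)   # R(b) >= -3 for |b| <= 1
    A, B = np.meshgrid(a_vals, b_vals, indexing="ij")
    inside = A >= lower_bound_a(B)
    # Log-likelihood at every grid point; clipping avoids log of values <= 0
    # at points that are discarded anyway because they lie outside the region.
    dens = acs_density(u[:, None, None], v[:, None, None], A, B)
    loglik = np.log(np.clip(dens, 1e-300, None)).sum(axis=0)
    loglik = np.where(inside, loglik, -np.inf)
    weights = np.exp(loglik - loglik.max())       # unnormalised posterior
    K = weights.sum()
    return (A * weights).sum() / K, (B * weights).sum() / K

rng = np.random.default_rng(0)
u_demo = rng.uniform(size=60)    # synthetic pseudo-observations,
v_demo = rng.uniform(size=60)    # drawn independently for illustration
print(posterior_means(u_demo, v_demo))
```

The grid cell area cancels in the ratio, so the sums play the role of the integrals divided by K. A finer grid or a proper quadrature rule would be the natural refinement.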
Cumulative conditional expectation for the ACS copula family
$$E[V \mid U \ge u] = \int_0^1 v\, dP(v \mid U \ge u) = \frac{1}{12}\left[6 + (a + b)u + (b - a)u^2\right],$$
$$E[U \mid V \ge v] = \int_0^1 u\, dP(u \mid V \ge v) = \frac{1}{2} + \frac{b}{6}\,v + \frac{a - b}{12}\,v^2.$$
Property
i) The vertex of the function $E[V \mid U \ge u]$ is $u_0 = \frac{-a - b}{2(b - a)}$. It is a minimum
if b > a and a maximum if b < a.
ii) The vertex of the function $E[U \mid V \ge v]$ is $v_0 = \frac{-b}{a - b}$. It is a minimum if
a > b and a maximum if a < b.
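The first closed form can be checked directly from the ACS copula C(u, v) = uv + uv(1 − u)(1 − v)[(a − b)v(1 − u) + b]. A sketch of the intermediate steps, using the integrals of v(1 − v) and v²(1 − v) over [0, 1], which equal 1/6 and 1/12:

```latex
\begin{aligned}
P[V \le v \mid U \ge u] &= \frac{v - C(u, v)}{1 - u}
   = v - u\,v(1 - v)\,\bigl[(a - b)\,v(1 - u) + b\bigr], \\
E[V \mid U \ge u] &= \int_0^1 \bigl(1 - P[V \le v \mid U \ge u]\bigr)\,dv
   = \frac{1}{2} + u\left[\frac{(a - b)(1 - u)}{12} + \frac{b}{6}\right] \\
  &= \frac{1}{12}\left[6 + (a + b)u + (b - a)u^2\right].
\end{aligned}
```

Setting the derivative (a + b) + 2(b − a)u to zero gives the vertex in i); the second expectation follows in the same way with the roles of u and v exchanged.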
Admission score decisions
Data
Mathematics and Portuguese scores of each student who passed
the admission test for the undergraduate course in Electrical
Engineering at the University of Campinas, Brazil, in 2010 and 2011.
X = Portuguese score and Y = Mathematics score. Scores were
standardized within each year to remove the effect of the different tests
applied each year.
We compute the pseudo-observations
$$\hat u_i = \hat F(x_i)\,\frac{N}{N + 1} = \frac{\mathrm{rank}^x_i}{N + 1} \quad\text{and}\quad \hat v_i = \hat G(y_i)\,\frac{N}{N + 1} = \frac{\mathrm{rank}^y_i}{N + 1},$$
where N is the sample size and $\hat F$ and $\hat G$ are the empirical
distribution functions of X and Y, respectively.
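A minimal sketch of the pseudo-observation step with NumPy. The rank of x_i is taken as N times the empirical distribution function, i.e. the number of observations not exceeding x_i, so tied scores share the largest rank; the scores below are made up for illustration.

```python
import numpy as np

def pseudo_observations(x):
    """Return rank_i / (N + 1), where rank_i = N * F_hat(x_i)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Count, for every i, how many observations are <= x_i.
    ranks = (x[None, :] <= x[:, None]).sum(axis=1)
    return ranks / (n + 1)

scores = [61.0, 48.5, 75.0, 48.5, 90.0]   # made-up scores, not data from the talk
print(pseudo_observations(scores))
```

Dividing by N + 1 rather than N keeps every pseudo-observation strictly inside (0, 1).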
                                           E[V|U ≥ u]           E[U|V ≥ v]
Year  Students  τ̂        â        b̂         Vertex  Type         Vertex  Type
2010  68        -0.0507  -2.2658   0.3253    0.374   min          0.125   max
2011  67        -0.2684  -0.5808  -0.7153    –       decreasing   –       decreasing
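The vertex and type columns can be recomputed from the reported estimates with the vertex formulas u0 = −(a + b)/(2(b − a)) and v0 = −b/(a − b). In this sketch the helper names are hypothetical, and a curve is labelled monotone when its vertex falls outside (0, 1), with the direction read from the slope at zero.

```python
def vertex_u(a, b):
    """Stationary point of E[V | U >= u] = (6 + (a + b) u + (b - a) u^2) / 12."""
    return -(a + b) / (2.0 * (b - a))

def vertex_v(a, b):
    """Stationary point of E[U | V >= v] = 1/2 + (b / 6) v + ((a - b) / 12) v^2."""
    return -b / (a - b)

def describe(vertex, opens_upward, slope_at_zero):
    """Interior vertex: report it with its type; otherwise the curve is monotone on [0, 1]."""
    if 0.0 < vertex < 1.0:
        return f"{vertex:.3f} {'min' if opens_upward else 'max'}"
    return "increasing" if slope_at_zero > 0 else "decreasing"

# (a_hat, b_hat) as reported in the table
estimates = {2010: (-2.2658, 0.3253), 2011: (-0.5808, -0.7153)}
for year, (a, b) in estimates.items():
    print(year,
          describe(vertex_u(a, b), b > a, a + b),
          "|",
          describe(vertex_v(a, b), a > b, b))
```

With the rounded estimates the 2010 vertices land within 0.001 of the tabulated values, and both 2011 curves come out as decreasing on [0, 1].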
Final remarks
We have explored
Copula theory applied to educational data.
Cumulative conditional expectation as a measure for decision making.
Work in progress
Mathematical and statistical properties of the cumulative conditional
expectation.
Relation between the cumulative conditional expectation and the
directional dependency given by E[V |U = u0].
Analytical expressions for other copula families, for example the
Generalized Farlie-Gumbel-Morgenstern C(u, v) = uv + f(u)g(v).
Application to data from other courses.
References
Cherubini, U., Luciano, E. and Vecchiato, W. (2004). Copula Methods in
Finance. John Wiley & Sons.
Embrechts, P., Lindskog, F. and McNeil, A. (2003). Modelling
Dependence with Copulas and Applications to Risk Management.
Handbook of Heavy Tailed Distributions in Finance. Elsevier.
Frees, E., Carriere, J. and Valdez, E. (1996). Annuity valuation with
dependent mortality. Journal of Risk and Insurance 63, 229-261.
Frees, E. and Wang, P. (2005). Credibility using copulas. North
American Actuarial Journal 9 (2), 31-48.
García, J. E., González-López, V. A. and Viola, M. L. L. (2012). Robust
model selection and the statistical classification of languages. AIP
Conference Proceedings: 11th Brazilian Bayesian Statistics Meeting, v.
1490, p. 160-170.
Genest, C. and Favre, A. C. (2007). Everything you always wanted to
know about copula modeling but were afraid to ask. Journal of
Hydrologic Engineering 12, 347-368.
Gräler, B., Kazianka, H. and M. de Espindola, G. (2010). Copulas, a
novel approach to model spatial and spatio-temporal dependence.
GIScience for Environmental Change Symposium Proceedings 40,
49-54.
Nelsen, R. B., Quesada Molina, J. J. and Rodríguez Lallena, J. A. (1997).
Bivariate copulas with cubic sections. J Nonparametr Statist 7,
205-220.