Interpretable Sparse Sliced Inverse Regression for
digitized functional data
Victor Picheny, Rémi Servien & Nathalie Villa-Vialaneix
nathalie.villa@toulouse.inra.fr
http://www.nathalievilla.org
Seminar, Institut de Mathématiques de Bordeaux
April 8, 2016
Nathalie Villa-Vialaneix | IS-SIR 1/26
Outline
1 Background and motivation
2 Presentation of SIR
3 Our proposal
4 Simulations
1 Background and motivation
A typical case study: meta-model in agronomy
[Diagram: climate (daily time series: rain, temperature...) and plant phenotypes enter the agronomic model, which outputs predictions (yield, N leaching...).]
Agronomic model:
based on biological and chemical knowledge;
computationally expensive to use;
useful for realistic predictions but not to understand the link between the inputs and the outputs.
Metamodeling: train a simplified, fast and interpretable model which can be used as a proxy for the agronomic model.
A first case study: SUNFLO [Casadebaig et al., 2011]
Inputs: 5 daily time series (length: one year) and 8 phenotypes for different sunflower types
Output: sunflower yield
Data: 1000 sunflower types × 190 climatic series (different places and years), i.e., n = 190 000 observations of variables in R^(5×183) × R^8
Main facts obtained from a preliminary study (R. Kpekou internship)
The study focused on the influence of the climate on the yield: 5 functional variables digitized at 183 points.
Main result: using summaries of the variables (mean, sd...) over several weeks, together with an automatic aggregating procedure in a random forest method, led to good prediction accuracy.
Question and mathematical framework
A functional regression problem: X is a functional random variable and Y a real random variable:
E(Y|X)?
Data: n i.i.d. observations (x_i, y_i)_{i=1,...,n}.
x_i is not perfectly known but sampled at (fixed) points: x_i = (x_i(t_1), ..., x_i(t_p))^T ∈ R^p. We denote by X the (n × p) matrix with rows x_1^T, ..., x_n^T.
Question: find a model which is easily interpretable and points out relevant intervals for the prediction within the range of X.
Related works (variable selection in FDA)
LASSO / L1 regularization in linear models: [Ferraty et al., 2010, Aneiros and Vieu, 2014] (isolated evaluation points), [Matsui and Konishi, 2011] (selects elements of an expansion basis), [James et al., 2009] (sparsity on derivatives: piecewise constant predictors)
[Fraiman et al., 2015]: blinding approach usable for various problems (PCA, regression...)
[Gregorutti et al., 2015]: adaptation of the variable importance of random forests to groups of variables
Our proposal: a semi-parametric (not entirely linear) model which selects relevant intervals, combined with an automatic procedure to define the intervals.
2 Presentation of SIR
SIR in the multidimensional framework
SIR: a semi-parametric regression model for X ∈ R^p:
Y = F(a_1^T X, ..., a_d^T X, ε)
for a_1, ..., a_d ∈ R^p (to be estimated), F : R^(d+1) → R unknown, and ε an error independent of X.
Standard assumption for SIR: Y ⊥ X | P_A(X),
in which A is the so-called EDR space, spanned by (a_k)_{k=1,...,d}.
Estimation
Equivalence between SIR and an eigendecomposition
A is included in the space spanned by the first d Σ-orthogonal eigenvectors of the generalized eigendecomposition problem
Γ a = λ Σ a,
with Σ the covariance matrix of X and Γ the covariance matrix of E(X|Y).
Estimation (when n > p)
compute X̄ = (1/n) Σ_{i=1}^n x_i and ˆΣ = (1/n) (X − 1_n X̄^T)^T (X − 1_n X̄^T);
split the range of Y into H slices τ_1, ..., τ_H and estimate X̄_h = ˆE(X | Y ∈ τ_h) = (1/n_h) Σ_{i: y_i ∈ τ_h} x_i, with n_h = |{i : y_i ∈ τ_h}|; then ˆΓ = M^T D M, where M is the (H × p) matrix with rows (X̄_h − X̄)^T and D = Diag(n_1/n, ..., n_H/n);
solving the eigendecomposition problem ˆΓ a = λ ˆΣ a gives the eigenvectors a_1, ..., a_d ⇒ ˆA = (a_1, ..., a_d), a (p × d) matrix.
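The estimation steps above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation: slicing by quantiles of y is one common choice (the slide only says "split the range of Y"), and the function name `sir_edr` is made up here.

```python
import numpy as np
from scipy.linalg import eigh

def sir_edr(X, y, H=10, d=2):
    """Slice-based SIR estimate of the EDR directions (n > p case)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)                 # center the design
    Sigma = Xc.T @ Xc / n                   # hat Sigma
    # split the range of y into H slices (here: quantile slices)
    edges = np.quantile(y, np.linspace(0, 1, H + 1))
    slice_id = np.clip(np.searchsorted(edges, y, side="right") - 1, 0, H - 1)
    M = np.zeros((H, p))                    # rows: slice means of Xc
    w = np.zeros(H)                         # slice weights n_h / n
    for h in range(H):
        mask = slice_id == h
        w[h] = mask.mean()
        M[h] = Xc[mask].mean(axis=0)
    Gamma = M.T @ (w[:, None] * M)          # hat Gamma = M^T D M
    # generalized eigenproblem: Gamma a = lambda Sigma a
    evals, evecs = eigh(Gamma, Sigma)       # ascending eigenvalues
    return evecs[:, ::-1][:, :d], evals[::-1][:d]
```

On a single-index example, the leading direction lines up with the true one (up to sign and scaling).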
Equivalent formulations
SIR as a regression problem: [Li and Yin, 2008] shows that SIR is equivalent to the (double) minimization of
E(A, C) = Σ_{h=1}^H ˆp_h ‖X̄_h − X̄ − ˆΣ A C_h‖²
for X̄_h = (1/n_h) Σ_{i: y_i ∈ τ_h} x_i, A a (p × d) matrix and C_h a vector in R^d.
Rk: given A, C is obtained as the solution of an ordinary least squares problem...
SIR as a canonical correlation problem: [Li and Nachtsheim, 2008] shows that SIR rewrites as the double optimization problem max_{a_j, φ} Cor(φ(Y), a_j^T X), where φ is any function R → R and the (a_j)_j are Σ-orthonormal.
Rk: the solution is shown to satisfy φ(y) = a_j^T E(X|Y = y), and a_j is also obtained as the solution of the mean square error problem
min_{a_j} E[(φ(Y) − a_j^T X)²].
SIR in large dimensions: problem
In large dimension (or in Functional Data Analysis), n < p, so ˆΣ is ill-conditioned and does not have an inverse ⇒ Z = (X − 1_n X̄^T) ˆΣ^{−1/2} cannot be computed.
Different solutions have been proposed in the literature, based on:
prior dimension reduction (e.g., PCA) [Ferré and Yao, 2003] (in the framework of FDA);
regularization (ridge...) [Li and Yin, 2008, Bernard-Michel et al., 2008];
sparse SIR [Li and Yin, 2008, Li and Nachtsheim, 2008, Ni et al., 2005].
SIR in large dimensions: ridge penalty / L2 regularization of ˆΣ
Following [Li and Yin, 2008], which shows that SIR is equivalent to the minimization of E(A, C), [Bernard-Michel et al., 2008] propose to add a ridge penalty in the high-dimensional setting:
E_2(A, C) = Σ_{h=1}^H ˆp_h ‖X̄_h − X̄ − ˆΣ A C_h‖² + µ_2 Σ_{h=1}^H ˆp_h ‖A C_h‖².
They also show that this problem is equivalent to finding the eigenvectors of the generalized eigenvalue problem
ˆΓ a = λ (ˆΣ + µ_2 I_p) a.
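The ridge variant only changes the right-hand matrix of the generalized eigenproblem, so a sketch is very short (assuming ˆΓ and ˆΣ have already been computed as on the previous slides; `ridge_sir_directions` is a name chosen here):

```python
import numpy as np
from scipy.linalg import eigh

def ridge_sir_directions(Gamma, Sigma, mu2, d):
    """Ridge SIR: eigenvectors of Gamma a = lambda (Sigma + mu2 I_p) a.

    mu2 > 0 makes the right-hand matrix positive definite, so the
    problem is well posed even when n < p and Sigma is singular.
    """
    p = Sigma.shape[0]
    evals, evecs = eigh(Gamma, Sigma + mu2 * np.eye(p))
    return evecs[:, ::-1][:, :d], evals[::-1][:d]
```

Each returned column satisfies the generalized eigenvalue relation with its eigenvalue, which is easy to check numerically.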
SIR in large dimensions: sparse versions
Specific issue when introducing sparsity in SIR: sparsity on a multiple-index model. Most authors use shrinkage approaches.
First version: sparse penalization of the ridge solution. If (ˆA, ˆC) are the solutions of the ridge SIR described on the previous slide, [Ni et al., 2005, Li and Yin, 2008] propose to shrink this solution by minimizing
E_{s,1}(α) = Σ_{h=1}^H ˆp_h ‖X̄_h − X̄ − ˆΣ Diag(α) ˆA ˆC_h‖² + µ_1 ‖α‖_{L1}
(regression formulation of SIR).
Second version: [Li and Nachtsheim, 2008] derive the sparse optimization problem from the correlation formulation of SIR:
min_{a_j^s} Σ_{i=1}^n (P_{ˆa_j}(X | y_i) − (a_j^s)^T x_i)² + µ_{1,j} ‖a_j^s‖_{L1},
in which P_{ˆa_j} is the projection of ˆE(X | Y = y_i) = X̄_h onto the space spanned by the solution of the ridge problem.
Characteristics of the different approaches and possible extensions

                          [Li and Yin, 2008]       [Li and Nachtsheim, 2008]
sparsity on               shrinkage coefficients   estimates
nb of optimization pbs    1                        d
sparsity                  common to all dims       specific to each dim

Extension to block-sparse SIR (like in PCA)?
3 Our proposal
IS-SIR: a two-step approach
Background: back to the functional setting, we suppose that t_1, ..., t_p are split into D intervals I_1, ..., I_D.
First step: solve the ridge problem on the digitized functions (viewed as high-dimensional vectors) to obtain ˆA and ˆC:
min_{A,C} Σ_{h=1}^H ˆp_h ‖X̄_h − X̄ − ˆΣ A C_h‖² + µ_2 Σ_{h=1}^H ˆp_h ‖A C_h‖².
Second step: sparse shrinkage using the intervals. If P_{ˆA}(E(X | Y = y_i)) = (X̄_h − X̄)^T ˆA for the h such that y_i ∈ τ_h, and if P_i = (P_i^1, ..., P_i^d)^T and P^j = (P_1^j, ..., P_n^j)^T, we solve:
argmin_{α ∈ R^D} Σ_{j=1}^d ‖P^j − (X ∆(ˆa_j)) α‖² + µ_1 ‖α‖_{L1},
with ∆(ˆa_j) the (p × D) matrix such that ∆_{lk}(ˆa_j) = ˆa_{jl} if t_l ∈ I_k and 0 otherwise.
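The second step reduces to a single LASSO over α ∈ R^D once the matrices ∆(ˆa_j) are built. Here is a sketch under stated assumptions: `interval_design` and `shrink_intervals` are names invented here, the d direction-wise problems are stacked into one regression, and scikit-learn's `Lasso` stands in for the L1 solver (the talk uses glmnet).

```python
import numpy as np
from sklearn.linear_model import Lasso

def interval_design(a_j, interval_id, D):
    """Delta(a_j): (p x D) matrix with entry (l, k) = a_j[l]
    if point t_l belongs to interval I_k, and 0 otherwise."""
    p = a_j.shape[0]
    Delta = np.zeros((p, D))
    Delta[np.arange(p), interval_id] = a_j
    return Delta

def shrink_intervals(X, P, A, interval_id, D, mu1=0.01):
    """IS-SIR shrinkage step: one common alpha in R^D over all d
    directions, estimated by a LASSO on the stacked problems.
    mu1 plays the role of the L1 penalty (chosen by CV in the talk)."""
    d = A.shape[1]
    Z = np.vstack([X @ interval_design(A[:, j], interval_id, D)
                   for j in range(d)])
    target = np.concatenate([P[:, j] for j in range(d)])
    fit = Lasso(alpha=mu1, fit_intercept=False).fit(Z, target)
    return fit.coef_   # zero entries = intervals discarded from the model
```

With noiseless projections built from a known sparse α, a small penalty recovers the non-zero intervals.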
IS-SIR: characteristics
uses the approach based on the correlation formulation (because the dimensionality of the optimization problem is smaller);
uses a shrinkage approach and optimizes the shrinkage coefficients in a single optimization problem;
handles the functional setting by penalizing entire intervals and not just isolated points.
Parameter estimation
H (number of slices): SIR is known to be not very sensitive to the number of slices (as long as H > d + 1). We took H = 10 (i.e., 10/30 observations per slice);
µ_2 and d (ridge estimate ˆA):
L-fold CV for µ_2 (for a d_0 large enough). Note that GCV as described in [Li and Yin, 2008] cannot be used, since the current version of the L2 penalty involves an estimate of Σ^{−1};
using again L-fold CV, ∀ d = 1, ..., d_0, an estimate of
R(d) = d − E[Tr(Π_d ˆΠ_d)],
in which Π_d and ˆΠ_d are the projectors onto the first d dimensions of the EDR space and of its estimate, is derived similarly as in [Liquet and Saracco, 2012]. The evolution of ˆR(d) versus d is studied to select a relevant d;
µ_1 (LASSO): glmnet is used, in which µ_1 is selected by CV along the regularization path.
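The selection of µ_1 by cross-validation along the regularization path can be mimicked outside R: the talk uses glmnet, and scikit-learn's `LassoCV` is a comparable tool (this substitution, and the name `select_mu1`, are choices made here, not part of the method's description).

```python
import numpy as np
from sklearn.linear_model import LassoCV

def select_mu1(Z, target, n_folds=5):
    """Choose the L1 penalty by n_folds-fold CV along the
    regularization path, as glmnet does with cv.glmnet."""
    fit = LassoCV(cv=n_folds, fit_intercept=False).fit(Z, target)
    return fit.alpha_, fit.coef_
```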
An automatic approach to define the intervals
1 Initial state: ∀ k = 1, ..., p, τ_k = {t_k}
2 Iterate:
along the regularization path, select three values of µ_1: one for which P% of the coefficients are zero, one for which P% of the coefficients are non-zero, and the best one for GCV;
define D− ("strong zeros") and D+ ("strong non-zeros");
merge consecutive "strong zeros" (resp. "strong non-zeros"), possibly separated by a small number of intervals of undetermined type.
Until no more merges can be performed.
3 Output: a collection of models (the first with p intervals, the last with 1), the model M*_D optimal for GCV, and the corresponding GCV_D versus D (number of intervals).
Final solution: minimize GCV_D over D.
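The merging rule in step 2 can be illustrated on a string of per-interval labels. This is a simplified reading of the procedure (the slide does not fix how many undetermined intervals may be absorbed; here a single '?' between two intervals of the same determined type is absorbed, and `merge_intervals` is a name chosen for the sketch):

```python
def merge_intervals(labels):
    """One merging pass over per-interval labels:
    '-' = strong zero, '+' = strong non-zero, '?' = undetermined.
    Consecutive intervals of the same determined type are merged,
    absorbing a single '?' sitting between them.
    Returns a list of (label, merged group size) pairs."""
    groups = []
    i = 0
    while i < len(labels):
        lab = labels[i]
        j = i + 1
        if lab in "+-":
            while j < len(labels):
                if labels[j] == lab:
                    j += 1
                elif labels[j] == "?" and j + 1 < len(labels) and labels[j + 1] == lab:
                    j += 2   # absorb one undetermined interval
                else:
                    break
        groups.append((lab, j - i))
        i = j
    return groups
```

For instance, `"--?-++?"` collapses to a zero block of size 4, a non-zero block of size 2, and one leftover undetermined interval.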
4 Simulations
Simulation framework
Data generated with Y = Σ_{j=1}^d log⟨X, a_j⟩, with X(t) = Z(t) + ε, in which Z is a Gaussian process with mean µ(t) = −5 + 4t − 4t² and the Matérn 3/2 covariance function with parameters σ = 0.1 and θ = 0.2/√3, and ε is a centered Gaussian variable independent of Z, with standard deviation 0.1;
a_j(t) = sin(t(2 + j)π/2 − (j − 1)π/3) I_{I_j}(t);
two models: (M1) d = 1, I_1 = [0.2, 0.4]; (M2) d = 3, I_1 = [0, 0.1], I_2 = [0.5, 0.65] and I_3 = [0.65, 0.78].
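A minimal sketch of the (M1) generator, under stated assumptions: one common Matérn 3/2 parametrization, σ²(1 + r/θ)exp(−r/θ), is used since the slide does not spell out the convention; the inner product is approximated by a Riemann sum; and an absolute value guards the logarithm (the projection can be negative given the mean function), which the slide does not specify. The function name `simulate_M1` is invented here.

```python
import numpy as np

def simulate_M1(n=30, p=200, sigma=0.1, theta=0.2 / np.sqrt(3), seed=0):
    """Model (M1): X(t) = Z(t) + eps on a regular grid of [0, 1],
    Y = log |<X, a_1>| with a_1 supported on [0.2, 0.4]."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0, 1, p)
    # Matern 3/2 covariance (one common parametrization)
    r = np.abs(t[:, None] - t[None, :])
    K = sigma**2 * (1 + r / theta) * np.exp(-r / theta)
    mean = -5 + 4 * t - 4 * t**2
    L = np.linalg.cholesky(K + 1e-10 * np.eye(p))   # jitter for stability
    Z = mean + rng.normal(size=(n, p)) @ L.T
    X = Z + 0.1 * rng.normal(size=(n, p))           # measurement noise eps
    j = 1
    a1 = np.sin(t * (2 + j) * np.pi / 2 - (j - 1) * np.pi / 3) \
        * ((t >= 0.2) & (t <= 0.4))
    proj = (X * a1).sum(axis=1) / p                 # <X, a_1>, Riemann sum
    Y = np.log(np.abs(proj))                        # abs(): assumption here
    return t, X, Y, a1
```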
Ridge step (model M1)
Selection of µ_2: µ_2 = 1
Selection of d: d = 1
Definition of the intervals
[Figures: estimated ˆa_1 versus t for D = 200 (initial state), D = 147 (retained solution), D = 43 and D = 5; CV error versus the number of intervals.]
Conclusion
IS-SIR:
a sparse dimension-reduction model adapted to the functional framework;
a fully automated definition of relevant intervals within the range of the predictors.
Perspectives:
application to real data;
block-wise sparse SIR?
References
Aneiros, G. and Vieu, P. (2014). Variable selection in infinite-dimensional problems. Statistics and Probability Letters, 94:12–20.
Bernard-Michel, C., Gardes, L., and Girard, S. (2008). A note on sliced inverse regression with regularizations. Biometrics, 64(3):982–986.
Casadebaig, P., Guilioni, L., Lecoeur, J., Christophe, A., Champolivier, L., and Debaeke, P. (2011). SUNFLO, a model to simulate genotype-specific performance of the sunflower crop in contrasting environments. Agricultural and Forest Meteorology, 151(2):163–178.
Ferraty, F., Hall, P., and Vieu, P. (2010). Most-predictive design points for functional data predictors. Biometrika, 97(4):807–824.
Ferré, L. and Yao, A. (2003). Functional sliced inverse regression analysis. Statistics, 37(6):475–488.
Fraiman, R., Gimenez, Y., and Svarc, M. (2015). Feature selection for functional data. Journal of Multivariate Analysis. In press.
Gregorutti, B., Michel, B., and Saint-Pierre, P. (2015). Grouped variable importance with random forests and application to multiple functional data analysis. Computational Statistics and Data Analysis, 90:15–35.
James, G., Wang, J., and Zhu, J. (2009). Functional linear regression that's interpretable. Annals of Statistics, 37(5A):2083–2108.
Li, L. and Nachtsheim, C. (2008). Sparse sliced inverse regression. Technometrics, 48(4):503–510.
Li, L. and Yin, X. (2008). Sliced inverse regression with regularizations. Biometrics, 64:124–131.
Liquet, B. and Saracco, J. (2012). A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches. Computational Statistics, 27(1):103–125.
Matsui, H. and Konishi, S. (2011). Variable selection for functional regression models via the L1 regularization. Computational Statistics and Data Analysis, 55(12):3304–3310.
Ni, L., Cook, D., and Tsai, C. (2005). A note on shrinkage sliced inverse regression. Biometrika, 92(1):242–247.
Integrating Tara Oceans datasets using unsupervised multiple kernel learning
 
Multiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasetsMultiple kernel learning applied to the integration of Tara oceans datasets
Multiple kernel learning applied to the integration of Tara oceans datasets
 

Semelhante a Interpretable Sparse Sliced Inverse Regression for digitized functional data

About functional SIR
About functional SIRAbout functional SIR
About functional SIRtuxette
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...tuxette
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Valentin De Bortoli
 
ijcai09submodularity.ppt
ijcai09submodularity.pptijcai09submodularity.ppt
ijcai09submodularity.ppt42HSQuangMinh
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionCharles Deledalle
 
Several nonlinear models and methods for FDA
Several nonlinear models and methods for FDASeveral nonlinear models and methods for FDA
Several nonlinear models and methods for FDAtuxette
 
A Coq Library for the Theory of Relational Calculus
A Coq Library for the Theory of Relational CalculusA Coq Library for the Theory of Relational Calculus
A Coq Library for the Theory of Relational CalculusYoshihiro Mizoguchi
 
Numerical solution of boundary value problems by piecewise analysis method
Numerical solution of boundary value problems by piecewise analysis methodNumerical solution of boundary value problems by piecewise analysis method
Numerical solution of boundary value problems by piecewise analysis methodAlexander Decker
 
Refresher probabilities-statistics
Refresher probabilities-statisticsRefresher probabilities-statistics
Refresher probabilities-statisticsSteve Nouri
 
Density theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsDensity theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsVjekoslavKovac1
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?Christian Robert
 
Elementary Landscape Decomposition of the Hamiltonian Path Optimization Problem
Elementary Landscape Decomposition of the Hamiltonian Path Optimization ProblemElementary Landscape Decomposition of the Hamiltonian Path Optimization Problem
Elementary Landscape Decomposition of the Hamiltonian Path Optimization Problemjfrchicanog
 

Semelhante a Interpretable Sparse Sliced Inverse Regression for digitized functional data (20)

About functional SIR
About functional SIRAbout functional SIR
About functional SIR
 
Side 2019, part 2
Side 2019, part 2Side 2019, part 2
Side 2019, part 2
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
ijcai09submodularity.ppt
ijcai09submodularity.pptijcai09submodularity.ppt
ijcai09submodularity.ppt
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
 
Slides risk-rennes
Slides risk-rennesSlides risk-rennes
Slides risk-rennes
 
Several nonlinear models and methods for FDA
Several nonlinear models and methods for FDASeveral nonlinear models and methods for FDA
Several nonlinear models and methods for FDA
 
cswiercz-general-presentation
cswiercz-general-presentationcswiercz-general-presentation
cswiercz-general-presentation
 
Slides ub-3
Slides ub-3Slides ub-3
Slides ub-3
 
QMC: Operator Splitting Workshop, Composite Infimal Convolutions - Zev Woodst...
QMC: Operator Splitting Workshop, Composite Infimal Convolutions - Zev Woodst...QMC: Operator Splitting Workshop, Composite Infimal Convolutions - Zev Woodst...
QMC: Operator Splitting Workshop, Composite Infimal Convolutions - Zev Woodst...
 
A Coq Library for the Theory of Relational Calculus
A Coq Library for the Theory of Relational CalculusA Coq Library for the Theory of Relational Calculus
A Coq Library for the Theory of Relational Calculus
 
Slides ACTINFO 2016
Slides ACTINFO 2016Slides ACTINFO 2016
Slides ACTINFO 2016
 
Numerical solution of boundary value problems by piecewise analysis method
Numerical solution of boundary value problems by piecewise analysis methodNumerical solution of boundary value problems by piecewise analysis method
Numerical solution of boundary value problems by piecewise analysis method
 
MUMS: Bayesian, Fiducial, and Frequentist Conference - Inference on Treatment...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Inference on Treatment...MUMS: Bayesian, Fiducial, and Frequentist Conference - Inference on Treatment...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Inference on Treatment...
 
Refresher probabilities-statistics
Refresher probabilities-statisticsRefresher probabilities-statistics
Refresher probabilities-statistics
 
Side 2019, part 1
Side 2019, part 1Side 2019, part 1
Side 2019, part 1
 
Density theorems for Euclidean point configurations
Density theorems for Euclidean point configurationsDensity theorems for Euclidean point configurations
Density theorems for Euclidean point configurations
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
 
Elementary Landscape Decomposition of the Hamiltonian Path Optimization Problem
Elementary Landscape Decomposition of the Hamiltonian Path Optimization ProblemElementary Landscape Decomposition of the Hamiltonian Path Optimization Problem
Elementary Landscape Decomposition of the Hamiltonian Path Optimization Problem
 

Mais de tuxette

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathstuxette
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènestuxette
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquestuxette
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-Ctuxette
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?tuxette
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...tuxette
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquestuxette
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeantuxette
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...tuxette
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquestuxette
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...tuxette
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...tuxette
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation datatuxette
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?tuxette
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysistuxette
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricestuxette
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Predictiontuxette
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random foresttuxette
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICStuxette
 

Mais de tuxette (20)

Racines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en mathsRacines en haut et feuilles en bas : les arbres en maths
Racines en haut et feuilles en bas : les arbres en maths
 
Méthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènesMéthodes à noyaux pour l’intégration de données hétérogènes
Méthodes à noyaux pour l’intégration de données hétérogènes
 
Méthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiquesMéthodologies d'intégration de données omiques
Méthodologies d'intégration de données omiques
 
Projets autour de l'Hi-C
Projets autour de l'Hi-CProjets autour de l'Hi-C
Projets autour de l'Hi-C
 
Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?Can deep learning learn chromatin structure from sequence?
Can deep learning learn chromatin structure from sequence?
 
Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...Multi-omics data integration methods: kernel and other machine learning appro...
Multi-omics data integration methods: kernel and other machine learning appro...
 
ASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiquesASTERICS : une application pour intégrer des données omiques
ASTERICS : une application pour intégrer des données omiques
 
Autour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWeanAutour des projets Idefics et MetaboWean
Autour des projets Idefics et MetaboWean
 
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
Rserve, renv, flask, Vue.js dans un docker pour intégrer des données omiques ...
 
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiquesApprentissage pour la biologie moléculaire et l’analyse de données omiques
Apprentissage pour la biologie moléculaire et l’analyse de données omiques
 
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
Quelques résultats préliminaires de l'évaluation de méthodes d'inférence de r...
 
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
Intégration de données omiques multi-échelles : méthodes à noyau et autres ap...
 
Journal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation dataJournal club: Validation of cluster analysis results on validation data
Journal club: Validation of cluster analysis results on validation data
 
Overfitting or overparametrization?
Overfitting or overparametrization?Overfitting or overparametrization?
Overfitting or overparametrization?
 
Selective inference and single-cell differential analysis
Selective inference and single-cell differential analysisSelective inference and single-cell differential analysis
Selective inference and single-cell differential analysis
 
SOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatricesSOMbrero : un package R pour les cartes auto-organisatrices
SOMbrero : un package R pour les cartes auto-organisatrices
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
 
Présentation du projet ASTERICS
Présentation du projet ASTERICSPrésentation du projet ASTERICS
Présentation du projet ASTERICS
 

Último

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 

Último (20)

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 

Interpretable Sparse Sliced Inverse Regression for digitized functional data

  • 1. Interpretable Sparse Sliced Inverse Regression for digitized functional data Victor Picheny, Rémi Servien & Nathalie Villa-Vialaneix nathalie.villa@toulouse.inra.fr http://www.nathalievilla.org Séminaire Institut de Mathématiques de Bordeaux 8 avril 2016 Nathalie Villa-Vialaneix | IS-SIR 1/26
  • 2. Sommaire 1 Background and motivation 2 Presentation of SIR 3 Our proposal 4 Simulations Nathalie Villa-Vialaneix | IS-SIR 2/26
  • 3. Sommaire 1 Background and motivation 2 Presentation of SIR 3 Our proposal 4 Simulations Nathalie Villa-Vialaneix | IS-SIR 3/26
  • 4. A typical case study: meta-model in agronomy climate (daily time series: rain, temperature...) plant phenotypes predictions (yield, N leaching...) Agronomic model Nathalie Villa-Vialaneix | IS-SIR 4/26
  • 5. A typical case study: meta-model in agronomy climate (daily time series: rain, temperature...) plant phenotypes predictions (yield, N leaching...) Agronomic model Agronomic model: based on biological and chemical knowledge; Nathalie Villa-Vialaneix | IS-SIR 4/26
  • 6. A typical case study: meta-model in agronomy climate (daily time series: rain, temperature...) plant phenotypes predictions (yield, N leaching...) Agronomic model Agronomic model: based on biological and chemical knowledge; computationally expensive to use; Nathalie Villa-Vialaneix | IS-SIR 4/26
  • 7. A typical case study: meta-model in agronomy climate (daily time series: rain, temperature...) plant phenotypes predictions (yield, N leaching...) Agronomic model Agronomic model: based on biological and chemical knowledge; computationally expensive to use; useful for realistic predictions but not to understand the link between the inputs and the outputs. Nathalie Villa-Vialaneix | IS-SIR 4/26
  • 8. A typical case study: meta-model in agronomy climate (daily time series: rain, temperature...) plant phenotypes predictions (yield, N leaching...) Agronomic model Agronomic model: based on biological and chemical knowledge; computationally expensive to use; useful for realistic predictions but not to understand the link between the inputs and the outputs. Metamodeling: train a simplified, fast and interpretable model which can be used as a proxy for the agronomic model. Nathalie Villa-Vialaneix | IS-SIR 4/26
  • 9. A first case study: SUNFLO [Casadebaig et al., 2011] Inputs: 5 daily time series (length: one year) and 8 phenotypes for different sunflower types Output: sunflower yield Data: 1000 sunflower types × 190 climatic series (different places and years) (n = 190 000) of variables in R^(5×183) × R^8 Nathalie Villa-Vialaneix | IS-SIR 5/26
  • 10. Main facts obtained from a preliminary study R. Kpekou internship The study focused on the influence of the climate on the yield: 5 functional variables digitized at 183 points. Nathalie Villa-Vialaneix | IS-SIR 6/26
  • 11. Main facts obtained from a preliminary study R. Kpekou internship The study focused on the influence of the climate on the yield: 5 functional variables digitized at 183 points. Main result: using summaries of the variables (mean, sd, ...) over several weeks, combined with an automatic aggregation procedure in a random forest, gave good predictive accuracy. Nathalie Villa-Vialaneix | IS-SIR 6/26
  • 12. Question and mathematical framework A functional regression problem: X: random variable (functional) & Y: random real variable E(Y|X)? Nathalie Villa-Vialaneix | IS-SIR 7/26
  • 13. Question and mathematical framework A functional regression problem: X: random variable (functional) & Y: random real variable E(Y|X)? Data: n i.i.d. observations (x_i, y_i)_{i=1,...,n}. x_i is not perfectly known but sampled at (fixed) points: x_i = (x_i(t_1), ..., x_i(t_p))^T ∈ R^p. We denote by X the n × p matrix with rows x_1^T, ..., x_n^T. Nathalie Villa-Vialaneix | IS-SIR 7/26
  • 14. Question and mathematical framework A functional regression problem: X: random variable (functional) & Y: random real variable E(Y|X)? Data: n i.i.d. observations (x_i, y_i)_{i=1,...,n}. x_i is not perfectly known but sampled at (fixed) points: x_i = (x_i(t_1), ..., x_i(t_p))^T ∈ R^p. We denote by X the n × p matrix with rows x_1^T, ..., x_n^T. Question: Find a model which is easily interpretable and points out relevant intervals for the prediction within the range of X. Nathalie Villa-Vialaneix | IS-SIR 7/26
  • 15. Related works (variable selection in FDA) LASSO / L1 regularization in linear models [Ferraty et al., 2010, Aneiros and Vieu, 2014] (isolated evaluation points), [Matsui and Konishi, 2011] (selects elements of an expansion basis), [James et al., 2009] (sparsity on derivatives: piecewise constant predictors) [Fraiman et al., 2015] (blinding approach usable for various problems: PCA, regression...) [Gregorutti et al., 2015] adaptation of the importance of variables in random forests for groups of variables Nathalie Villa-Vialaneix | IS-SIR 8/26
  • 16. Related works (variable selection in FDA) LASSO / L1 regularization in linear models [Ferraty et al., 2010, Aneiros and Vieu, 2014] (isolated evaluation points), [Matsui and Konishi, 2011] (selects elements of an expansion basis), [James et al., 2009] (sparsity on derivatives: piecewise constant predictors) [Fraiman et al., 2015] (blinding approach usable for various problems: PCA, regression...) [Gregorutti et al., 2015] adaptation of the importance of variables in random forests for groups of variables Our proposal: a semi-parametric (not entirely linear) model which selects relevant intervals combined with an automatic procedure to define the intervals. Nathalie Villa-Vialaneix | IS-SIR 8/26
  • 17. Outline: 1. Background and motivation; 2. Presentation of SIR; 3. Our proposal; 4. Simulations
  • 18. SIR in the multidimensional framework. SIR is a semi-parametric regression model for X ∈ ℝᵖ: Y = F(a₁ᵀX, ..., a_dᵀX, ε), for a₁, ..., a_d ∈ ℝᵖ (to be estimated), F: ℝ^{d+1} → ℝ unknown, and ε an error independent of X. Standard assumption for SIR: Y ⊥ X | P_A(X), in which A is the so-called EDR space, spanned by (a_k)_{k=1,...,d}.
  • 23. Estimation. Equivalence between SIR and an eigendecomposition: A is included in the space spanned by the first d Σ-orthogonal eigenvectors of the generalized eigendecomposition problem Γa = λΣa, with Σ = Cov(X) and Γ = Cov(E(X|Y)). Estimation (when n > p): compute X̄ = (1/n) Σᵢ xᵢ and Σ̂ = (1/n)(X − 𝟙ₙX̄ᵀ)ᵀ(X − 𝟙ₙX̄ᵀ); split the range of Y into H slices τ₁, ..., τ_H and estimate Ê(X|Y) as the H × p matrix of slice means ((1/n_h) Σ_{i: yᵢ∈τ_h} xᵢ)_{h=1,...,H}, with n_h = |{i: yᵢ ∈ τ_h}|, and Γ̂ = Ê(X|Y)ᵀ D Ê(X|Y) with D = Diag(n₁/n, ..., n_H/n); solving the eigendecomposition problem Γ̂a = λΣ̂a gives the eigenvectors a₁, ..., a_d ⇒ Â = (a₁, ..., a_d), a p × d matrix.
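The estimation steps above can be sketched numerically. This is a minimal illustration on made-up data (not the authors' code): slice the sorted responses, form the weighted covariance of the slice means, and solve the generalized eigenproblem Γ̂a = λΣ̂a.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n, p, H, d = 300, 5, 10, 1
beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0])      # true EDR direction (toy model)
X = rng.normal(size=(n, p))
y = X @ beta + 0.1 * rng.normal(size=n)

Xbar = X.mean(axis=0)
Xc = X - Xbar
Sigma = Xc.T @ Xc / n                             # empirical covariance Sigma-hat

# Split the range of Y into H slices of (roughly) equal size
slices = np.array_split(np.argsort(y), H)
means = np.stack([Xc[s].mean(axis=0) for s in slices])   # centered slice means
D = np.diag([len(s) / n for s in slices])
Gamma = means.T @ D @ means                       # between-slice covariance Gamma-hat

# Generalized eigenproblem Gamma a = lambda Sigma a (eigenvalues ascending)
eigval, eigvec = eigh(Gamma, Sigma)
A_hat = eigvec[:, ::-1][:, :d]                    # leading Sigma-orthogonal directions
```

Up to sign and scale, the leading eigenvector recovers the direction beta used to generate the toy data.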
  • 27. Equivalent formulations. SIR as a regression problem: [Li and Yin, 2008] shows that SIR is equivalent to the (double) minimization of E(A, C) = Σ_{h=1}^H p̂_h ‖X̄_h − X̄ − Σ̂ A C_h‖², for X̄_h = (1/n_h) Σ_{i: yᵢ∈τ_h} xᵢ, A a (p × d)-matrix and the C_h vectors in ℝᵈ. Remark: given A, C is obtained as the solution of an ordinary least squares problem. SIR as a canonical correlation problem: [Li and Nachtsheim, 2008] shows that SIR rewrites as the double optimization problem max_{a_j, φ} Cor(φ(Y), a_jᵀX), where φ is any function ℝ → ℝ and the (a_j)_j are Σ-orthonormal. Remark: the solution is shown to satisfy φ(y) = a_jᵀ E(X|Y = y), and a_j is also obtained as the solution of the mean square error problem min_{a_j} E[(φ(Y) − a_jᵀX)²].
  • 29. SIR in large dimensions: problem. In large dimensions (or in functional data analysis), n < p, so Σ̂ is ill-conditioned and has no inverse ⇒ Z = (X − 𝟙ₙX̄ᵀ)Σ̂^{−1/2} cannot be computed. Different solutions have been proposed in the literature, based on: prior dimension reduction (e.g., PCA) [Ferré and Yao, 2003] (in the FDA framework); regularization (ridge, ...) [Li and Yin, 2008, Bernard-Michel et al., 2008]; sparse SIR [Li and Yin, 2008, Li and Nachtsheim, 2008, Ni et al., 2005].
  • 32. SIR in large dimensions: ridge penalty / L2 regularization of Σ̂. Following [Li and Yin, 2008], which shows that SIR is equivalent to the minimization of E(A, C), [Bernard-Michel et al., 2008] propose a ridge penalty in the high-dimensional setting: E₂(A, C) = Σ_{h=1}^H p̂_h ‖X̄_h − X̄ − Σ̂ A C_h‖² + μ₂ Σ_{h=1}^H p̂_h ‖A C_h‖². They also show that this problem is equivalent to finding the eigenvectors of the generalized eigenvalue problem Γ̂a = λ(Σ̂ + μ₂ I_p)a.
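The regularized eigenproblem above can be sketched in the n < p regime, where the plain problem breaks down. A minimal illustration on made-up data (sizes and signal are assumptions): Σ̂ is singular, but Σ̂ + μ₂I_p is positive definite, so the generalized eigenproblem is well posed.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
n, p, H, mu2 = 50, 100, 5, 1.0                    # n < p: Sigma-hat is singular
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:10] = 1.0               # toy signal on the first 10 points
y = X @ beta + 0.1 * rng.normal(size=n)

Xbar = X.mean(axis=0)
Xc = X - Xbar
Sigma = Xc.T @ Xc / n                             # rank <= n - 1 < p: not invertible

slices = np.array_split(np.argsort(y), H)
means = np.stack([Xc[s].mean(axis=0) for s in slices])
D = np.diag([len(s) / n for s in slices])
Gamma = means.T @ D @ means

# Sigma alone has no inverse; Sigma + mu2 * I is positive definite
eigval, eigvec = eigh(Gamma, Sigma + mu2 * np.eye(p))
a1 = eigvec[:, -1]                                # leading regularized EDR direction
```

Without the μ₂I_p term, `eigh(Gamma, Sigma)` would be solving a generalized eigenproblem with a singular right-hand matrix, which is exactly the failure mode motivating the ridge penalty.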
  • 33. SIR in large dimensions: sparse versions. Specific issue when introducing sparsity in SIR: the sparsity bears on a multiple-index model; most authors use shrinkage approaches. First version: sparse penalization of the ridge solution. If (Â, Ĉ) are the solutions of ridge SIR as described on the previous slide, [Ni et al., 2005, Li and Yin, 2008] propose to shrink this solution by minimizing E_{s,1}(α) = Σ_{h=1}^H p̂_h ‖X̄_h − X̄ − Σ̂ Diag(α) Â Ĉ_h‖² + μ₁‖α‖_{L1} (regression formulation of SIR).
  • 34. Second version: [Li and Nachtsheim, 2008] derive the sparse optimization problem from the correlation formulation of SIR: min_{a_j^s} Σ_{i=1}^n ‖P_{â_j}(X|yᵢ) − (a_j^s)ᵀxᵢ‖² + μ_{1,j}‖a_j^s‖_{L1}, in which P_{â_j} is the projection of Ê(X|Y = yᵢ) = X̄_h onto the space spanned by the solution of the ridge problem.
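The shrinkage idea of the first version can be sketched for d = 1. Since Σ̂ Diag(α) Â Ĉ_h = Σ̂ Diag(Â Ĉ_h) α, stacking the H weighted slice problems turns E_{s,1}(α) into a single LASSO regression in α. This is a toy sketch on made-up quantities (identity Σ̂, hand-picked Â and Ĉ are assumptions), with a plain coordinate-descent LASSO so the snippet is self-contained.

```python
import numpy as np

def lasso_cd(Xm, yv, lam, n_iter=100):
    """Coordinate descent for (1/(2n))||y - X w||^2 + lam * ||w||_1."""
    n, q = Xm.shape
    w = np.zeros(q)
    col_sq = (Xm ** 2).sum(axis=0)
    for _ in range(n_iter):
        for l in range(q):
            r = yv - Xm @ w + Xm[:, l] * w[l]      # residual excluding feature l
            rho = Xm[:, l] @ r / n
            w[l] = np.sign(rho) * max(abs(rho) - lam, 0.0) / (col_sq[l] / n)
    return w

rng = np.random.default_rng(4)
p, H = 8, 5
Sigma = np.eye(p)                                  # identity for simplicity (assumption)
A_hat = 1 + rng.uniform(size=p)                    # toy ridge direction, d = 1
C_hat = np.linspace(-2, 2, H)                      # toy slice coefficients
p_hat = np.full(H, 1.0 / H)
m = np.stack([Sigma @ (A_hat * C_hat[h]) for h in range(H)])  # ideal slice means (alpha = 1)

# Stack the H weighted problems into one LASSO on alpha in R^p
Xs = np.vstack([np.sqrt(p_hat[h]) * Sigma @ np.diag(A_hat * C_hat[h]) for h in range(H)])
ys = np.concatenate([np.sqrt(p_hat[h]) * m[h] for h in range(H)])
alpha = lasso_cd(Xs, ys, lam=0.01)
```

Because the toy responses are generated with α = 1, the LASSO returns coefficients slightly shrunk below 1, illustrating how the L1 penalty operates on the shrinkage factors rather than on the direction itself.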
  • 36. Characteristics of the different approaches and possible extensions:
                          [Li and Yin, 2008]        [Li and Nachtsheim, 2008]
  sparsity on             shrinkage coefficients    estimates
  nb of optimization pbs  1                         d
  sparsity                common to all dims        specific to each dim
  Extension to block-sparse SIR (as in PCA)?
  • 37. Outline: 1. Background and motivation; 2. Presentation of SIR; 3. Our proposal; 4. Simulations
  • 40. IS-SIR: a two-step approach. Background: back in the functional setting, suppose that t₁, ..., t_p are split into D intervals I₁, ..., I_D. First step: solve the ridge problem on the digitized functions (viewed as high-dimensional vectors) to obtain Â and Ĉ: min_{A,C} Σ_{h=1}^H p̂_h ‖X̄_h − X̄ − Σ̂ A C_h‖² + μ₂ Σ_{h=1}^H p̂_h ‖A C_h‖². Second step: sparse shrinkage using the intervals. If P_Â(E(X|Y = yᵢ)) = (X̄_h − X̄)ᵀÂ for the h such that yᵢ ∈ τ_h, and if Pᵢ = (Pᵢ¹, ..., Pᵢᵈ)ᵀ and Pʲ = (P₁ʲ, ..., Pₙʲ)ᵀ, we solve arg min_{α∈ℝᴰ} Σ_{j=1}^d ‖Pʲ − (X Δ(â_j))α‖² + μ₁‖α‖_{L1}, with Δ(â_j) the (p × D)-matrix such that Δ_{lk}(â_j) = â_{jl} if t_l ∈ I_k and 0 otherwise.
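The Δ(â_j) matrix of the second step is easy to build explicitly. A minimal sketch with hypothetical sizes (p = 12 grid points, D = 3 equal intervals, toy values for â_j):

```python
import numpy as np

p, Dn = 12, 3                                     # grid points, intervals (toy sizes)
interval_of = np.repeat(np.arange(Dn), p // Dn)   # index k of the interval containing t_l
a_j = np.sin(np.linspace(0, np.pi, p))            # a ridge direction (toy values)

# Delta_{lk}(a_j) = a_{jl} if t_l in I_k, 0 otherwise: one column per interval
Delta = np.zeros((p, Dn))
Delta[np.arange(p), interval_of] = a_j

# X Delta(a_j) has one column per interval, so the LASSO on alpha in R^D
# selects or discards whole intervals rather than isolated points.
X = np.random.default_rng(2).normal(size=(20, p))
design = X @ Delta                                # shape (n, D)
```

Each row of Δ(â_j) carries the corresponding coefficient of â_j in exactly one column, which is why summing Δ over its columns recovers â_j.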
  • 41. IS-SIR: characteristics. IS-SIR uses the approach based on the correlation formulation (because the dimensionality of the optimization problem is smaller); it uses a shrinkage approach and optimizes the shrinkage coefficients in a single optimization problem; it handles the functional setting by penalizing entire intervals rather than isolated points.
  • 45. Parameter estimation. H (number of slices): SIR is known to be not very sensitive to the number of slices (as long as H > d + 1); we took H = 10 (i.e., 10/30 observations per slice). μ₂ and d (ridge estimate Â): L-fold CV for μ₂ (for some d₀ large enough); note that GCV as described in [Li and Yin, 2008] cannot be used, since the current version of the L2 penalty involves an estimate of Σ^{−1}. Then, again with L-fold CV, for all d = 1, ..., d₀, an estimate of R(d) = d − E[Tr(Π_d Π̂_d)] is derived similarly as in [Liquet and Saracco, 2012], in which Π_d and Π̂_d are the projectors onto the first d dimensions of the EDR space and of its estimate; the evolution of R̂(d) versus d is studied to select a relevant d. μ₁ (LASSO): glmnet is used, and μ₁ is selected by CV along the regularization path.
  • 51. An automatic approach to define intervals. 1. Initial state: for all k = 1, ..., p, I_k = {t_k}. 2. Iterate: along the regularization path, select three values of μ₁ (P% of the coefficients are zero; P% of the coefficients are non-zero; best GCV); use them to define D⁻ ("strong zeros") and D⁺ ("strong non-zeros"); merge consecutive "strong zeros" (or "strong non-zeros"), or "strong zeros" (resp. "strong non-zeros") separated by a small number of intervals of undetermined type; repeat until no more merges can be performed. 3. Output: a collection of models (the first with p intervals, the last with 1), the GCV-optimal model M*_D, and the corresponding GCV_D versus D (number of intervals). Final solution: minimize GCV_D over D.
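The core merging rule of step 2 can be sketched on a toy label sequence (the encoding −1 for "strong zero", +1 for "strong non-zero", 0 for undetermined is an assumption; the actual procedure also absorbs short undetermined runs between strong runs, which is omitted here):

```python
def merge_intervals(labels):
    """Merge consecutive intervals carrying the same strong label.

    Returns a list of (start, end, label) runs, inclusive indices.
    """
    runs = []
    start = 0
    for k in range(1, len(labels) + 1):
        # close the current run when the label changes or the sequence ends
        if k == len(labels) or labels[k] != labels[start]:
            runs.append((start, k - 1, labels[start]))
            start = k
    return runs

labels = [-1, -1, 0, 1, 1, 1, -1]
print(merge_intervals(labels))   # [(0, 1, -1), (2, 2, 0), (3, 5, 1), (6, 6, -1)]
```

Applied repeatedly after relabeling, this is what drives the collection of models from p intervals down to 1.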
  • 52. Outline: 1. Background and motivation; 2. Presentation of SIR; 3. Our proposal; 4. Simulations
  • 53. Simulation framework. Data generated with Y = Σ_{j=1}^d log⟨X, a_j⟩, with X(t) = Z(t) + ε, in which Z is a Gaussian process with mean μ(t) = −5 + 4t − 4t² and a Matérn 3/2 covariance function with parameters σ = 0.1 and θ = 0.2/√3, and ε is a centered Gaussian variable, independent of Z, with standard deviation 0.1; a_j(t) = sin(t(2 + j)π/2 − (j − 1)π/3) 𝟙_{I_j}(t). Two models: (M1), d = 1 and I₁ = [0.2, 0.4]; (M2), d = 3 and I₁ = [0, 0.1], I₂ = [0.5, 0.65], I₃ = [0.65, 0.78].
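A data-generation sketch close to this setting (not the authors' code; the explicit Matérn 3/2 parametrization and the absolute value inside the log, added to keep it defined, are assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 100
t = np.linspace(0, 1, p)                          # digitization grid on [0, 1]
mu_t = -5 + 4 * t - 4 * t**2                      # mean function of Z
sigma, theta = 0.1, 0.2 / np.sqrt(3)

# Matern 3/2 covariance: sigma^2 (1 + sqrt(3) d / theta) exp(-sqrt(3) d / theta)
dist = np.abs(t[:, None] - t[None, :])
K = sigma**2 * (1 + np.sqrt(3) * dist / theta) * np.exp(-np.sqrt(3) * dist / theta)

n = 30
Z = rng.multivariate_normal(mu_t, K + 1e-10 * np.eye(p), size=n)
X = Z + 0.1 * rng.normal(size=(n, p))             # digitization noise eps

# a_1 for model (M1), j = 1: supported on I_1 = [0.2, 0.4] only
a1 = np.sin(3 * np.pi * t / 2) * ((t >= 0.2) & (t <= 0.4))
y = np.log(np.abs(X @ a1))                        # abs() is an assumption (keeps log defined)
```

The sparse support of a₁ is what IS-SIR is expected to recover as the retained interval.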
  • 56–57. Ridge step (model M1). Selection of μ₂: μ₂ = 1. Selection of d: d = 1. [Figures: cross-validation curves.]
  • 58–61. Definition of the intervals: estimated direction â₁ plotted against t for D = 200 (initial state), D = 147 (retained solution), D = 43 and D = 5. [Figures: â₁ versus t.]
  • 62. Definition of the intervals. [Figure: CV error versus number of intervals.]
  • 64. Conclusion. IS-SIR: a sparse dimension-reduction model adapted to the functional framework, with a fully automated definition of the relevant intervals in the range of the predictors. Perspectives: application to real data; block-wise sparse SIR?
  • 65–66. References
  Aneiros, G. and Vieu, P. (2014). Variable selection in infinite-dimensional problems. Statistics and Probability Letters, 94:12–20.
  Bernard-Michel, C., Gardes, L., and Girard, S. (2008). A note on sliced inverse regression with regularizations. Biometrics, 64(3):982–986.
  Casadebaig, P., Guilioni, L., Lecoeur, J., Christophe, A., Champolivier, L., and Debaeke, P. (2011). SUNFLO, a model to simulate genotype-specific performance of the sunflower crop in contrasting environments. Agricultural and Forest Meteorology, 151(2):163–178.
  Ferraty, F., Hall, P., and Vieu, P. (2010). Most-predictive design points for functional data predictors. Biometrika, 97(4):807–824.
  Ferré, L. and Yao, A. (2003). Functional sliced inverse regression analysis. Statistics, 37(6):475–488.
  Fraiman, R., Gimenez, Y., and Svarc, M. (2015). Feature selection for functional data. Journal of Multivariate Analysis. In press.
  Gregorutti, B., Michel, B., and Saint-Pierre, P. (2015). Grouped variable importance with random forests and application to multiple functional data analysis. Computational Statistics and Data Analysis, 90:15–35.
  James, G., Wang, J., and Zhu, J. (2009). Functional linear regression that's interpretable. Annals of Statistics, 37(5A):2083–2108.
  Li, L. and Nachtsheim, C. (2008). Sparse sliced inverse regression. Technometrics, 48(4):503–510.
  Li, L. and Yin, X. (2008). Sliced inverse regression with regularizations. Biometrics, 64:124–131.
  Liquet, B. and Saracco, J. (2012). A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches. Computational Statistics, 27(1):103–125.
  Matsui, H. and Konishi, S. (2011). Variable selection for functional regression models via the L1 regularization. Computational Statistics and Data Analysis, 55(12):3304–3310.
  Ni, L., Cook, D., and Tsai, C. (2005). A note on shrinkage sliced inverse regression. Biometrika, 92(1):242–247.