Universal Prediction
without assuming either Discrete or Continuous
Joe Suzuki
Osaka University
November 13, 2012
Joe Suzuki (Osaka University), Universal Prediction without assuming either Discrete or Continuous, November 13, 2012
Problem
What is the probability that the sun will rise tomorrow?
Predict $x_{n+1} \in \{0, 1\}$ given $x^n := (x_1, \cdots, x_n) \in \{0, 1\}^n$.
Construct a computable $Q(x_{n+1}|x^n) \to P(x_{n+1}|x^n)$, such as
1. $Q(x_{n+1}|x^n) = \dfrac{c}{n}$
2. For $a, b > 0$, $Q(x_{n+1}|x^n) = \dfrac{c + a}{n + a + b}$
$c$: the number of occurrences of $x_{n+1}$ in $x^n$.
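The two estimators can be compared on the sunrise question with a minimal Python sketch. The function names are hypothetical, and the second estimator is written for the event "next symbol is 1", so `c` counts the ones in the history and `a` is the pseudo-count attached to 1 (an assumption about how the formula is meant to be read).

```python
def relative_frequency(history):
    """Estimator 1: c / n, where c is the number of ones in the history."""
    n = len(history)
    if n == 0:
        raise ValueError("history must be non-empty")
    return sum(history) / n


def add_constant(history, a=0.5, b=0.5):
    """Estimator 2: (c + a) / (n + a + b) with a, b > 0."""
    c = sum(history)
    return (c + a) / (len(history) + a + b)


# The sun has risen on each of the last 10 days.
sunrises = [1] * 10
print(relative_frequency(sunrises))  # probability of sunrise under estimator 1
print(add_constant(sunrises))        # probability of sunrise under estimator 2
```

Estimator 1 assigns probability exactly 1 to another sunrise, so the opposite event gets probability 0 after finitely many observations. Estimator 2 keeps both outcomes strictly positive.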
Problem
Open Problems raised by Tom Cover in 1975, Moscow
In each bet, you obtain 2 dollars if you win and lose 1 dollar otherwise.
Problem 1: Existence of a universal gambling scheme
Is there any $Q^n$ s.t.
$\frac{1}{n} \log[2^n Q^n(x^n)] \to \frac{1}{n} \log[2^n P^n(x^n)]$
a.s. as $n \to \infty$ for any unknown stationary ergodic $P^n$?
Betting without knowledge of $P^n$ asymptotically does as well as betting with that knowledge
(a Bayesian strategy realizes this property)
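The growth-rate comparison can be illustrated numerically. The sketch below is only an illustration under strong assumptions: the source is i.i.d. Bernoulli (a special case of stationary ergodic), the bettor splits the capital in proportion to the predicted probabilities at 2-for-1 odds so that the capital after n rounds is $2^n Q^n(x^n)$, and the Krichevsky-Trofimov mixture stands in for the Bayesian strategy. All function names are hypothetical.

```python
import math
import random


def log2_wealth_known(xs, theta):
    """log2 of 2^n P^n(x^n) when the bettor knows the Bernoulli parameter."""
    return sum(1 + math.log2(theta if x == 1 else 1 - theta) for x in xs)


def log2_wealth_kt(xs):
    """log2 of 2^n Q^n(x^n), with Q the Krichevsky-Trofimov mixture."""
    ones = 0
    total = 0.0
    for i, x in enumerate(xs):
        p_one = (ones + 0.5) / (i + 1)  # (c + 1/2) / (n + 1) after i symbols
        total += 1 + math.log2(p_one if x == 1 else 1 - p_one)
        ones += x
    return total


random.seed(0)
theta = 0.8
xs = [1 if random.random() < theta else 0 for _ in range(5000)]
rate_known = log2_wealth_known(xs, theta) / len(xs)
rate_kt = log2_wealth_kt(xs) / len(xs)
print(rate_known, rate_kt)  # the two per-round growth rates
```

The per-round growth rates differ by a term of order $(\log n)/n$, which is the sense in which betting without knowledge catches up in this simple case.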
Problem
Problem 2: Existence of a universal prediction scheme
Is there any $Q$ s.t. for $x \in \{0, 1\}$
$Q(x|x_{-n}^{-1}) \to P(x|x_{-\infty}^{-1})$
a.s. as $n \to \infty$ for any unknown stationary ergodic $P$?
Ornstein 1978 (discrete, Non-Bayesian)
Algoet 1992 (extended to the Polish spaces, Non-Bayesian)
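$x_{-\infty}^{-1} \in \{0, 1\}^\infty \to (\{s_k\}, \{t_k\})$, $s_0 < s_1 < \cdots$, $t_0 < t_1 < \cdots$ s.t.
$Q(x|x_{-t_k}^{-1}) = \dfrac{\#I_k(x) + 1/2}{\#I_k(0) + \#I_k(1) + 1}$
$I_k(x) = \{1 \le \tau \le s_k \mid x = x_{-\tau},\ x_{-t_k}^{-1} = x_{-\tau-t_k}^{-\tau-1}\}$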
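The counting rule for $I_k(x)$ can be sketched in Python for one fixed pair $(s, t)$. This is not the full scheme: choosing the sequences $\{s_k\}$, $\{t_k\}$ from the data is the hard part and is not attempted here. The function name and the list layout (`past[-1]` is $x_{-1}$, `past[-2]` is $x_{-2}$, and so on) are assumptions made for the illustration.

```python
def context_match_predictor(past, s, t):
    """Return [Q(0|context), Q(1|context)] from context matches in the past.

    past[-1] is x_{-1}; the context is the last t symbols. For each
    tau in 1..s, the symbol x_{-tau} is counted when the t symbols
    preceding it equal the current context.
    """
    n = len(past)
    if n < s + t:
        raise ValueError("need at least s + t past symbols")
    context = past[n - t:]
    counts = [0, 0]
    for tau in range(1, s + 1):
        end = n - tau                      # list index of x_{-tau}
        if past[end - t:end] == context:   # x_{-tau-t}^{-tau-1} matches
            counts[past[end]] += 1
    total = counts[0] + counts[1] + 1
    return [(counts[0] + 0.5) / total, (counts[1] + 0.5) / total]


past = [0, 1, 0, 1, 0, 1, 0, 1]
print(context_match_predictor(past, s=4, t=1))
```

In the alternating example the current context is `[1]`, and every earlier occurrence of that context inside the window was followed by 0, so most of the mass goes to 0.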
Problem
Bayesian for binary i.i.d. sources
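$Q^n(x^n) = \int w(\theta) P(x^n|\theta)\,d\theta$ , $P(x^n|\theta) = \theta^c (1 - \theta)^{n-c}$
For $a, b > 0$,
$w(\theta) \propto \theta^{a-1} (1 - \theta)^{b-1} \iff Q(x_{n+1}|x^n) = \dfrac{Q^{n+1}(x^{n+1})}{Q^n(x^n)} = \dfrac{c + a}{n + a + b}$
For $a = b = 1/2$ (Krichevsky-Trofimov),
$-\frac{1}{n} \log Q^n(x^n) \to H := \sum_{x \in A} -P(x) \log P(x)$
$-\frac{1}{n} \log P^n(x^n) = \frac{1}{n} \sum_{i=1}^n -\log P(x_i) \to E[-\log P(x_i)] = H$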
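For $a = b = 1/2$ the mixture integral has the closed form $Q^n(x^n) = \Gamma(c + 1/2)\,\Gamma(n - c + 1/2) / (\pi\,\Gamma(n + 1))$, which should agree with the product of the sequential predictions $(c + 1/2)/(n + 1)$. The Python sketch below checks this and compares the per-symbol code length with the entropy; the function names and the example sequence are assumptions chosen for illustration.

```python
import math


def log_kt_closed_form(c, n):
    """log Q^n(x^n) for the KT mixture, given c ones among n symbols."""
    return (math.lgamma(c + 0.5) + math.lgamma(n - c + 0.5)
            - math.lgamma(n + 1) - math.log(math.pi))


def log_kt_sequential(xs):
    """The same quantity as a product of predictions (c + 1/2) / (n + 1)."""
    ones, total = 0, 0.0
    for i, x in enumerate(xs):
        p_one = (ones + 0.5) / (i + 1)
        total += math.log(p_one if x == 1 else 1 - p_one)
        ones += x
    return total


def binary_entropy(p):
    """H in nats for a binary source with P(1) = p, 0 < p < 1."""
    return -p * math.log(p) - (1 - p) * math.log(1 - p)


xs = [1, 1, 0, 1] * 250           # n = 1000 symbols, 750 of them ones
n, c = len(xs), sum(xs)
print(-log_kt_closed_form(c, n) / n, binary_entropy(c / n))
```

The gap between the two printed numbers is of order $(\log n)/(2n)$, the usual price of not knowing $\theta$.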
Problem
Universality
There exists $Q^n$ s.t. for any $P^n$
1. $Q(x|x_{-n}^{-1}) \to P(x|x_{-\infty}^{-1})$ (1)
2. $\frac{1}{n} \log \dfrac{P^n(x^n)}{Q^n(x^n)} \to 0$ (2)
$m$-ary ($m \ge 2$) rather than binary
stationary ergodic rather than i.i.d.
Ornstein 1978: (1)
Bayesian: (2) as well as (1)
Problem
Construct $Q^n$ satisfying (2) for the general case
$X^n$ should be stationary ergodic but can be either
discrete,
continuous, or
neither of them
Counting how many times $(X_{i+1} = x_{i+1}, X_i = x_i)$ occurs does not help.
Algoet 1992 does not imply (2) for the general case.
Density Functions
Suppose a density function $f$ exists for $X$
$A$: the range of $X$
$A_0 := \{A\}$
$A_{j+1}$ is a refinement of $A_j$
Example 1: Quantize $f$ over $A = [0, 1)$ to obtain histogram approximations
$f_1$ over $A_1 = \{[0, 1/2), [1/2, 1)\}$
$f_2$ over $A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
. . .
$f_j$ over $A_j = \{[0, 2^{-j}), [2^{-j}, 2 \cdot 2^{-j}), \cdots, [(2^j - 1)2^{-j}, 1)\}$
. . .
$P_j^n(a^n) = \prod_{i=1}^n P_j(a_i)$, the probability of $a^n = (a_1, \cdots, a_n) \in A_j^n$
$Q_j^n$: a Bayesian measure
$\frac{1}{n} \log \dfrac{P_j^n(a^n)}{Q_j^n(a^n)} \to 0$ as $n \to \infty$
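The dyadic quantization of Example 1 can be sketched in Python. The sketch assumes that level $j$ splits $[0, 1)$ into $2^j$ cells of width $2^{-j}$, which matches the $A_1$ and $A_2$ listed above; the function names are hypothetical.

```python
def cell_index(x, j):
    """Index of the level-j dyadic cell of [0, 1) that contains x."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must lie in [0, 1)")
    return int(x * 2 ** j)


def cell_interval(index, j):
    """Endpoints [left, right) of a level-j cell."""
    width = 2.0 ** -j
    return (index * width, (index + 1) * width)


def histogram_density(samples, j):
    """Empirical f_j: relative cell frequency divided by the cell width."""
    m = 2 ** j
    counts = [0] * m
    for x in samples:
        counts[cell_index(x, j)] += 1
    n = len(samples)
    return [count / n * m for count in counts]


print(cell_index(0.3, 1), cell_index(0.3, 2))   # cells containing 0.3
print(cell_interval(1, 2))                      # second cell of A_2
print(histogram_density([0.1, 0.2, 0.6, 0.9], 1))
```

Because each level halves the cells of the previous one, `cell_index(x, j + 1) // 2 == cell_index(x, j)`, which is the refinement property in code form.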
Density Functions
$\lambda : \mathcal{B} \to \mathbb{R}$ (Lebesgue measure, $a = [b, c) \implies \lambda(a) = c - b$)
$(x_1, \cdots, x_n) \in (a_1, \cdots, a_n) \in A_j^n \implies$
$f_j^n(x^n) := f_j(x_1) \cdots f_j(x_n) = \dfrac{P_j(a_1) \cdots P_j(a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$
$g_j^n(x^n) := \dfrac{Q_j^n(a_1, \cdots, a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$
For $\{\omega_j\}_{j=1}^\infty$ with $\sum_j \omega_j = 1$, $\omega_j > 0$: $g^n(x^n) := \sum_{j=1}^\infty \omega_j g_j^n(x^n)$
If we choose $\{A_j\}$ such that $f_j \to f$ as $j \to \infty$, then for any $f$, almost surely
$\frac{1}{n} \log \dfrac{f^n(x^n)}{g^n(x^n)} \to 0$ (3)
B. Ryabko. IEEE Trans. on Inform. Theory, 55, 9, 2009.
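A rough Python sketch of the mixture $g^n$ for data in $[0, 1)$ follows. Several choices here are assumptions made only to get something runnable: the infinite sum is truncated to a finite number of levels and the weights $\omega_j \propto 1/(j(j+1))$ are renormalized, level $j$ uses $2^j$ equal cells, and each $Q_j^n$ is the Dirichlet(1/2, ..., 1/2) mixture (the multi-cell analogue of Krichevsky-Trofimov). Everything is computed in the log domain to avoid underflow.

```python
import math
from collections import Counter


def log_dirichlet_half(cells, m):
    """log Q^n(a^n) for the Dirichlet(1/2,...,1/2) mixture over m cells."""
    counts = Counter()
    log_q = 0.0
    for i, a in enumerate(cells):
        log_q += math.log((counts[a] + 0.5) / (i + m / 2))
        counts[a] += 1
    return log_q


def log_mixture_density(xs, levels=8):
    """log g^n(x^n), truncated to `levels` dyadic histogram levels."""
    n = len(xs)
    weights = [1.0 / (j * (j + 1)) for j in range(1, levels + 1)]
    norm = sum(weights)
    terms = []
    for j, w in enumerate(weights, start=1):
        m = 2 ** j
        cells = [min(int(x * m), m - 1) for x in xs]
        # g_j^n = Q_j^n / prod lambda(a_i), and every cell has lambda = 1/m
        log_gj = log_dirichlet_half(cells, m) + n * math.log(m)
        terms.append(math.log(w / norm) + log_gj)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Roughly equidistributed points, so the true density is close to f = 1.
xs = [(i * 0.6180339887) % 1.0 for i in range(2000)]
print(-log_mixture_density(xs) / len(xs))  # (1/n) log(f^n / g^n) with f = 1
```

For near-uniform data $\log f^n = 0$, so the printed value is the quantity in (3) and should be close to zero. A full implementation would let the number of levels grow with $n$ rather than fixing it.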
Generalized Density Functions
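Exactly when does a density function exist?
$\mathcal{B}$: the Borel sets of $\mathbb{R}$
$\mu(D)$: the probability of $D \in \mathcal{B}$
When a density function exists
The following are equivalent ($\mu \ll \lambda$):
for each $D \in \mathcal{B}$, $\lambda(D) = 0 \implies \mu(D) = 0$
$\exists$ $\mathcal{B}$-measurable $\dfrac{d\mu}{d\lambda} := f$ s.t. $\mu(D) = \int_D f(t)\,d\lambda(t)$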
Generalized Density Functions
Estimating generalized density functions
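Radon-Nikodym's Theorem
The following are equivalent ($\mu \ll \eta$):
for each $D \in \mathcal{B}$, $\eta(D) = 0 \implies \mu(D) = 0$
$\exists$ $\mathcal{B}$-measurable $\dfrac{d\mu}{d\eta} := f$ s.t. $\mu(D) = \int_D f(t)\,d\eta(t)$
Example 2: $\mu(\{k\}) > 0$, $\eta(\{k\}) := \dfrac{1}{k(k+1)}$, $k \in B := \{1, 2, \cdots\}$
$\mu(D) = \sum_{k \in D} f(k)\eta(\{k\})$ , $D \subseteq B$
$\mu \ll \eta \implies \dfrac{d\mu}{d\eta}(k) = f(k) = \dfrac{\mu(\{k\})}{\eta(\{k\})} = k(k+1)\mu(\{k\})$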
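Example 2 can be checked with exact arithmetic. In the Python sketch below, $\mu$ is taken to be geometric, $\mu(\{k\}) = 2^{-k}$; that choice is purely hypothetical (any $\mu$ with $\mu(\{k\}) > 0$ would do), as are the function names.

```python
from fractions import Fraction


def eta(k):
    """Reference measure of Example 2: eta({k}) = 1 / (k (k + 1))."""
    return Fraction(1, k * (k + 1))


def mu(k):
    """Hypothetical probability measure: mu({k}) = 2^-k."""
    return Fraction(1, 2 ** k)


def density(k):
    """f(k) = (d mu / d eta)(k) = k (k + 1) mu({k})."""
    return mu(k) / eta(k)


D = [1, 3, 4]
lhs = sum(mu(k) for k in D)
rhs = sum(density(k) * eta(k) for k in D)
print(lhs, rhs)                      # both sides of mu(D) = sum f(k) eta({k})
print(density(1), density(3))        # values of the generalized density
```

Here $\eta$ is itself a probability measure, since the partial sums telescope: $\sum_{k=1}^{K} \frac{1}{k(k+1)} = 1 - \frac{1}{K+1}$.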
Generalized Density Functions
$f_1$ over $B_1 := \{\{1\}, \{2, 3, \cdots\}\}$
$f_2$ over $B_2 := \{\{1\}, \{2\}, \{3, 4, \cdots\}\}$
. . .
$f_k$ over $B_k := \{\{1\}, \{2\}, \cdots, \{k\}, \{k+1, k+2, \cdots\}\}$
. . .
$(y_1, \cdots, y_n) \in (b_1, \cdots, b_n) \in B_k^n \implies g_k^n(y^n) := \dfrac{Q_k^n(b_1, \cdots, b_n)}{\eta(b_1) \cdots \eta(b_n)}$
$g^n(y^n) := \sum_{k=1}^\infty \omega_k g_k^n(y^n)$
If we choose $\{B_k\}$ s.t. $f_k \to f$, then for any $f$, almost surely
$\frac{1}{n} \log \dfrac{f^n(y^n)}{g^n(y^n)} \to 0$ (4)
$g^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$ estimates $P(y^n) = f^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$
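One level $g_k^n$ of this construction can be sketched in Python for positive-integer data, using $\eta(\{l\}) = 1/(l(l+1))$ from Example 2. The tail cell $\{k+1, k+2, \cdots\}$ has $\eta$-mass $1/(k+1)$ because the sum telescopes. As an assumption, $Q_k^n$ is taken to be the Dirichlet(1/2, ..., 1/2) mixture over the $k + 1$ cells of $B_k$; the function names are hypothetical.

```python
import math
from collections import Counter


def eta_cell(y, k):
    """eta-mass of the cell of B_k that contains the positive integer y."""
    if y <= k:
        return 1.0 / (y * (y + 1))   # singleton cell {y}
    return 1.0 / (k + 1)             # tail cell {k+1, k+2, ...}


def log_g_k(ys, k):
    """log g_k^n(y^n) = log Q_k^n(b^n) - sum_i log eta(b_i)."""
    m = k + 1                        # number of cells in B_k
    counts = Counter()
    log_q = 0.0
    log_eta = 0.0
    for i, y in enumerate(ys):
        cell = min(y, k + 1)         # label k + 1 stands for the tail cell
        log_q += math.log((counts[cell] + 0.5) / (i + m / 2))
        counts[cell] += 1
        log_eta += math.log(eta_cell(y, k))
    return log_q - log_eta


ys = [1, 2, 1, 1, 3, 1, 2, 1]
print(log_g_k(ys, k=3))
```

Mixing these levels with weights $\omega_k$, as in the definition of $g^n$, and multiplying by $\prod_i \eta(\{y_i\})$ would give the estimate of $P(y^n)$ described above.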
Generalized Density Functions
The original case is contained as a special case
For $C = \{0, 1, \cdots, m-1\}$, if we quantize
$C_1 = C_2 = \cdots = \{\{0\}, \{1\}, \cdots, \{m-1\}\}$
$\eta(\{0\}) = \cdots = \eta(\{m-1\}) = 1/m$
then $\mu \ll \eta$ and
$z^n \in C^n \iff c^n \in C_1^n = C_2^n = \cdots$
$\implies f^n(z^n) = \dfrac{P^n(c^n)}{(1/m)^n}$ , $g_1^n(z^n) = g_2^n(z^n) = \cdots = g^n(z^n) = \sum_{l=1}^\infty \omega_l g_l^n(z^n) = \dfrac{Q^n(c^n)}{(1/m)^n}$
$\implies \frac{1}{n} \log \dfrac{f^n(z^n)}{g^n(z^n)} = \frac{1}{n} \log \dfrac{P^n(c^n)}{Q^n(c^n)} \to 0$
The Solution
Universality in the generalized sense
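If $\mu^n \ll \eta^n$, there exists $g^n$ that does not depend on $f^n$ s.t.
$\frac{1}{n} \log \dfrac{f^n(z^n)}{g^n(z^n)} \to 0$
$\mu^n(D^n) := \int_{D^n} f^n(z^n)\,d\eta^n(z^n)$ , $\nu^n(D^n) := \int_{D^n} g^n(z^n)\,d\eta^n(z^n)$
$\dfrac{f^n(z^n)}{g^n(z^n)} = \dfrac{d\mu^n}{d\eta^n}(z^n) \Big/ \dfrac{d\nu^n}{d\eta^n}(z^n) = \dfrac{d\mu^n}{d\nu^n}(z^n)$
Theorem (Suzuki, 2011)
$\frac{1}{n} \log \dfrac{d\mu^n}{d\nu^n}(z^n) \to 0$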
The Solution
Universal Prediction in the generalized sense
The generalized universal density function tells everything:
$g(x_{n+1}|x^n) = \dfrac{g^{n+1}(x^{n+1})}{g^n(x^n)} \to f(x_{n+1}|x^n) = \dfrac{f^{n+1}(x^{n+1})}{f^n(x^n)}$
For any $D \in \mathcal{B}$,
$\nu(D|x^n) = \int_D g(x|x^n)\,d\eta(x)$
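The ratio form of the predictive density can be illustrated in Python under a heavy simplification: a single histogram level on $[0, 1)$ instead of the full mixture, with $\eta$ equal to Lebesgue measure and the Krichevsky-Trofimov-style rule inside each level. For one level, $g_j^{n+1}/g_j^n$ reduces to the predictive cell probability divided by the cell width. The function names and the fixed level `j` are assumptions.

```python
from collections import Counter


def predictive_density(history, x, j):
    """g_j(x | x^n) for one dyadic level with m = 2^j cells on [0, 1)."""
    m = 2 ** j
    counts = Counter(min(int(v * m), m - 1) for v in history)
    cell = min(int(x * m), m - 1)
    prob = (counts[cell] + 0.5) / (len(history) + m / 2)
    return prob * m                  # divide by the cell width 1/m


def predictive_probability(history, lo, hi, j):
    """nu([lo, hi) | x^n): integral of the piecewise-constant density."""
    m = 2 ** j
    total = 0.0
    for cell in range(m):
        left, right = cell / m, (cell + 1) / m
        overlap = max(0.0, min(hi, right) - max(lo, left))
        if overlap > 0:
            total += predictive_density(history, left, j) * overlap
    return total


history = [0.1, 0.2, 0.7]
print(predictive_density(history, 0.3, j=1))
print(predictive_probability(history, 0.0, 0.5, j=1))
```

With the mixture over all levels the same ratio $g^{n+1}/g^n$ applies, but each level then enters with a posterior weight proportional to $\omega_j g_j^n(x^n)$.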
Summary
Summary and Discussion
Universal Prediction
Connection to Universal Bayesian Measures
Generalization without assuming Discrete or Continuous
Stronger universality in the sense of Bayes
Many applications other than prediction
Bayesian network structure estimation (DCC 2012)
The Bayesian Chow-Liu Algorithm (PGM 2012)
Markov order estimation even when {Xi } is continuous
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
 
IDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicineIDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicine
 
STS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATION
STS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATIONSTS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATION
STS-UNIT 4 CLIMATE CHANGE POWERPOINT PRESENTATION
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 

Universal Prediction without assuming either Discrete or Continuous

  • 1. Universal Prediction without assuming either Discrete or Continuous
    Joe Suzuki (Osaka University), November 13, 2012
  • 2. Problem
    What is the probability that the sun will rise tomorrow?
    Predict $x_{n+1} \in \{0, 1\}$ given $x^n := (x_1, \cdots, x_n) \in \{0, 1\}^n$.
    Construct a computable $Q(x_{n+1} \mid x^n) \to P(x_{n+1} \mid x^n)$, such as
      1. $Q(x_{n+1} \mid x^n) = \frac{c}{n}$
      2. for $a, b > 0$, $Q(x_{n+1} \mid x^n) = \frac{c + a}{n + a + b}$
    where $c$ is the number of occurrences of $x_{n+1}$ in $x^n$.
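The two estimators on slide 2 can be sketched as follows. This is a minimal illustration, not the author's code: the function names are hypothetical, and the sketch assumes that $a$ is the pseudo-count attached to the symbol 1 and $b$ the pseudo-count attached to the symbol 0.

```python
def relative_frequency(xs):
    """Estimator 1: Q(x_{n+1} = 1 | x^n) = c / n, with c the number of ones."""
    if not xs:
        raise ValueError("c / n is undefined for an empty sequence")
    return sum(xs) / len(xs)


def add_constant_estimate(xs, a=0.5, b=0.5):
    """Estimator 2: Q(x_{n+1} = 1 | x^n) = (c + a) / (n + a + b).

    Assumption of this sketch: a is the pseudo-count for symbol 1 and
    b the pseudo-count for symbol 0.
    """
    n = len(xs)
    c = sum(xs)  # number of ones observed so far
    return (c + a) / (n + a + b)


# The sun has risen on each of 10 observed days.
sunrises = [1] * 10
freq = relative_frequency(sunrises)               # assigns no mass to "no sunrise"
laplace = add_constant_estimate(sunrises, 1, 1)   # a = b = 1
kt = add_constant_estimate(sunrises)              # a = b = 1/2
```

Unlike the relative frequency, the second estimator is defined for $n = 0$ and never assigns probability zero to an unseen symbol.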
  • 3. Problem: Open Problems raised by Tom Cover in 1975, Moscow
    In the betting, obtain 2 dollars if you win, or lose 1 dollar otherwise.
    Problem 1: Existence of a universal gambling scheme.
    Is there any $Q^n$ s.t.
      $\frac{1}{n} \log[2^n Q^n(x^n)] \to \frac{1}{n} \log[2^n P^n(x^n)]$
    a.s. as $n \to \infty$ for any unknown stationary ergodic $P^n$?
    Betting without knowledge converges to betting with knowledge (a Bayesian strategy realizes this property).
  • 4. Problem
    Problem 2: Existence of a universal prediction scheme.
    Is there any $Q$ s.t. for $x \in \{0, 1\}$
      $Q(x \mid x_{-n}^{-1}) \to P(x \mid x_{-\infty}^{-1})$
    a.s. as $n \to \infty$ for any unknown stationary ergodic $P$?
    Ornstein 1978 (discrete, non-Bayesian)
    Algoet 1992 (extended to Polish spaces, non-Bayesian)
    $x_{-\infty}^{-1} \in \{0, 1\}^\infty \to (\{s_k\}, \{t_k\})$, $s_0 < s_1 < \cdots$, $t_0 < t_1 < \cdots$ s.t.
      $Q(x \mid x_{-t_k}^{-1}) = \frac{\#I_k(x) + 1/2}{\#I_k(0) + \#I_k(1) + 1}$
      $I_k(x) = \{1 \le \tau \le s_k \mid x = x_{-\tau},\ x_{-t_k}^{-1} = x_{-\tau - t_k}^{-\tau - 1}\}$
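The counting rule on slide 4 can be illustrated for one fixed pair $(s, t)$. This is only a sketch under stated assumptions: the slide does not say how the sequences $\{s_k\}$, $\{t_k\}$ are chosen from the data, so the function below (a hypothetical name) takes `s` and `t` as given and evaluates the ratio of counts, nothing more.

```python
def context_match_estimate(past, x, s, t):
    """Evaluate (#I(x) + 1/2) / (#I(0) + #I(1) + 1) for fixed s and t.

    past[-1] is x_{-1}, past[-2] is x_{-2}, and so on (binary symbols).
    I(x) collects the lags 1 <= tau <= s at which x_{-tau} == x and the
    t symbols before x_{-tau} equal the t most recent symbols.
    """
    length = len(past)
    if length < s + t:
        raise ValueError("need at least s + t past symbols")
    context = past[length - t:]  # x_{-t}, ..., x_{-1}

    def count(symbol):
        return sum(
            1
            for tau in range(1, s + 1)
            if past[length - tau] == symbol
            and past[length - tau - t:length - tau] == context
        )

    return (count(x) + 0.5) / (count(0) + count(1) + 1)


# Alternating past ending in 1: every earlier 1 was followed by a 0.
past = [0, 1, 0, 1, 0, 1, 0, 1]
q0 = context_match_estimate(past, 0, s=6, t=1)
q1 = context_match_estimate(past, 1, s=6, t=1)
```

With this past, three lags match the context and all three are followed by 0, so the estimate puts most of its mass on 0 while the added 1/2 keeps the probability of 1 positive.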
  • 5. Problem: Bayesian for binary i.i.d. sources
      $Q^n(x^n) = \int w(\theta) P(x^n \mid \theta) \, d\theta$, $P(x^n \mid \theta) = \theta^c (1 - \theta)^{n - c}$
    For $a, b > 0$,
      $w(\theta) \propto \theta^{a - 1} (1 - \theta)^{b - 1} \iff Q(x_{n+1} \mid x^n) = \frac{Q^{n+1}(x^{n+1})}{Q^n(x^n)} = \frac{c + a}{n + a + b}$
    For $a = b = 1/2$ (Krichevsky-Trofimov),
      $-\frac{1}{n} \log Q^n(x^n) \to H := \sum_{x \in A} -P(x) \log P(x)$
      $-\frac{1}{n} \log P^n(x^n) = \frac{1}{n} \sum_{i=1}^n -\log P(x_i) \to E[-\log P(x_i)] = H$
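The equivalence on slide 5 between the mixture $Q^n$ and the sequential rule $(c + a)/(n + a + b)$ can be checked numerically. The sketch below assumes the prior is the Beta($a$, $b$) density, for which the integral has the closed form $B(c + a, n - c + b) / B(a, b)$; the function names are hypothetical.

```python
import math


def sequential_log_prob(xs, a=0.5, b=0.5):
    """log Q^n(x^n) as a product of the predictive terms (c + a) / (n + a + b)."""
    log_q = 0.0
    ones = 0
    for n, x in enumerate(xs):
        p_one = (ones + a) / (n + a + b)
        log_q += math.log(p_one if x == 1 else 1.0 - p_one)
        ones += x
    return log_q


def beta_mixture_log_prob(xs, a=0.5, b=0.5):
    """log of the integral of w(theta) theta^c (1 - theta)^(n - c) under a Beta(a, b) prior."""
    n, c = len(xs), sum(xs)

    def log_beta(p, q):
        return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

    return log_beta(c + a, n - c + b) - log_beta(a, b)


xs = [1, 0, 1, 1, 0]
seq = sequential_log_prob(xs)
mix = beta_mixture_log_prob(xs)

# Per-symbol code length, to be compared with the entropy H for long sequences.
code_length_per_symbol = -seq / len(xs)
```

Both functions return the same value up to rounding, which is the content of the "$\iff$" on the slide for this prior.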
  • 6. Problem: Universality
    There exists $Q^n$ s.t. for any $P^n$
      1. $Q(x \mid x_{-n}^{-1}) \to P(x \mid x_{-\infty}^{-1})$   (1)
      2. $\frac{1}{n} \log \frac{P^n(x^n)}{Q^n(x^n)} \to 0$   (2)
    $m$-ary ($m \ge 2$) rather than binary
    stationary ergodic rather than i.i.d.
    Ornstein 1978: (1)
    Bayesian: (2) as well as (1)
  • 7. Problem
    Construct $Q^n$ satisfying (2) for the general case.
    $X^n$ should be stationary ergodic but can be discrete, continuous, or neither.
    Counting how often $(X = x_{i+1},\ X^i = x^i)$ occurs does not help.
    Algoet 1992 does not imply (2) for the general case.
  • 8. Density Functions
    Suppose a density function $f$ exists for $X$.
    $A$: the range of $X$; $A_0 := \{A\}$; $A_{j+1}$ is a refinement of $A_j$.
    Example 1: Quantize $f$ over $A = [0, 1)$ to obtain histogram approximations
      $f_1$ over $A_1 = \{[0, 1/2), [1/2, 1)\}$
      $f_2$ over $A_2 = \{[0, 1/4), [1/4, 1/2), [1/2, 3/4), [3/4, 1)\}$
      ...
      $f_j$ over $A_j = \{[0, 2^{-j}), [2^{-j}, 2 \cdot 2^{-j}), \cdots, [(2^j - 1) 2^{-j}, 1)\}$
      ...
    $P_j^n(a^n) = \prod_{i=1}^n P_j(a_i)$: the probability of $a^n = (a_1, \cdots, a_n) \in A_j^n$
    $Q_j^n$: a Bayesian measure, for which
      $\frac{1}{n} \log \frac{P_j^n(a^n)}{Q_j^n(a^n)} \to 0$ as $n \to \infty$
  • 9. Density Functions
    $\lambda : \mathcal{B} \to \mathbb{R}$ (Lebesgue measure, $a = [b, c) \implies \lambda(a) = c - b$)
    $(x_1, \cdots, x_n) \in (a_1, \cdots, a_n) \in A_j^n \implies$
      $f_j^n(x^n) := f_j(x_1) \cdots f_j(x_n) = \frac{P_j(a_1) \cdots P_j(a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$
      $g_j^n(x^n) := \frac{Q_j^n(a_1, \cdots, a_n)}{\lambda(a_1) \cdots \lambda(a_n)}$
    For $\{\omega_j\}_{j=1}^\infty$ with $\sum_j \omega_j = 1$, $\omega_j > 0$,
      $g^n(x^n) := \sum_{j=1}^\infty \omega_j g_j^n(x^n)$
    If we choose $\{A_j\}$ such that $f_j \to f$ as $j \to \infty$, then for any $f$, almost surely
      $\frac{1}{n} \log \frac{f^n(x^n)}{g^n(x^n)} \to 0$   (3)
    B. Ryabko. IEEE Trans. on Inform. Theory, 55, 9, 2009.
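The mixture $g^n$ on slide 9 can be sketched for data in $[0, 1)$ with dyadic partitions of $2^j$ cells. Several choices here are assumptions of the sketch rather than statements from the slides: the Bayesian measure $Q_j^n$ is taken to be the Krichevsky-Trofimov (Dirichlet 1/2) mixture over the cells, the weights are $\omega_j \propto 1/(j(j+1))$, and the infinite sum is truncated at `max_level` and renormalized. The function names are hypothetical.

```python
import math
from collections import Counter


def log_kt(cells, m):
    """log Q^n of a cell sequence under the m-ary KT (add-1/2) rule."""
    counts = Counter()
    total = 0.0
    for i, a in enumerate(cells):
        total += math.log((counts[a] + 0.5) / (i + m / 2))
        counts[a] += 1
    return total


def log_mixture_density(xs, max_level=8):
    """log g^n(x^n) = log sum_j w_j Q_j^n(a^n) / (lambda(a_1) ... lambda(a_n))."""
    n = len(xs)
    raw = [1.0 / (j * (j + 1)) for j in range(1, max_level + 1)]
    z = sum(raw)  # renormalize because the sum over j is truncated
    terms = []
    for j, w in enumerate(raw, start=1):
        m = 2 ** j  # number of cells, each of Lebesgue measure 2^-j
        cells = [min(int(x * m), m - 1) for x in xs]
        log_g_j = log_kt(cells, m) + n * j * math.log(2)
        terms.append(math.log(w / z) + log_g_j)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


single = log_mixture_density([0.3])           # one point: every level gives density 1
clustered = log_mixture_density([0.1] * 20)   # concentrated data: density above uniform
```

Working in the log domain keeps the products of many small cell probabilities from underflowing.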
  • 10. Generalized Density Functions
    Exactly when does a density function exist?
    $\mathcal{B}$: the Borel sets of $\mathbb{R}$
    $\mu(D)$: the probability of $D \in \mathcal{B}$
    When a density function exists: the following are equivalent ($\mu \ll \lambda$):
      for each $D \in \mathcal{B}$, $\lambda(D) = 0 \implies \mu(D) = 0$
      $\exists$ $\mathcal{B}$-measurable $\frac{d\mu}{d\lambda} := f$ s.t. $\mu(D) = \int_D f(t) \, d\lambda(t)$
  • 11. Generalized Density Functions: Estimating generalized density functions
    Radon-Nikodym's Theorem: the following are equivalent ($\mu \ll \eta$):
      for each $D \in \mathcal{B}$, $\eta(D) = 0 \implies \mu(D) = 0$
      $\exists$ $\mathcal{B}$-measurable $\frac{d\mu}{d\eta} := f$ s.t. $\mu(D) = \int_D f(t) \, d\eta(t)$
    Example 2: $\mu(\{k\}) > 0$, $\eta(\{k\}) := \frac{1}{k(k+1)}$, $k \in B := \{1, 2, \cdots\}$
      $\mu(D) = \sum_{k \in D} f(k) \eta(\{k\})$, $D \subseteq B$
      $\mu \ll \eta \implies \frac{d\mu}{d\eta}(k) = f(k) = \frac{\mu(\{k\})}{\eta(\{k\})} = k(k+1) \mu(\{k\})$
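Example 2 on slide 11 can be worked through with exact arithmetic. The measure $\mu(\{k\}) = 2^{-k}$ used below is a hypothetical choice made only for illustration; $\eta(\{k\}) = 1/(k(k+1))$ is the reference measure from the slide.

```python
from fractions import Fraction


def eta(k):
    """Reference measure from the slide: eta({k}) = 1 / (k (k + 1))."""
    return Fraction(1, k * (k + 1))


def mu(k):
    """Hypothetical probability measure for this sketch: mu({k}) = 2^-k."""
    return Fraction(1, 2 ** k)


def density(k):
    """Radon-Nikodym derivative dmu/deta at k, equal to k (k + 1) mu({k})."""
    return mu(k) / eta(k)


def mu_of(events):
    """Recover mu(D) as the sum of f(k) eta({k}) over k in D."""
    return sum((density(k) * eta(k) for k in events), Fraction(0))


f3 = density(3)           # 3 * 4 * 2^-3
mass = mu_of({1, 2, 3})   # 1/2 + 1/4 + 1/8
```

The density $f$ depends on the reference measure, while the product $f(k)\,\eta(\{k\})$ recovers $\mu(\{k\})$ whichever $\eta$ is used.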
  • 12. Generalized Density Functions
      $f_1$ over $B_1 := \{\{1\}, \{2, 3, \cdots\}\}$
      $f_2$ over $B_2 := \{\{1\}, \{2\}, \{3, 4, \cdots\}\}$
      ...
      $f_k$ over $B_k := \{\{1\}, \{2\}, \cdots, \{k\}, \{k + 1, k + 2, \cdots\}\}$
      ...
    $(y_1, \cdots, y_n) \in (b_1, \cdots, b_n) \in B_k^n \implies$
      $g_k^n(y^n) := \frac{Q_k^n(b_1, \cdots, b_n)}{\eta(b_1) \cdots \eta(b_n)}$
      $g^n(y^n) := \sum_{k=1}^\infty \omega_k g_k^n(y^n)$
    If we choose $\{B_k\}$ s.t. $f_k \to f$, then for any $f$, almost surely
      $\frac{1}{n} \log \frac{f^n(y^n)}{g^n(y^n)} \to 0$   (4)
    $g^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$ estimates $P(y^n) = f^n(y^n) \prod_{i=1}^n \eta(\{y_i\})$
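The construction on slide 12 can be sketched for positive-integer data. As assumptions of this sketch (not given on the slides), $Q_k^n$ is the Krichevsky-Trofimov rule over the $k + 1$ cells of $B_k$, the weights are $\omega_k \propto 1/(k(k+1))$, and the sum is truncated at `max_level` and renormalized. The tail cell $\{k+1, k+2, \cdots\}$ has $\eta$-measure $\sum_{i > k} \frac{1}{i(i+1)} = \frac{1}{k+1}$. The function names are hypothetical.

```python
import math
from collections import Counter


def log_kt(cells, m):
    """log Q^n of a cell sequence under the m-ary KT (add-1/2) rule."""
    counts = Counter()
    total = 0.0
    for i, a in enumerate(cells):
        total += math.log((counts[a] + 0.5) / (i + m / 2))
        counts[a] += 1
    return total


def log_eta_point(y):
    """log eta({y}) with eta({y}) = 1 / (y (y + 1))."""
    return -math.log(y * (y + 1))


def log_estimated_prob(ys, max_level=8):
    """log of g^n(y^n) * prod_i eta({y_i}), the estimate of P(y^n)."""
    raw = [1.0 / (k * (k + 1)) for k in range(1, max_level + 1)]
    z = sum(raw)  # renormalize because the sum over k is truncated
    terms = []
    for k, w in enumerate(raw, start=1):
        # Cell of y in B_k: the singleton {y} if y <= k, else the tail cell k + 1.
        cells = [y if y <= k else k + 1 for y in ys]
        log_eta_cells = sum(
            log_eta_point(y) if y <= k else -math.log(k + 1) for y in ys
        )
        terms.append(math.log(w / z) + log_kt(cells, k + 1) - log_eta_cells)
    top = max(terms)  # log-sum-exp for numerical stability
    log_g = top + math.log(sum(math.exp(t - top) for t in terms))
    return log_g + sum(log_eta_point(y) for y in ys)


p_one = math.exp(log_estimated_prob([1], max_level=2))
```

For a single observation $y = 1$ and two levels, the renormalized weights are 3/4 and 1/4 and the level densities are 1 and 2/3, so the estimate is $(3/4 + 1/6) \cdot 1/2 = 11/24$.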
  • 13. Generalized Density Functions: The original case is contained as a special case
    For $C = \{0, 1, \cdots, m - 1\}$, if we quantize
      $C_1 = C_2 = \cdots = \{\{0\}, \{1\}, \cdots, \{m - 1\}\}$
      $\eta(\{0\}) = \cdots = \eta(\{m - 1\}) = 1/m$
    then $\mu \ll \eta$ and $z^n \in C^n \iff c^n \in C_1^n = C_2^n = \cdots$, which implies
      $f^n(z^n) = \frac{P^n(c^n)}{(1/m)^n}$
      $g_1^n(z^n) = g_2^n(z^n) = \cdots = g^n(z^n) = \sum_{l=1}^\infty \omega_l g_l^n(z^n) = \frac{Q^n(c^n)}{(1/m)^n}$
    and therefore
      $\frac{1}{n} \log \frac{f^n(z^n)}{g^n(z^n)} = \frac{1}{n} \log \frac{P^n(c^n)}{Q^n(c^n)} \to 0$
  • 14. The Solution: Universality in the generalized sense
    If $\mu^n \ll \eta^n$, there exists $g^n$ that does not depend on $f^n$ s.t.
      $\frac{1}{n} \log \frac{f^n(z^n)}{g^n(z^n)} \to 0$
      $\mu^n(D^n) := \int_{D^n} f^n(z^n) \, d\eta^n(z^n)$, $\nu^n(D^n) := \int_{D^n} g^n(z^n) \, d\eta^n(z^n)$
      $\frac{f^n(z^n)}{g^n(z^n)} = \frac{d\mu^n}{d\eta^n}(z^n) \Big/ \frac{d\nu^n}{d\eta^n}(z^n) = \frac{d\mu^n}{d\nu^n}(z^n)$
    Theorem (Suzuki, 2011):
      $\frac{1}{n} \log \frac{d\mu^n}{d\nu^n}(z^n) \to 0$
  • 15. The Solution: Universal Prediction in the generalized sense
    The generalized universal density function tells everything:
      $g(x_{n+1} \mid x^n) = \frac{g^{n+1}(x^{n+1})}{g^n(x^n)} \to f(x_{n+1} \mid x^n) = \frac{f^{n+1}(x^{n+1})}{f^n(x^n)}$
    For any $D \in \mathcal{B}$,
      $\nu(D \mid x^n) = \int_D g(x \mid x^n) \, d\eta(x)$
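The predictive density $g(x_{n+1} \mid x^n) = g^{n+1}(x^{n+1}) / g^n(x^n)$ on slide 15 can be sketched for data in $[0, 1)$ with $\eta$ taken as Lebesgue measure. The construction of $g^n$ below rests on assumptions that are not in the slides: dyadic partitions with $2^j$ cells, the Krichevsky-Trofimov rule as $Q_j^n$, weights $\omega_j \propto 1/(j(j+1))$, and truncation at `max_level`. The function names are hypothetical.

```python
import math
from collections import Counter


def log_g(xs, max_level):
    """log g^n(x^n) for a truncated mixture over dyadic partitions of [0, 1)."""
    n = len(xs)
    raw = [1.0 / (j * (j + 1)) for j in range(1, max_level + 1)]
    z = sum(raw)
    terms = []
    for j, w in enumerate(raw, start=1):
        m = 2 ** j
        counts = Counter()
        log_q = 0.0
        for i, x in enumerate(xs):
            cell = min(int(x * m), m - 1)
            log_q += math.log((counts[cell] + 0.5) / (i + m / 2))  # KT rule
            counts[cell] += 1
        terms.append(math.log(w / z) + log_q + n * j * math.log(2))
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def predictive_density(x_next, xs, max_level=4):
    """g(x_{n+1} | x^n) as the ratio g^{n+1}(x^{n+1}) / g^n(x^n)."""
    return math.exp(log_g(list(xs) + [x_next], max_level) - log_g(xs, max_level))


def predictive_prob(lo, hi, xs, max_level=4):
    """nu([lo, hi) | x^n) for endpoints on the finest dyadic grid.

    The predictive density is constant on the finest cells, so summing
    midpoint values times the cell width gives the integral exactly.
    """
    m = 2 ** max_level
    first, last = round(lo * m), round(hi * m)
    return sum(predictive_density((c + 0.5) / m, xs, max_level) for c in range(first, last)) / m


observed = [0.10, 0.12, 0.80]
total_mass = predictive_prob(0.0, 1.0, observed)
left_half = predictive_prob(0.0, 0.5, observed)
```

Because each level's KT probabilities sum to one over its cells, the predictive density integrates to one over $[0, 1)$, which gives a quick sanity check on an implementation.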
  • 16. Summary: Summary and Discussion
    Universal Prediction
      Connection to universal Bayesian measures
      Generalization without assuming discrete or continuous
      Stronger universality in the sense of Bayes
    Many applications besides prediction
      Bayesian network structure estimation (DCC 2012)
      The Bayesian Chow-Liu Algorithm (PGM 2012)
      Markov order estimation even when $\{X_i\}$ is continuous