PORTFOLIO METHODS IN UNCERTAIN CONTEXTS
Jialin LIU
advised by: Olivier Teytaud & Marc Schoenauer
TAO, Inria, Univ. Paris-Saclay, UMR CNRS 8623, France
December 11, 2015
Motivation
Why noisy optimization (i.e. optimization against a stochastic model)?
Not that many works on noisy optimization
faults in networks: you cannot use an average over 50 years
(many lines would be 100% guaranteed) ⇒ you need a (stochastic) model of faults
Why adversarial (i.e. worst-case) problems?
Critical problems with uncertainties
(technological breakthroughs, CO2 penalization ...)
Why portfolio (i.e. combining/selecting solvers)?
Great in combinatorial optimization → let us generalize :)
Why MCTS?
Great recent tool
Still many things to do
All related?
All applicable to games
All applicable to power systems
Nash ⇒ mixed strategy portfolio
1 Motivation
2 Noisy Optimization
Optimization criteria for black-box noisy optimization
Optimization methods
Resampling methods
Pairing
3 Portfolio and noisy optimization
Portfolio: state of the art
Relationship between portfolio and noisy optimization
Portfolio of noisy optimization methods
Conclusion
4 Adversarial portfolio
Adversarial bandit
Adversarial Framework
State-of-the-art
Contribution for computing Nash Equilibrium
Sparsity: sparse NE can be computed faster
Parameter-free adversarial bandit for large-scale problems
Application to robust optimization (power systems)
Application to games
Conclusion
5 Conclusion
Black-box Noisy Optimization Framework
f : x → f(x, ω)
from a domain D ⊂ R^d (→ continuous optimization) to R, with random variable ω.
Goal:
x* = argmin_{x ∈ R^d} E_ω f(x, ω)
i.e. access to independent evaluations of f.
Black-box case:
→ do not use any internal property of f
→ access to noisy evaluations f(x, ω) only, not to E_ω f(x, ω)
→ for a given x: the oracle randomly samples ω and returns f(x, ω)
→ for its nth request, it returns f(x, ω_n)
x −→ [black box] −→ f(x, ω)
Optimization criteria: State-of-the-art
Noise-free case: log-linear convergence [Auger, 2005, Rechenberg, 1973]
log ||x_n − x*|| / n ∼ A < 0   (1)
Noisy case: log-log convergence [Fabian, 1967]
log ||x_n − x*|| / log(n) ∼ A < 0   (2)
Figure: y-axis: log ||x_n − x*||; x-axis: #eval for log-linear convergence in the noise-free case, or
log #eval for log-log convergence in the noisy case.
Optimization criteria: Convergence rates
Slopes for Uniform Rate, Simple Regret¹ [Bubeck et al., 2011] and Cumulative Regret
x*: the optimum of f
x_n: the nth evaluated search point
x̃_n: the optimum estimated after the nth evaluation
Uniform Rate: UR_n = ||x_n − x*|| → all search points matter
Simple Regret: SR_n = E_ω f(x̃_n, ω) − E_ω f(x*, ω) → final recommendation matters
Cumulative Regret: CR_n = Σ_{j ≤ n} (E_ω f(x_j, ω) − E_ω f(x*, ω)) → all recommendations matter
Convergence rates:
Slope(UR) = limsup_{n→∞} log(UR_n) / log(n)   (3)
Slope(SR) = limsup_{n→∞} log(SR_n) / log(n)   (4)
Slope(CR) = limsup_{n→∞} log(CR_n) / log(n)   (5)
¹ Simple Regret = difference between expected payoff recommended vs optimal.
Tricks for handling noise:
Resampling: average multiple evaluations
Large population
Surrogate models
Specific methods (stochastic gradient descent with finite differences)
Here: focus on resampling
Resampling number: how many times do we resample the noise?
Resampling methods: Non-adaptive resampling methods
[Recall] log-log convergence: log ||x_n − x*|| / log(n) ∼ A < 0, where n is the evaluation number
Non-adaptive rules:
Exponential rules with ad hoc parameters
⇒ log-log convergence (mathematically proved by us)
Other rules as a function of #iter: square-root, linear, polynomial rules
Other rules as a function of #iter and dimension (a sketch of such rules follows)
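Below, a minimal Python sketch of such non-adaptive schedules; the function names and constants are ours, for illustration only (the exponential rule is the one for which log-log convergence is proved, with ad hoc parameters):

```python
import math

# Illustrative non-adaptive resampling schedules (names and constants are
# ours, not from the talk): the resampling number depends only on the
# iteration counter, so later iterations average away more noise.

def exponential_rule(iteration, base=1.1):
    """Exponential rule: resample roughly base^iteration times."""
    return math.ceil(base ** iteration)

def polynomial_rule(iteration, exponent=2):
    """Polynomial rule: resample roughly iteration^exponent times."""
    return math.ceil(iteration ** exponent)

def averaged_evaluation(noisy_f, x, num_resamplings):
    """Average num_resamplings independent noisy evaluations of x."""
    return sum(noisy_f(x) for _ in range(num_resamplings)) / num_resamplings
```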
Resampling methods: Adaptive resampling methods
Adaptive rules: Bernstein [Mnih et al., 2008, Heidrich-Meisner and Igel, 2009]
Here:
FOR each pair of search points x, x′ to be compared DO
  WHILE computation time is not elapsed DO
    1000 resamplings for x and x′
    IF |mean(difference)| >> std THEN
      break
    ENDIF
  ENDWHILE
ENDFOR
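A hedged Python sketch of this adaptive loop; the batch size, the stopping factor and the noisy_f interface are our illustrative assumptions, not the exact procedure from the thesis:

```python
import statistics

def adaptive_compare(noisy_f, x, x_prime, batch=1000, max_batches=100, factor=10.0):
    """Resample both points in batches until the mean difference clearly
    dominates the standard error of that mean (a Bernstein-race-like
    stopping rule), or until the batch budget is exhausted."""
    diffs = []
    mean = 0.0
    for _ in range(max_batches):
        diffs.extend(noisy_f(x) - noisy_f(x_prime) for _ in range(batch))
        mean = statistics.mean(diffs)
        std_of_mean = statistics.stdev(diffs) / len(diffs) ** 0.5
        if abs(mean) > factor * std_of_mean:
            break  # the comparison is statistically clear
    return mean < 0  # True if x looks better (minimization)
```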
Resampling methods: Comparison
With Continuous Noisy Optimization (CNO)
With Evolution Strategies (ES)
With Differential Evolution (DE)
Comparison with CNO
Continuous Noisy Optimization: we propose the
Iterative Noisy Optimization Algorithm (INOA)
as a general framework for noisy optimization.
Key points:
a Sampler, which chooses sampling points around the current approximation,
an Opt module, which updates the approximation of the optimum,
resampling number r_n = B·n^β and sampling step-size σ_n = A/n^α.
Main application: finite-differences sampling + quadratic model
Comparison with CNO: State-of-the-art and our results
3 types of noise: variance constant, linear or quadratic as a function of the SR:
Var(f(x, ω)) = O([E_ω f(x, ω) − E_ω f(x*, ω)]^z)   (6)
with z ∈ {0, 1, 2}.

z | optimized for CR: slope(SR), slope(CR) | optimized for SR: slope(SR), slope(CR)
0 (constant var) | −1/2, 1/2 | −2/3, 2/3   [Fabian, 1967], [Dupač, 1957], [Shamir, 2013]
0, ∞-differentiable | slope(SR) = −1 [Fabian, 1967]
0, "quadratic" | slope(SR) = −1 [Dupač, 1957]
1 (linear var) | −1, 0 | −1, 0   [Rolet and Teytaud, 2010]
2 (quadratic var) | −∞, 0 | −∞, 0   [Jebalia and Auger, 2008]

Table: State-of-the-art: convergence rates. Blue: existing results, which we also achieved. Red: new results by us.
Main application: finite-differences sampling + quadratic model
Various (new, proved) rates depending on assumptions
Recovers existing rates (with one and the same algorithm) and beyond
Comparison with CNO: Results & Discussion
Our proposed algorithm (provably) reaches the same rate
as the Kiefer-Wolfowitz algorithm when the noise has constant variance
as Bernstein-race optimization algorithms when the noise variance decreases linearly as a function of the simple regret
as Evolution Strategies when the noise variance decreases quadratically as a function of the simple regret
⇒ no details here, focus on ES and DE.
What about evolutionary algorithms? Experiments with constant noise variance (hard case)
Algorithms:
ES + resampling
DE + resampling
Results: slope(SR) = −1/2 in both cases
(with e.g. rules depending on #iter and dimension)
Figure: Modified function F4 of CEC2005, dimension 2. x-axis: log(#eval); y-axis: log(SR). Curves: resampling rules N1.01exp, N1.1exp, N2exp, Nscale.
Resampling methods: Partial conclusion
Conclusion:
Adaptation of Newton's algorithm for noisy fitness (∇f and H_f approximated by finite differences + resamplings)
→ leads to fast convergence rates + recovers many rates in one algorithm + generic framework (but no proved application besides the quadratic surrogate model)
Non-adaptive methods lead to log-log convergence (math + experiments) in ES
N_scale = d^{−2} exp(4n/(5d)) works (slope(SR) = −1/2) for both ES and DE
(nb: −1 possible with large mutation + small inheritance)
In progress:
Adaptive resampling methods might be merged with bounds on resampling numbers ⇒ in progress, unclear benefit for the moment.
Variance reduction techniques
Monte Carlo [Hammersley and Handscomb, 1964, Billingsley, 1986]
Ê f(x, ω) = (1/n) Σ_{i=1}^{n} f(x, ω_i) → E_ω f(x, ω).   (7)
Quasi Monte Carlo [Cranley and Patterson, 1976, Niederreiter, 1992,
Wang and Hickernell, 2000, Mascagni and Chi, 2004]
Use samples aimed at being as uniform as possible over the domain.
Variance reduction techniques: white-box
Antithetic variates
Ensure some regularity of the sampling by using symmetries:
Ê_ω f(x, ω) = (1/n) Σ_{i=1}^{n/2} (f(x, ω_i) + f(x, −ω_i)).
Importance sampling
Instead of sampling ω with density dP, we sample ω with density dP′:
Ê_ω f(x, ω) = (1/n) Σ_{i=1}^{n} [dP(ω_i) / dP′(ω_i)] f(x, ω_i).
Control variates
Instead of estimating E_ω f(x, ω), we estimate E_ω (f(x, ω) − g(x, ω)),
using E_ω f(x, ω) = E_ω g(x, ω) [term A] + E_ω (f(x, ω) − g(x, ω)) [term B].
Variance reduction techniques: grey-box
Common random numbers (CRN), or pairing:
Use the same samples ω_1, . . . , ω_n for the whole population x_{n,1}, . . . , x_{n,λ}.
Seed_n = {seed_{n,1}, . . . , seed_{n,m_n}}.
E_ω f(x_{n,k}, ω) is then approximated as (1/m_n) Σ_{i=1}^{m_n} f(x_{n,k}, seed_{n,i}).
Different forms of pairing (a sketch follows):
Seed_n is the same for all n
m_n increases and the sets Seed_n are nested, i.e. ∀n, ∀i ≤ m_n: m_{n+1} ≥ m_n and seed_{n,i} = seed_{n+1,i}
all individuals in an offspring use the same seeds, + seeds are 100% changed between offspring
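A minimal Python sketch of pairing with common random numbers; the convention that noisy_f(x, rng) draws its noise from the supplied generator is our assumption:

```python
import random

def paired_fitness(noisy_f, population, num_resamplings, base_seed=0):
    """Evaluate all individuals with the same list of seeds, so comparisons
    between individuals are not blurred by different noise realizations."""
    seeds = [base_seed + i for i in range(num_resamplings)]
    fitness = []
    for x in population:
        values = [noisy_f(x, random.Random(seed)) for seed in seeds]
        fitness.append(sum(values) / len(values))
    return fitness
```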
Pairing: Partial conclusion
No details, just our conclusion:
"almost" black-box
easy to implement
applicable to most applications
On the realistic problem, pairing provided a great improvement
But there are counterexamples in which it is detrimental.
Portfolio of optimization algorithms
Usually:
Portfolio → Combinatorial Optimization (SAT Competition)
Recently:
Portfolio → Continuous Optimization [Baudiš and Pošík, 2014]
This work:
Portfolio → Noisy Optimization
→ Portfolio = choosing, online, between several algorithms
Why portfolio in Noisy Optimization?
Stochastic problem
limited budget (time or total number of evaluations)
target: anytime convergence to the optimum
black-box
How to choose a suitable solver?
Algorithm Portfolios:
Select automatically the best in a finite set of solvers
Portfolio of noisy optimization methods: proposal
A finite number of given noisy optimization solvers, “orthogonal”
Unfair distribution of budget
Information sharing (not very helpful here...)
→ Performs almost as well as the best solver
Portfolio of noisy optimization methods: NOPA
Algorithm 1 Noisy Optimization Portfolio Algorithm (NOPA).
1: Input noisy optimization solvers Solver_1, Solver_2, . . . , Solver_M
2: Input a lag function LAG: N+ → N+
3: Input a non-decreasing integer sequence r_1, r_2, . . .   (periodic comparisons)
4: Input a non-decreasing integer sequence s_1, s_2, . . .   (numbers of resamplings)
5: n ← 1   (number of selections)
6: m ← 1   (NOPA's iteration number)
7: i* ← null   (index of recommended solver)
8: x* ← null   (recommendation)
9: while budget is not exhausted do
10:   if m ≥ r_n then
11:     i* = argmin_{i ∈ {1,...,M}} Ê_{s_n}[f(x̃_{i,LAG(r_n)})]   (algorithm selection)
12:     n ← n + 1
13:   else
14:     for i ∈ {1, . . . , M} do
15:       Apply one evaluation for Solver_i
16:     end for
17:     m ← m + 1
18:   end if
19:   x* = x̃_{i*,m}   (update recommendation)
20: end while
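A hedged Python sketch of NOPA under a solver interface of our own (step spends one evaluation, recommend returns the recommendation after a given iteration); the thesis' implementation may differ:

```python
def nopa(solvers, noisy_f, lag, r, s, budget):
    """Sketch of Algorithm 1. r and s are 0-indexed non-decreasing integer
    sequences; lag is the LAG function; noisy_f returns one noisy value."""
    n, m, best = 0, 1, 0
    spent = 0
    while spent < budget:
        if m >= r[n]:
            # Algorithm selection: compare lagged recommendations, each
            # re-evaluated s[n] times to average out the noise.
            def score(i):
                x_old = solvers[i].recommend(lag(r[n]))
                return sum(noisy_f(x_old) for _ in range(s[n])) / s[n]
            best = min(range(len(solvers)), key=score)
            spent += s[n] * len(solvers)
            n += 1
        else:
            for solver in solvers:  # fair budget: one evaluation per solver
                solver.step(noisy_f)
            spent += len(solvers)
            m += 1
    return solvers[best].recommend(m)  # current recommendation of the winner
```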
Portfolio of noisy optimization methods: compare solvers early
lag function:
LAG(n) ≤ n: lag
∀i ∈ {1, . . . , M}, x_{i,LAG(n)} = or ≠ x_{i,n}
Why this lag?
algorithms' ranking is usually stable → no use comparing the very last points
it's much cheaper to compare old points:
comparing good (i.e. recent) points → comparing points with similar fitness
comparing points with similar fitness → very expensive
Portfolio of noisy optimization methods: Theorem with fair budget distribution
Theorem (fair budget distribution)
Assume that
each solver i ∈ {1, . . . , M} has simple regret SR_{i,n} = (1 + o(1)) C_i / n^{α_i} (as usual)
and the noise variance is constant.
Then for some universal r_n, s_n, LAG_n, a.s. there exists n_0 such that, for n ≥ n_0:
the portfolio always chooses an optimal solver (optimal α_i and C_i);
the portfolio uses ≤ M · r_n (1 + o(1)) evaluations ⇒ M times more than the best solver.
Interpretation
Negligible comparison budget (thanks to the lag)
On classical log-log graphs, the portfolio should perform similarly to the best solver, within the log(M) shift (proved)
INOPA: introducing an unfair budget
NOPA: same budget for all solvers.
Remark:
we compare old recommendations (LAG(n) << n)
they were known long ago, before spending all this budget
therefore, except for the selected solvers, most of the budget is wasted :(
⇒ Lazy evaluation paradigm: evaluate f(·) only when you need it for your output
⇒ Improved NOPA (INOPA): unfair budget distribution
Use only LAG(r_n) evaluations (negligible) on the sub-optimal solvers (INOPA)
log(M′) shift, with M′ the number of optimal solvers (proved)
Experiments: Unimodal case
Noisy Optimization Algorithms (NOAs):
SA-ES: Self-Adaptive Evolution Strategy
Fabian’s algorithm: a first-order method using gradients estimated by finite
differences [Dvoretzky et al., 1956, Fabian, 1967]
Noisy Newton’s algorithm: a second-order method using a Hessian matrix
approximated also by finite differences (our contribution in CNO)
Solvers | z = 0 (constant var) | z = 1 (linear var) | z = 2 (quadratic var)
RSAES | .114 ± .002 | .118 ± .003 | .113 ± .003
Fabian1 | −.838 ± .003 | −1.011 ± .003 | −1.016 ± .003
Fabian2 | .108 ± .003 | −1.339 ± .003 | −2.481 ± .003
Newton | −.070 ± .003 | −.959 ± .092 | −2.503 ± .285
NOPA no lag | −.377 ± .048 | −.978 ± .013 | −2.106 ± .003
NOPA | −.747 ± .003 | −.937 ± .005 | −2.515 ± .095
INOPA | −.822 ± .003 | −1.359 ± .027 | −3.528 ± .144
Table: Slope(SR) for f(x) = ||x||² + ||x||^z · N in dimension 15. Computation time = 40s.
Experiments: Stochastic unit commitment problem
Solver | d = 45 | d = 63 | d = 105 | d = 125
RSAES | .485 ± .071 | .870 ± .078 | .550 ± .097 | .274 ± .097
Fabian1 | 1.339 ± .043 | 1.895 ± .040 | 1.075 ± .047 | .769 ± .047
Fabian2 | .394 ± .058 | .521 ± .083 | .436 ± .097 | .307 ± .097
Newton | .749 ± .101 | 1.138 ± .128 | .590 ± .147 | .312 ± .147
INOPA | .394 ± .059 | .547 ± .080 | .242 ± .101 | .242 ± .101
Table: Stochastic unit commitment problem (minimization). Computation time = 320s.
What's more:
Given the same budget, an INOPA of identical solvers can outperform its mono-solvers.
Portfolio and noisy optimization: Conclusion
Main conclusion:
portfolios are also great in noisy optimization
(because in noisy optimization, with lag, the comparison cost is small)
We show mathematically and empirically a log(M) shift when using M solvers, on a classical log-log scale
Bound improved to a log(M′) shift, with M′ = number of optimal solvers, with unfair distribution of budget (INOPA)
Take-home messages
portfolio = little overhead
unfair budget = no overhead if "orthogonal" portfolio (orthogonal → M′ = 1)
We mathematically confirmed the idea of orthogonality found in [Samulowitz and Memisevic, 2007]
Framework: Zero-sum matrix games
Game defined by matrix M:
I choose (privately) i
Simultaneously, you choose (privately) j
I earn M_{i,j}
You earn −M_{i,j}
So this is zero-sum.
Figure: 0-sum matrix game.

         | rock | paper | scissors
rock     |  0.5 |   0   |    1
paper    |   1  |  0.5  |    0
scissors |   0  |   1   |   0.5

Table: Example of a 1-sum matrix game: Rock-paper-scissors.
Framework: Nash Equilibrium (NE)
Definition (Nash Equilibrium)
Zero-sum matrix game M:
My strategy = probability distribution on rows = x
Your strategy = probability distribution on columns = y
Expected reward = x^T M y
There exist x*, y* such that ∀x, y,
x^T M y* ≤ x*^T M y* ≤ x*^T M y.   (8)
(x*, y*) is a Nash Equilibrium (no unicity).
Definition (Approximate ε-Nash Equilibria)
(x*, y*) such that
x^T M y* − ε ≤ x*^T M y* ≤ x*^T M y + ε.   (9)
Example: The NE of Rock-paper-scissors is unique: (1/3, 1/3, 1/3).
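For concreteness, the Nash-optimal row strategy of such a matrix game can be computed by the classical linear-programming formulation; a sketch using scipy (the helper name nash_row_strategy is ours):

```python
import numpy as np
from scipy.optimize import linprog

def nash_row_strategy(M):
    """Maximize the game value v subject to (M^T x)_j >= v for every
    column j, with x a probability distribution over the rows."""
    K, L = M.shape
    c = np.zeros(K + 1)
    c[-1] = -1.0                                      # maximize v
    A_ub = np.hstack([-M.T, np.ones((L, 1))])         # v - (M^T x)_j <= 0
    b_ub = np.zeros(L)
    A_eq = np.hstack([np.ones((1, K)), np.zeros((1, 1))])  # sum(x) = 1
    b_eq = np.ones(1)
    bounds = [(0, 1)] * K + [(None, None)]            # v is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:K], res.x[-1]

# Rock-paper-scissors (the 1-sum table above): x ≈ (1/3, 1/3, 1/3), v ≈ 0.5.
M = np.array([[0.5, 0.0, 1.0], [1.0, 0.5, 0.0], [0.0, 1.0, 0.5]])
x_star, value = nash_row_strategy(M)
```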
Methods for computing Nash Equilibrium
Algorithm | Complexity | Exact solution? | Confidence | Time
LP [von Stengel, 2002] | O(K^α), α > 6 | yes | 1 | constant
[Grigoriadis and Khachiyan, 1995] | O(K log(K) / ε²) | no | 1 | random
[Grigoriadis and Khachiyan, 1995], with K/log(K) processors | O(log²(K) / ε²) | no | 1 | random
EXP3 [Auer et al., 1995] | O(K log(K) / ε²) | no | 1 − δ | constant
Inf [Audibert and Bubeck, 2009] | O(K log(K) / ε²) | no | 1 − δ | constant
Our algorithm (if NE is k-sparse) | O(k^{3k} K log K) | yes | 1 − δ | constant

Table: State-of-the-art for computing a Nash Equilibrium of a ZSMG M_{K×K}.
Adversarial bandit algorithm Exp3.P
Algorithm 2 Exp3.P: variant of Exp3. η and γ are two parameters.
1: Input η ∈ R   (how much the distribution becomes peaked)
2: Input γ ∈ (0, 1]   (exploration rate)
3: Input a time horizon (computational budget) T ∈ N+ and the number of arms K ∈ N+
4: Output a Nash-optimal policy p
5: y ← 0
6: for i ← 1 to K do   (initialization)
7:   ω_i ← exp((ηγ/3) · sqrt(T/K))
8: end for
9: for t ← 1 to T do
10:   for i ← 1 to K do
11:     p_i ← (1 − γ) · ω_i / Σ_{j=1}^{K} ω_j + γ/K
12:   end for
13:   Generate i_t according to (p_1, p_2, . . . , p_K)
14:   Compute reward R_{i_t,t}
15:   for i ← 1 to K do
16:     if i == i_t then
17:       R̂_i ← R_{i_t,t} / p_i
18:     else
19:       R̂_i ← 0
20:     end if
21:     ω_i ← ω_i · exp((γ/(3K)) · (R̂_i + η/(p_i · sqrt(TK))))
22:   end for
23: end for
24: Return probability distribution (p_1, p_2, . . . , p_K)
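A Python sketch of Exp3.P as written above; the oracle reward(i, t), assumed to return a payoff in [0, 1], stands for the adversary (e.g. a matrix game):

```python
import math
import random

def exp3p(reward, K, T, eta=1.0, gamma=0.1, rng=random):
    """Exp3.P: exponential weights with exploration and an optimistic bias
    term, returning the final mixed strategy (p_1, ..., p_K)."""
    w = [math.exp((eta * gamma / 3.0) * math.sqrt(T / K)) for _ in range(K)]
    for t in range(T):
        total = sum(w)
        p = [(1 - gamma) * w[i] / total + gamma / K for i in range(K)]
        it = rng.choices(range(K), weights=p)[0]        # draw one arm
        r = reward(it, t)
        for i in range(K):
            r_hat = r / p[i] if i == it else 0.0        # importance weighting
            w[i] *= math.exp((gamma / (3.0 * K)) *
                             (r_hat + eta / (p[i] * math.sqrt(T * K))))
    total = sum(w)
    return [(1 - gamma) * w[i] / total + gamma / K for i in range(K)]
```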
Sparse Nash Equilibria (1/2)
Considering x* a Nash-optimal policy for a ZSMG M_{K×K}:
Let us assume that x* is unique and has at most k non-zero components (sparsity).
Let us show that x* is "discrete":
(Remark: Nash = solution of a linear programming problem)
⇒ x* is also the NE of a k × k submatrix M_{k×k}
⇒ x* = solution of an LP in dimension k
⇒ x* = solution of k linear equations with coefficients in {−1, 0, 1}
⇒ x* = inverse matrix × vector
⇒ x* obtained by "cofactors / det matrix"
⇒ x* has denominator at most k^{k/2}
by the Hadamard determinant bound [Hadamard, 1893], [Brenner and Cummings, 1972]
Sparse Nash Equilibria (2/2)
Computation of sparse Nash Equilibria
Under the assumption that the Nash is sparse:
x* is rational with a "small" denominator (previous slide!)
So let us compute an ε-Nash (with ε small enough!) (sublinear time!)
And let us compute its closest approximation with "small denominator" (Hadamard)
Two new algorithms for exact Nash (sketch below):
Rounding-EXP3: switch to the closest approximation
Truncation-EXP3: remove small components and work on the remaining submatrix (exact solving)
(requested precision k^{−3k/2} only ⇒ complexity k^{3k} K log K)
Our proposal: Parameter-free adversarial bandit
No details here; in short:
We compare various existing parametrizations of EXP3
We select the best
We add sparsity as follows (sketch below):
for a budget of T rounds of EXP3, threshold = max_{i ∈ {1,...,m}} (T·x_i)^α / T
⇒ we get a parameter-free bandit for adversarial problems
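A Python sketch of this truncation (x is the mixed strategy returned by EXP3 after T rounds; renormalizing the surviving components is our reading of the procedure):

```python
def truncate_strategy(x, T, alpha=0.7):
    """Drop every component of x below max_i (T * x_i)^alpha / T, then
    renormalize; the largest component always survives when T*max(x) >= 1."""
    threshold = max((T * xi) ** alpha for xi in x) / T
    truncated = [xi if xi >= threshold else 0.0 for xi in x]
    total = sum(truncated)
    return [xi / total for xi in truncated]
```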
Application to robust optimization (power systems)
Figure: decision-support pipeline: policies and scenarios feed a simulator, which returns the reward R(k, s); outputs include average performance/cost and robustness. Scenario elements include technological breakthroughs and CO2 penalization; policy elements include maintaining a connection or creating a new one.
Examples of scenario: CO2 penalization, gas curtailment in Eastern Europe, technological breakthrough
Examples of policy: massive nuclear power plant building, massive renewable energies
Nash-planning for scenario-based decision making
Decision tools:

Method | Extraction of policies | Extraction of critical scenarios | Computational cost | Interpretation
Wald | One | One per policy | K × S | Nature decides later, minimizing our reward
Savage | One | One per policy | K × S | Nature decides later, maximizing our regret
Scenarios | Handcrafted | Handcrafted | K × S | Human expertise
Our proposal: Nash | Nash-optimal | Nash-optimal | (K + S) × log(K + S) (*) | Nature decides privately, before us

Table: Comparison between several tools for decision under uncertainty. K = |K| and S = |S|.
⇒ in this case sparsity performs very well. (*) improved if sparse, by our previous result!
Nash ⇒ fast selection of scenarios and options: sparsity both
fastens the NE computation and
makes the output more readable (smaller matrix)
Application to power investment problem: Testcase and parameterization
We consider (big toy problem):
3^10 investment policies (k)
3^9 scenarios (s)
reward: (k, s) → R(k, s)
We
use Nash Equilibria, for their principled nature (Nature decides first and privately! that's reasonable, right?) and low computational cost in large-scale settings
compute the equilibria thanks to EXP3 (tuned)...
... with sparsity, for
improving the precision
reducing the number of pure strategies in our recommendation (unreadable matrix otherwise!)
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS
Adversarial portfolio
Application to robust optimization (power systems)
Application to power investment problem: Sparse-Nash algorithm
Algorithm 3 The Sparse-Nash algorithm for solving decision-under-uncertainty problems.
Input: a family K of possible decisions k (investment policies).
Input: a family S of scenarios s.
Input: a mapping (k, s) → R_{k,s}, providing the rewards.
Run truncated Exp3.P on R, and get:
a probability distribution on K (support = key options) and
a probability distribution on S (support = critical scenarios).
Emphasize the policy with the highest probability.
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS
Adversarial portfolio
Application to robust optimization (power systems)
Application to power investment problem: Results
Average sparsity level over 3^10 = 59049 arms:
α | T = K | T = 10K | T = 50K | T = 100K | T = 500K | T = 1000K
0.1 | 13804 ± 52 | non-sparse | non-sparse | non-sparse | non-sparse | non-sparse
0.3 | 2810 ± 59 | non-sparse | non-sparse | non-sparse | non-sparse | non-sparse
0.5 | 396 ± 16 | non-sparse | non-sparse | 59049 ± 197 | 49819 ± 195 | non-sparse
0.7 | 43 ± 3 | 58925 ± 27 | 55383 ± 1507 | 46000 ± 278 | 9065 ± 160 | non-sparse
0.9 | 4 ± 0 | 993 ± 64 | 797 ± 42 | 504 ± 25 | 98 ± 5 | 52633 ± 523
0.99 | 1 ± 0 | 2 ± 0 | 3 ± 0 | 2 ± 0 | 2 ± 0 | 7 ± 1

Robust score (worst reward against pure strategies):
α | T = K | T = 10K | T = 50K | T = 100K | T = 500K | T = 1000K
NT | 4.922e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.1 | 4.948e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.3 | 5.004e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.5 | 5.059e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.242e-01 | 4.938e-01
0.7 | 5.054e-01 | 4.928e-01 | 4.965e-01 | 5.031e-01 | 5.317e-01 | 4.938e-01
0.9 | 4.281e-01 | 5.137e-01 | 5.151e-01 | 5.140e-01 | 5.487e-01 | 4.960e-01
0.99 | 3.634e-01 | 4.357e-01 | 4.612e-01 | 4.683e-01 | 5.242e-01 | 5.390e-01
Pure | 3.505e-01 | 3.946e-01 | 4.287e-01 | 4.489e-01 | 5.143e-01 | 4.837e-01

Table: Average sparsity level and robust score. α is the truncation parameter. T is the budget.
Application to power investment problem: summary
Define long-term scenarios (plenty!)
Build simulator R(k, s)
Classical solution (Savage): min_{k ∈ K} max_{s ∈ S} regret(k, s)
Our proposal (Nash): automatically select a submatrix
Our proposed tool has the following advantages:
Natural extraction of interesting policies and critical scenarios:
α = .7 provides stable (and proved) results,
but the extracted submatrix becomes easily readable (small enough) with larger values of α.
Faster than the Wald or Savage methodologies.
Take-home messages
We get a fast criterion, faster than Wald's or Savage's criteria, with a natural interpretation, and more readable ⇒ but stochastic recommendation!
Application to games
Two parts:
Seeds matter: **choose** your seeds!
More tricky but worth the effort: position-specific seeds!
(towards a better asymptotic behavior of MCTS?)
Optimizing random seeds: Correlations
Figure: Success rate per seed (ranked) in 5x5 Domineering, with standard deviations on y-axis:
the seed has a significant impact.
Fact: the random seed matters!
Optimizing random seeds: State-of-the-art
Stochastic algorithms randomly select their pseudo-random seed.
We propose to choose the seed(s), and to combine them.
State-of-the-art for combining random seeds:
[Nagarajan et al., 2015] combines several AIs
[Gaudel et al., 2010] uses Nash methods for combining several opening books
[Saint-Pierre and Teytaud, 2014] constructs several AIs from a single stochastic
one and combines them by the BestSeed and Nash approaches
Trick: present results with one white seed per column and one black seed per
row
Figure: One black seed per row, one white seed per column: the K × K matrix M, where the row player gets M_{i,j} and the column player gets 1 − M_{i,j}.
Propositions: Nash & BestSeed
Nash: combines rows (more robust; we will see later)
BestSeed: just pick up the best row / best column
Better than square matrices: rectangle methods
Remark:
for choosing a row, if #rows = #cols, then #rows is more critical than #cols;
for a given budget, increase #rows and decrease #cols (same budget!)
Figure: Left: square K × K matrix of a game; right: K × Kt rectangles of a game (K >> Kt).
Does it work? Experiments on Domineering
The opponent uses seeds which have never been used during the learning of the portfolio (cross-validation).
Figure: Results for Domineering, with the BestSeed (left) and the Nash (right) approach, against the baseline (K = 1) and the exploiter (K > 1; an opponent who "learns" very well). Kt = 900 in all experiments.
BestSeed performs well against the original algorithm (K = 1), but poorly against the exploiter (K > 1).
Nash outperforms the original algorithm both w.r.t. K = 1 (all cases) and K > 1 (most cases).
Beyond cross-validation: experiments with transfer in the game of Go
Learning: BestSeed is applied to GnuGo, with MCTS and a budget of 400
simulations.
Test: against “classical” GnuGo, i.e. the non-MCTS version of GnuGo.
Opponent | Performance of BestSeed | Performance with randomized seed
GnuGo-classical level 1 | 1. (± 0) | .995 (± 0)
GnuGo-classical level 2 | 1. (± 0) | .995 (± 0)
GnuGo-classical level 3 | 1. (± 0) | .99 (± 0)
GnuGo-classical level 4 | 1. (± 0) | 1. (± 0)
GnuGo-classical level 5 | 1. (± 0) | 1. (± 0)
GnuGo-classical level 6 | 1. (± 0) | 1. (± 0)
GnuGo-classical level 7 | .73 (± .013) | .061 (± .004)
GnuGo-classical level 8 | .73 (± .013) | .106 (± .006)
GnuGo-classical level 9 | .73 (± .013) | .095 (± .006)
GnuGo-classical level 10 | .73 (± .013) | .07 (± .004)
Table: Performance of "BestSeed" and "randomized seed" against "classical" GnuGo.
Previous slide: we win against the AI which we have trained (but different seeds!).
This slide: we improve the winning rate against another AI.
Optimizing random seeds: Partial conclusion
Conclusion:
Seed optimization (NOT position-specific) can be seen as a simple and effective tool for building an opening book with no development effort, no human expertise, no storage of databases.
"Rectangle" provides significant improvements.
The online computational overhead of the methods is negligible.
The boosted AIs significantly outperform the baselines.
BestSeed performs well, but can be overfitted ⇒ strength of Nash.
Further work:
The use of online bandit algorithms for dynamically choosing K/Kt.
Note:
The BestSeed and the Nash algorithms are not new.
The algorithm and analysis of rectangles are new.
The analysis of the impact of seeds is new.
The applications to Domineering, Atari-Go and Breakthrough are new.
Optimizing position-based random seeds: Tsumego
Tsumego (by Yoji Ojima, Zen's author):
Input: a Go position
Question: is this situation a win for White?
Output: yes or no
Why so important?
At the heart of many game algorithms
In Go, EXPTIME-complete [Robson, 1983]
Classical algorithms
Monte Carlo (MC) [Bruegmann, 1993, Cazenave, 2006, Cazenave and Borsboom, 2007]
Monte Carlo Tree Search (MCTS) [Bouzy, 2004, Coulom, 2006]
Nested MC [Cazenave, 2009]
Voting scheme among MCTS [Gavin et al.]
⇒ here: weighted voting scheme among MCTS
PORTFOLIO METHODS IN UNCERTAIN CONTEXTS
Adversarial portfolio
Application to games
Evaluation of the game value
Algorithm 4 Evaluation of the game value.
1: Input current state s
2: Input a policy πB for Black, depending on a seed in N+
3: Input a policy πW for White, depending on a seed in N+
4: for i ∈ {1, . . . , K} do
5:   for j ∈ {1, . . . , K} do
6:     M_{i,j} ← outcome of the game starting in s, with πB playing as Black with seed b(i) and πW playing as White with seed w(j)
7:   end for
8: end for
9: Compute weights p for Black and q for White for the matrix M (either BestSeed, Nash, or other)
10: Return p^T M q, the approximate value of the game M
Classical case (MC/MCTS): unpaired Monte Carlo averaging
Figure: Left: unpaired case (classical estimate by averaging, with K·K independent random seeds for Black and for White); right: paired case: K seeds vs K seeds, giving the K × K matrix M.
Experiments: Applied methods and setting
Compared methods for approximating v(s):
Three methods use K² independent batches of M MCTS simulations, using a matrix of seeds:
Nash reweighting = Nash value
BestSeed reweighting = intersection of best row / best column
Paired MC estimate = average of the matrix
One unpaired method: classical MC estimate (the average of K² random MCTS)
Baseline: a single long MCTS (= state of the art!)
→ the only one which is not K²-parallel
Parameter setting: GnuGo-MCTS [Bayer et al., 2008]
setting A: 1 000 simulations per move
setting B: 80 000 simulations per move
Experiments: Average results over 50 Tsumego problems
Figure: Average over 50 Tsumego problems; (a) setting A: 1 000 simulations per move; (b) setting B: 80 000 simulations per move. x-axis: submatrix size (N²); y-axis: % correct answers. Curves: Nash, Paired, Best, Unpaired, MCTS(1). MCTS(1): one single MCTS run using all the budget.
Setting A (small budget): MCTS(1) outperforms the weighted average of 81 MCTS runs (but we are more parallel!)
Setting B (large budget): we outperform MCTS and all the others by far
⇒ consistent with the limited scalability of MCTS for huge numbers of simulations
Optimizing position-based random seeds: Partial conclusion
Main conclusion:
novel way of evaluating game values using Nash Equilibrium (theoretical validation & experiments on 50 Tsumego problems)
the Nash or BestSeed predictor requires far fewer simulations for finding accurate results + is sometimes consistent whereas the original MC is not!
We outperformed
the average of MCTS runs sharing the budget
a single MCTS using all the budget
→ For M large enough, our weighted averaging of 81 single MCTS runs with M simulations is better than an MCTS run with 81M simulations :)
Take-home messages
We classify positions ("black wins" vs "white wins").
We use a WEIGHTED average of K² MCTS runs of M simulations.
Our approach outperforms:
all tested voting schemes among K² MCTS estimates of M simulations,
and a pure MCTS of K² × M simulations,
when M is large and K² = 81.
Adversarial portfolio: Conclusion
A work on sparsity, at the core of ZSMG
A parameter-free adversarial bandit, obtained by tuning (no details provided in this talk) + sparsity
Applications of ZSMG:
Nash + sparsity → faster + more readable robust decision making
Random seeds = new MCTS variants?
validated as opening-book learning (Go, Atari-Go, Domineering, Breakthrough, Draughts, Phantom-Go...)
position-specific seeds validated on Tsumego
Conclusion & Further work
Noisy opt:
An algorithm recovering most (but not all: Fabian's rate!) existing results, extended to other surrogate models
ES/DE with resamplings have good rates for linear/quadratic variance, and/or robust criteria (UR); for other cases resamplings are not sufficient for optimal rates ("mutate large, inherit small" + huge population and/or surrogate models...)
Portfolio:
Application to noisy optimization; great benefits with several solvers of a given model
Towards wider applications: portfolio of models?
Adversarial portfolio: successful use of sparsity; parameter-free bandits?
MCTS and seeds: room for 5 PhDs... if there is funding for it :-)
Most works here → ROBUSTNESS by COMBINATION
(robust to solvers, to models, to parameters, to seeds...)
Thanks for your attention!
Thanks to all the collaborators from Artelys, INRIA, CNRS, Univ.
Paris-Saclay, Univ. Paris-Dauphine, Univ. du Littoral, NDHU ...
References
Audibert, J.-Y. and Bubeck, S. (2009).
Minimax policies for adversarial and stochastic bandits.
In proceedings of the Annual Conference on Learning Theory (COLT).
Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R. E. (1995).
Gambling in a rigged casino: the adversarial multi-armed bandit problem.
In Proceedings of the 36th Annual Symposium on Foundations of Computer
Science, pages 322–331. IEEE Computer Society Press, Los Alamitos, CA.
Auger, A. (2005).
Convergence results for the (1, λ)-sa-es using the theory of φ-irreducible markov
chains.
Theoretical Computer Science, 334(1):35–69.
Baudiš, P. and Pošík, P. (2014).
Online black-box algorithm portfolios for continuous optimization.
In Parallel Problem Solving from Nature–PPSN XIII, pages 40–49. Springer.
Bayer, A., Bump, D., Daniel, E. B., Denholm, D., Dumonteil, J., Farnebäck, G., Pogonyshev, P., Traber, T., Urvoy, T., and Wallin, I. (2008).
Gnu go 3.8 documentation.
Technical report, Free Software Fundation.
Billingsley, P. (1986).
Probability and Measure.
John Wiley and Sons.
Bouzy, B. (2004).
Associating shallow and selective global tree search with Monte Carlo for 9x9 Go.
In 4rd Computer and Games Conference, Ramat-Gan.
Brenner, J. and Cummings, L. (1972).
The Hadamard maximum determinant problem.
Amer. Math. Monthly, 79:626–630.
Bruegmann, B. (1993).
Monte-carlo Go (unpublished draft
http://www.althofer.de/bruegmann-montecarlogo.pdf).
Bubeck, S., Munos, R., and Stoltz, G. (2011).
Pure exploration in finitely-armed and continuous-armed bandits.
Theoretical Computer Science, 412(19):1832–1852.
Cazenave, T. (2006).
A phantom-go program.
In van den Herik, H. J., Hsu, S.-C., Hsu, T.-S., and Donkers, H. H. L. M., editors,
Proceedings of Advances in Computer Games, volume 4250 of Lecture Notes in
Computer Science, pages 120–125. Springer.
Cazenave, T. (2009).
Nested monte-carlo search.
In Boutilier, C., editor, IJCAI, pages 456–461.
Cazenave, T. and Borsboom, J. (2007).
Golois wins phantom go tournament.
ICGA Journal, 30(3):165–166.
Coulom, R. (2006).
Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search.
In P. Ciancarini and H. J. van den Herik, editors, Proceedings of the 5th
International Conference on Computers and Games, Turin, Italy, pages 72–83.
Cranley, R. and Patterson, T. (1976).
Randomization of number theoretic methods for multiple integration.
SIAM J. Numer. Anal., 13(6):904–914.
Dupač, V. (1957).
O Kiefer-Wolfowitzově aproximační methodě.
Časopis pro pěstování matematiky, 082(1):47–75.
Dvoretzky, A., Kiefer, J., and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Annals of Mathematical Statistics, 27:642–669.
Fabian, V. (1967). Stochastic approximation of minima with improved asymptotic speed. Annals of Mathematical Statistics, 38:191–200.
Gaudel, R., Hoock, J.-B., Pérez, J., Sokolovska, N., and Teytaud, O. (2010). A principled method for exploiting opening books. In International Conference on Computers and Games, pages 136–144, Kanazawa, Japan.
Gavin, C., Stewart, S., and Drake, P. Result aggregation in root-parallelized computer Go.
Grigoriadis, M. D. and Khachiyan, L. G. (1995). A sublinear-time randomized approximation algorithm for matrix games. Operations Research Letters, 18(2):53–58.
Hadamard, J. (1893). Résolution d'une question relative aux déterminants [Resolution of a question regarding determinants]. Bull. Sci. Math., 17:240–246.
Hammersley, J. and Handscomb, D. (1964). Monte Carlo Methods. Methuen & Co. Ltd., London, page 40.
Heidrich-Meisner, V. and Igel, C. (2009). Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search. In ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pages 401–408, New York, NY, USA. ACM.
Jebalia, M. and Auger, A. (2008). On multiplicative noise models for stochastic search. In Rudolph, G., et al., editors, Conference on Parallel Problem Solving from Nature (PPSN X), volume 5199, pages 52–61, Berlin, Heidelberg. Springer Verlag.
Liu, J., Saint-Pierre, D. L., Teytaud, O., et al. (2014). A mathematically derived number of resamplings for noisy optimization. In Genetic and Evolutionary Computation Conference (GECCO 2014).
Mascagni, M. and Chi, H. (2004). On the scrambled Halton sequence. Monte-Carlo Methods Appl., 10(3):435–442.
Mnih, V., Szepesvári, C., and Audibert, J.-Y. (2008). Empirical Bernstein stopping. In ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 672–679, New York, NY, USA. ACM.
Nagarajan, V., Marcolino, L. S., and Tambe, M. (2015). Every team deserves a second chance: identifying when things go wrong (student abstract version). In 29th Conference on Artificial Intelligence (AAAI 2015), Texas, USA.
Niederreiter, H. (1992). Random Number Generation and Quasi-Monte Carlo Methods. SIAM, Philadelphia.
Rechenberg, I. (1973). Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog Verlag, Stuttgart.
Robson, J. M. (1983). The complexity of Go. In IFIP Congress, pages 413–417.
Rolet, P. and Teytaud, O. (2010). Adaptive noisy optimization. In Di Chio, C., Cagnoni, S., Cotta, C., Ebner, M., Ekárt, A., Esparcia-Alcázar, A., Goh, C.-K., Merelo, J., Neri, F., Preuß, M., Togelius, J., and Yannakakis, G., editors, Applications of Evolutionary Computation, volume 6024 of Lecture Notes in Computer Science, pages 592–601. Springer Berlin Heidelberg.
Saint-Pierre, D. L. and Teytaud, O. (2014). Nash and the bandit approach for adversarial portfolios. In CIG 2014 – Computational Intelligence in Games, page 7, Dortmund, Germany. IEEE.
Samulowitz, H. and Memisevic, R. (2007). Learning to solve QBF. In Proceedings of the 22nd National Conference on Artificial Intelligence, pages 255–260. AAAI.
Shamir, O. (2013). On the complexity of bandit and derivative-free stochastic convex optimization. In COLT 2013 – The 26th Annual Conference on Learning Theory, June 12–14, 2013, Princeton University, NJ, USA, pages 3–24.
Storn, R. (1996). On the usage of differential evolution for function optimization. In Biennial Conference of the North American Fuzzy Information Processing Society (NAFIPS 1996), pages 519–523. IEEE.
von Stengel, B. (2002). Computing equilibria for two-person games. Handbook of Game Theory, 3:1723–1759.
Wang, X. and Hickernell, F. (2000). Randomized Halton sequences. Math. Comput. Modelling, 32:887–899.
My PhD defence

  • 1. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Jialin LIU advised by: Olivier Teytaud & Marc Schoenauer TAO, Inria, Univ. Paris-Saclay, UMR CNRS 8623, France December 11, 2015 1 / 76
  • 2. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Motivation Motivation Why noisy optimization (i.e. optim. in front of a stochastic model) ? Not that many works on noisy optimization faults in networks: you can not use an average over 50 years (many lines would be 100% guaranteed) ⇒ you need a (stochastic) model of faults Why adversarial (i.e. worst case) problems ? Critical problems with uncertainties (technological breakthroughs, CO2 penalization ...) Why portfolio (i.e. combining/selecting solvers) ? Great in combinatorial optimization → let us generalize :) Why MCTS ? Great recent tool Still many things to do All related ? All applicable to games All applicable to power systems Nash ⇒ mixed strategy portfolio 2 / 76
  • 3. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization criteria for black-box noisy optimization 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 3 / 76
  • 4. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization criteria for black-box noisy optimization Black-box Noisy Optimization Framework f : x → f(x, ω) from a domain D ⊂ Rd → Continuous optimization to R with random variable ω. Goal x∗ = argmin x∈Rd Eωf(x, ω) i.e. access to independent evaluations of f. Black-Box case: → do not use any internal property of f → access to f(x) only, not f(x) → for a given x: randomly samples ω and returns f(x, ω) → for its nth request, returns f(x, ωn) x −→ −→ f(x, ω) 4 / 76
  • 5. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization criteria for black-box noisy optimization Optimization criteria: State-of-the-art Noise-free case: log-linear convergence [Auger, 2005, Rechenberg, 1973] log ||xn − x∗ || n ∼ A < 0 (1) Noisy case: log-log convergence [Fabian, 1967] log ||xn − x∗ || log(n) ∼ A < 0 (2) Figure: y-axis: log ||xn − x∗||, x-axis:#eval for log-linear convergence in noise-free case or log #eval for log-log convergence in noisy case. 5 / 76
  • 6. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization criteria for black-box noisy optimization Optimization criteria: Convergence rates Slopes for Uniform Rate, Simple Regret1 [Bubeck et al., 2011] and Cumulative Regret x∗ : the optimum of f xn: the nth evaluated search point ˜xn: the optimum estimated after nth evaluation Uniform Rate URn = ||xn − x∗ || → all search points matter Simple Regret SRn = Eωf(˜xn, ω) − Eωf(x∗ , ω) → final recommendation matters Cumulative Regret CRn = j≤n (Eωf(xj , ω) − Eωf(x∗ , ω)) → all recommendations matter Convergence rates: Slope(UR) = lim sup n→∞ log(URn) log(n) (3) Slope(SR) = lim sup n→∞ log(SRn) log(n) (4) Slope(CR) = lim sup n→∞ log(CRn) log(n) . (5) 1 Simple Regret = difference between expected payoff recommended vs optimal. 6 / 76
  • 7. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 7 / 76
  • 8. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Tricks for handling noise: Resampling: average multiple evaluations Large population Surrogate models Specific methods (stochastic gradient descent with finite differences) Here: focus on resampling Resampling number: how many times do we resample noise ? 8 / 76
  • 9. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Resampling methods: Non-adaptive resampling methods [Recall] log-log convergence: log ||xn−x∗ || log(n) ∼ A < 0, n is evaluation number Non-adaptive rules: Exponential rules with ad hoc parameters ⇒ log-log convergence (mathematically proved by us) Other rules as a function of #iter: square root, linear rules, polynomial rules Other rules as a function of #iter and dimension 9 / 76
  • 10. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Resampling methods: Adaptive resampling methods Adaptive rules: Bernstein [Mnih et al., 2008, Heidrich-Meisner and Igel, 2009] Here: FOR a pair of search points x, x to be compared DO WHILE computation time is not elapsed DO 1000 resamplings for x and x IF mean(difference) >> std THEN break ENDIF ENDWHILE ENDFOR 10 / 76
  • 11. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Resampling methods: Comparison With Continuous Noisy Optimization (CNO) With Evolution Strategies (ES) With Differential Evolution (DE) 11 / 76
  • 12. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Comparison with CNO Continuous Noisy Optimization: we propose Iterative Noisy Optimization Algorithm (INOA) as a general framework for noisy optimization. Key points: Sampler which chooses a sampling around the current approximation, Opt which updates the approximation of the optimum, resampling number rn = B nβ and sampling step-size σn = A/nα Main application: finite differences sampling + quadratic model 12 / 76
  • 13. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Comparison with CNO: State-of-the-art and our results 3 types of noise: constant, linear or quadratic as a function of the SR: Var(f(x, ω)) = O [Eωf(x, ω) − Eωf(x∗ , ω)] z (6) with z ∈ {0, 1, 2}. z optimized for CR optimized for SR slope(SR) slope(CR) slope(SR) slope(CR) 0 (constant var) − 1 2 1 2 − 2 3 2 3[Fabian, 1967] [Dupaˇc, 1957] [Shamir, 2013] 0 and −1 ∞-differentiable [Fabian, 1967] 0 and “quadratic” −1 [Dupaˇc, 1957] 1 (linear var) −1 0 −1 0 [Rolet and Teytaud, 2010] [Rolet and Teytaud, 2010] [Rolet and Teytaud, 2010] [Rolet and Teytaud, 2010] 2 (quadratic var) −∞ 0 −∞ 0 [Jebalia and Auger, 2008] [Jebalia and Auger, 2008] [Jebalia and Auger, 2008] [Jebalia and Auger, 2008] Table: State-of-the-art: Convergence rates. Blue: existing results, we also achieved. Red: new results by us. Main application: finite differences sampling + quadratic model Various (new, proved) rates depending on assumptions Recovers existing rates (with a same algorithm) and beyond 13 / 76
  • 14. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Comparison with CNO: Results & Discussion Our proposed algorithm (provably) reaches the same rate as Kiefer-Wolfowitz algorithm when the noise has constant variance as Bernstein-races optimization algorithms when the noise variance decreases linearly as a function of the simple regret as Evolution Strategies when the noise variance decreases quadratically as a function of the simple regret ⇒ no details here, focus on ES and DE. 14 / 76
  • 15. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods What about evolutionary algorithms ? Experiments with variance noise = constant (hard case) Algorithms: ES + resampling DE + resamplnig Results: slope(SR) = −1 2 in both cases (with e.g. rules depending on #iter and dimension) 5 10 15 20 25 5 10 15 N1.01exp N1.1exp N2exp Nscale Figure: Modified function F4 of CEC2005, dimension 2. x-axis: log(#eval); y-axis: log(SR). 15 / 76
  • 16. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Resampling methods: Partial conclusion Conclusion: Adaptation of Newton’s algorithm for noisy fitness ( f and Hf approximated by finite differences+resamplings) → leads to fast convergence rates + recovers many rates in one alg. + generic framework (but no proved application besides quadratic surrogate model) Non-adaptive methods lead to log-log convergence (math+xp) in ES Nscale = d−2 exp( 4n 5d ) ok (slope(SR) = −1 2 ) for both ES and DE (nb: −1 possible with large mutation + small inheritance) In progress: Adaptive resampling methods might be merged with bounds on resampling numbers ⇒ in progress, unclear benefit for the moment. 16 / 76
  • 17. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 17 / 76
  • 18. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Variance reduction techniques Monte Carlo [Hammersley and Handscomb, 1964, Billingsley, 1986] ˆEf(x, ω) = 1 n n i=1 f(x, ωi ) → Eωf(x, ω). (7) Quasi Monte Carlo [Cranley and Patterson, 1976, Niederreiter, 1992, Wang and Hickernell, 2000, Mascagni and Chi, 2004] Use samples aimed at being as uniform as possible over the domain. 18 / 76
  • 19. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Variance reduction techniques: white-box Antithetic variates Ensure some regularity of the sampling by using symmetries ˆEωf(x, ω)=1 n n/2 i=1 (f(x, ωi ) + f(x, −ωi )) . Importance sampling Instead of sampling ω with density dP, we sample ω with density dP ˆEωf(x, ω)=1 n n i=1 dP(ωi ) dP (ωi ) f(x, ωi ). Control variates Instead of estimating Eωf(x, ω), we estimate Eω (f(x, ω) − g(x, ω)) using Eωf(x, ω) = Eωg(x, ω) A +Eω (f(x, ω) − g(x, ω)) B . 19 / 76
  • 20. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Variance reduction techniques: grey-box Common random numbers (CRN) or pairing Use the same samples ω1, . . . , ωn for all the population xn,1, . . . , xn,λ. Seedn = {seedn,1, . . . , seedn,mn }. Eωf(xn,k , ω) is then approximated as 1 mn mn i=1 f(xn,k , seedn,i ). Different forms of pairing: Seedn is the same for all n mn increases and nested sets Seedn, i.e. ∀n, i ≤ mn, mn+1 ≥ mn, seedn,i = seedn+1,i all individuals in an offspring use the same seeds, + seeds are 100% changed between offspring 20 / 76
  • 21. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Noisy Optimization Optimization methods Pairing: Partial conclusion No details, just our conclusion: “almost” black-box easy to implement applicable for most applications On the realistic problem, pairing provided a great improvement But there are counterexamples in which it is detrimental. 21 / 76
  • 22. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio: state of the art 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 22 / 76
  • 23. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio: state of the art Portfolio of optimization algorithms Usually: Portfolio → Combinatorial Optimization (SAT Competition) Recently: Portfolio → Continuous Optimization [Baudiˇs and Poˇs´ık, 2014] This work: Portfolio → Noisy Optimization → Portfolio = choosing, online, between several algorithms 23 / 76
  • 24. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Relationship between portfolio and noisy optimization 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 24 / 76
  • 25. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Relationship between portfolio and noisy optimization Why portfolio in Noisy Optimization? Stochastic problem limited budget (time or total number of evaluations) target: anytime convergence to the optimum black-box 2 How to choose a suitable solver? 2 Image from http://ethanclements.blogspot.fr/2010/12/postmodernism-essay-question.html 25 / 76
  • 26. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Relationship between portfolio and noisy optimization Why portfolio in Noisy Optimization? Stochastic problem limited budget (time or total number of evaluations) target: anytime convergence to the optimum black-box 2 How to choose a suitable solver? Algorithm Portfolios: Select automatically the best in a finite set of solvers 2 Image from http://ethanclements.blogspot.fr/2010/12/postmodernism-essay-question.html 25 / 76
  • 27. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 26 / 76
  • 28. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods Portfolio of noisy optimization methods: proposal A finite number of given noisy optimization solvers, “orthogonal” Unfair distribution of budget Information sharing (not very helpful here...) → Performs almost as well as the best solver 27 / 76
  • 29. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods Portfolio of noisy optimization methods: NOPA Algorithm 1 Noisy Optimization Portfolio Algorithm (NOPA). 1: Input noisy optimization solvers Solver1, Solver2 . . . , SolverM 2: Input a lag function LAG : N+ → N+ 3: Input a non-decreasing integer sequence r1, r2, . . . Periodic comparisons 4: Input a non-decreasing integer sequence s1, s2, . . . Number of resamplings 5: n ← 1 Number of selections 6: m ← 1 NOPA’s iteration number 7: i∗ ← null Index of recommended solver 8: x∗ ← null Recommendation 9: while budget is not exhausted do 10: if m ≥ rn then 11: i∗ = arg min i∈{1,...,M} ˆEsn [f(˜xi,LAG(rn))] Algorithm selection 12: n ← n + 1 13: else 14: for i ∈ {1, . . . , M} do 15: Apply one evaluation for Solveri 16: end for 17: m ← m + 1 18: end if 19: x∗ = ˜xi∗,m Update recommendation 20: end while 28 / 76
  • 30. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods Portfolio of noisy optimization methods: compare solvers early lag function: LAG(n) ≤ n: lag ∀i ∈ {1, . . . , M}, xi,LAG(n) = or = xi,n 29 / 76
  • 31. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods Portfolio of noisy optimization methods: compare solvers early lag function: LAG(n) ≤ n: lag ∀i ∈ {1, . . . , M}, xi,LAG(n) = or = xi,n Why this lag ? algorithms’ ranking is usually stable → no use comparing the very last it’s much cheaper to compare old points: comparing good (i.e. recent) points → comparing points with similar fitness comparing points with similar fitness → very expensive 29 / 76
  • 32. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods Portfolio of noisy optimization methods: Theorem with fair budget distribution Theorem with fair budget distribution Assume that each solver i ∈ {1, . . . , M} has simple regret SRi,n = (1 + o(1)) Ci nαi (as usual) and noise variance = constant. Then for some universal rn, sn, LAGn, a.s. there exists n0 such that, for n ≥ n0: portfolio always chooses an optimal solver (optimal αi and Ci ); the portfolio uses ≤ M · rn(1 + o(1)) evaluations ⇒ M times more than the best solver. Interpretation Negligible comparison budget (thanks to lag) On classical log-log graphs, the portfolio should perform similarly to the best solver, within the log(M) shift (proved) 30 / 76
  • 33. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods INOPA: introducing an unfair budget NOPA: same budget for all solvers. Remark: we compare old recommendations (LAGn << n) they were known long ago, before spending all this budget therefore, except selected solvers, most of the budget is wasted :( ⇒ Lazy evaluation paradigm: evaluate f(.) only when you need it for your output ⇒ Improved NOPA (INOPA): unfaired budget distribution Use only LAG(rn) evaluations (negligible) on the sub-optimal solvers (INOPA) log(M ) shift with M the number of optimal solvers (proved) 31 / 76
  • 34. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods Experiments: Unimodal case Noisy Optimization Algorithms (NOAs): SA-ES: Self-Adaptive Evolution Strategy Fabian’s algorithm: a first-order method using gradients estimated by finite differences [Dvoretzky et al., 1956, Fabian, 1967] Noisy Newton’s algorithm: a second-order method using a Hessian matrix approximated also by finite differences (our contribution in CNO) Solvers z = 0 (constant var) z = 1 (linear var) z = 2 (quadratic var) RSAES .114 ± .002 .118 ± .003 .113 ± .003 Fabian1 −.838 ± .003 −1.011 ± .003 −1.016 ± .003 Fabian2 .108 ± .003 −1.339 ± .003 −2.481 ± .003 Newton −.070 ± .003 −.959 ± .092 −2.503 ± .285 NOPA no lag −.377 ± .048 −.978 ± .013 −2.106 ± .003 NOPA −.747 ± .003 −.937 ± .005 −2.515 ± .095 INOPA −.822 ± .003 −1.359 ± .027 −3.528 ± .144 Table: Slope(SR) for f(x) = ||x||2 + ||x||z N in dimension 15. Computation time = 40s. 32 / 76
  • 35. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Portfolio of noisy optimization methods Experiments: Stochastic unit commitment problem Solver d = 45 d = 63 d = 105 d = 125 RSAES .485 ± .071 .870 ± .078 .550 ± .097 .274 ± .097 Fabian1 1.339 ± .043 1.895 ± .040 1.075 ± .047 .769 ± .047 Fabian2 .394 ± .058 .521 ± .083 .436 ± .097 .307 ± .097 Newton .749 ± .101 1.138 ± .128 .590 ± .147 .312 ± .147 INOPA .394 ± .059 .547 ± .080 .242 ± .101 .242 ± .101 Table: Stochastic unit commitment problem (minimization). Computation time = 320s. What’s more: Given a same budget, a INOPA of identical solvers can outperform its mono-solvers. 33 / 76
  • 36. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Conclusion 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 34 / 76
  • 37. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Conclusion Portfolio and noisy optimization: Conclusion Main conclusion: portfolios also great in noisy opt. (because in noisy opt., with lag, comparison cost = small) We show mathematically and empirically a log(M) shift when using M solvers, on a classical log-log scale Bound improved to log(M ) shift, with M = nb. of optimal solvers, with unfair distribution of budget (INOPA) 35 / 76
  • 38. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Portfolio and noisy optimization Conclusion Portfolio and noisy optimization: Conclusion Main conclusion: portfolios also great in noisy opt. (because in noisy opt., with lag, comparison cost = small) We show mathematically and empirically a log(M) shift when using M solvers, on a classical log-log scale Bound improved to log(M ) shift, with M = nb. of optimal solvers, with unfair distribution of budget (INOPA) Take-home messages portfolio = little overhead unfair budget = no overhead if “orthogonal” portfolio (orthogonal → M = 1) We mathematically confirmed the idea of orthogonality found in [Samulowitz and Memisevic, 2007] 35 / 76
  • 39. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Adversarial bandit 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 36 / 76
  • 40. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Adversarial bandit Framework: Zero-sum matrix games Game defined by matrix M I choose (privately) i Simultaneously, you choose (privately) j I earn Mi,j You earn −Mi,j So this is zero-sum. Figure: 0-sum matrix game. rock paper scissors rock 0.5 0 1 paper 1 0.5 0 scissors 0 1 0.5 Table: Example of 1-sum matrix game: Rock-paper-scissors. 37 / 76
  • 41. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Adversarial bandit Framework: Nash Equilibrium (NE) Definition (Nash Equilibrium) Zero-sum matrix game M My strategy = probability distrib. on rows = x Your strategy = probability distrib. on cols = y Expected reward = xT My There exists x∗ , y∗ such that ∀x, y, xT My∗ ≤ x∗T My∗ ≤ x∗T My. (8) (x∗ , y∗ ) is a Nash Equilibrium (no unicity). Definition (Approximate -Nash Equilibria) (x∗ , y∗ ) such that xT My∗ − ≤ x∗T My∗ ≤ x∗T My+ . (9) Example: The NE of Rock-paper-scissors is unique: (1/3, 1/3, 1/3). 38 / 76
  • 42. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Adversarial bandit 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 39 / 76
  • 43. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Adversarial bandit Methods for computing Nash Equilibrium Algorithm Complexity Exact solution? Confidence Time LP [von Stengel, 2002] O(Kα), α > 6 yes 1 constant [Grigoriadis and Khachiyan, 1995] O( K log(K) 2 ) no 1 random [Grigoriadis and Khachiyan, 1995] O( log2(K) 2 ) no 1 random with K log(K) processors EXP3 [Auer et al., 1995] O( K log(K) 2 ) no 1 − δ constant Inf [Audibert and Bubeck, 2009] O( K log(K) 2 ) no 1 − δ constant Our algorithm O(k3k K log K) yes 1 − δ constant (if NE is k-sparse) Table: State-of-the-art of computing Nash Equilibrium for ESMG MK×K . 40 / 76
  • 44. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Adversarial bandit Adversarial bandit algorithm Exp3.P Algorithm 2 Exp3.P: variant of Exp3. η and γ are two parameters. 1: Input η ∈ R how much the distribution becomes peaked 2: Input γ ∈ (0, 1] exploration rate 3: Input a time horizon (computational budget) T ∈ N+ and the number of arms K ∈ N+ 4: Output a Nash-optimal policy p 5: y ← 0 6: for i ← 1 to K do initialization 7: ωi ← exp( ηγ 3 T K ) 8: end for 9: for t ← 1 to T do 10: for i ← 1 to K do 11: pi ← (1 − γ) ωi K j=1 ωj + γ K 12: end for 13: Generate it according to (p1, p2, . . . , pK ) 14: Compute reward Rit ,t 15: for i ← 1 to K do 16: if i == it then 17: ˆRi ← Rit ,t pi 18: else 19: ˆRi ← 0 20: end if 21: ωi ← ωi exp γ 3K (ˆRi + η pi √ TK ) 22: end for 23: end for 24: Return probability distribution (p1, p2, . . . , pK ) 41 / 76
  • 45. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Contribution for computing Nash Equilibrium 1 Motivation 2 Noisy Optimization Optimization criteria for black-box noisy optimization Optimization methods Resampling methods Pairing 3 Portfolio and noisy optimization Portfolio: state of the art Relationship between portfolio and noisy optimization Portfolio of noisy optimization methods Conclusion 4 Adversarial portfolio Adversarial bandit Adversarial Framework State-of-the-art Contribution for computing Nash Equilibrium Sparsity: sparse NE can be computed faster Parameter-free adversarial bandit for large-scale problems Application to robust optimization (power systems) Application to games Conclusion 5 Conclusion 42 / 76
  • 46. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Contribution for computing Nash Equilibrium Sparse Nash Equilibria (1/2) Considering x∗ a Nash-optimal policy for ZSMG MK×K : Let us assume that x∗ is unique and has at most k non-zero components (sparsity). Let us show that x∗ is “discrete”: (Remark: Nash = solution of linear programming problem) ⇒ x∗ = also NE of a k × k submatrix: Mk×k ⇒ x∗ = solution of LP in dimension k ⇒ x∗ = solution of k lin. eq. with coefficients in {−1, 0, 1} ⇒ x∗ = inv-matrix × vector ⇒ x∗ = obtained by “cofactors / det matrix” ⇒ x∗ has denominator at most k k 2 By Hadamard determinant bound [Hadamard, 1893], [Brenner and Cummings, 1972] 43 / 76
  • 47. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Contribution for computing Nash Equilibrium Sparse Nash Equilibria (2/2) Computation of sparse Nash Equilibria. Under the assumption that the Nash equilibrium is sparse: x* is rational with a "small" denominator (previous slide!). So let us compute an ε-Nash (with ε small enough!) in sublinear time, and then compute its closest approximation with a "small denominator" (Hadamard). Two new algorithms for exact Nash: Rounding-EXP3: switch to the closest approximation. Truncation-EXP3: remove small components and work on the remaining submatrix (exact solving). (Requested precision k^{−3k/2} only ⇒ complexity k^{3k} K log K.) 44 / 76
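An illustrative sketch of the rounding step only (our names; Python's fractions.limit_denominator plays the role of "closest small-denominator approximation"): snap each component of the ε-Nash to the nearest rational with denominator at most k^{k/2}, then renormalize.

```python
# Hypothetical sketch of Rounding: from an eps-Nash x to an exact rational NE,
# assuming the true NE is k-sparse so its denominators are at most k**(k/2).
from fractions import Fraction

def round_to_small_denominator(x, k):
    max_den = int(k ** (k / 2.0))    # Hadamard bound from the previous slide
    q = [Fraction(xi).limit_denominator(max_den) for xi in x]
    s = sum(q)
    return [qi / s for qi in q]      # renormalize exactly to sum 1

# With k = 3 the denominator bound is 5, enough to recover thirds:
print(round_to_small_denominator([0.3334, 0.6666, 0.0], k=3))  # [1/3, 2/3, 0]
```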
  • 48. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Contribution for computing Nash Equilibrium 45 / 76
  • 49. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Contribution for computing Nash Equilibrium Our proposal: Parameter-free adversarial bandit. No details here; in short: We compare various existing parametrizations of EXP3. We select the best. We add sparsity as follows: for a budget of T rounds of EXP3, threshold = max_{i∈{1,...,m}} (T x_i)^α / T ⇒ we get a parameter-free bandit for adversarial problems (a sketch of this rule follows below). 46 / 76
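A possible reading of the truncation rule as code (x is the vector of empirical play frequencies after T rounds; the function name and the toy numbers are ours):

```python
import numpy as np

def truncate(x, T, alpha):
    # Zero out arms below the data-driven threshold max_i (T x_i)^alpha / T,
    # then renormalize; the most-played arm always survives the cut.
    x = np.asarray(x, dtype=float)
    threshold = ((T * x) ** alpha).max() / T
    y = np.where(x >= threshold, x, 0.0)
    return y / y.sum()

freqs = np.array([0.55, 0.30, 0.10, 0.04, 0.01])
print(truncate(freqs, T=10000, alpha=0.9))  # keeps only the two large components
```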
  • 50. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to robust optimization (power systems) 47 / 76
  • 51. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to robust optimization (power systems) Scenarios. [Diagram: scenarios and policies feed a simulator, which outputs average performance, robustness and average cost; a policy k and a scenario s map to a reward R(k, s).] Examples of scenario: CO2 penalization, gas curtailment in Eastern Europe, technological breakthrough. Examples of policy: massive nuclear power plant building, massive renewable energies, maintain a connection, create a new connection, ... 48 / 76
  • 52. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to robust optimization (power systems) Nash-planning for scenario-based decision making. Decision tools:

Method | Extraction of policies | Extraction of critical scenarios | Computational cost | Interpretation
Wald | One | One per policy | K × S | Nature decides later, minimizing our reward
Savage | One | One per policy | K × S | Nature decides later, maximizing our regret
Scenarios | Handcrafted | Handcrafted | K × S | Human expertise
Our proposal: Nash | Nash-optimal | Nash-optimal | (K + S) × log(K + S) (*) | Nature decides privately, before us

Table: Comparison between several tools for decision under uncertainty. K = |K| and S = |S|. (*) Improved further if the NE is sparse, by our previous result! ⇒ in this case sparsity performs very well. Nash ⇒ fast selection of scenarios and options: sparsity both speeds up the NE computation and makes the output more readable (smaller matrix). (Wald and Savage are sketched in code below.) 49 / 76
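For contrast with the Nash row, a sketch of the two classical criteria on a reward matrix R (rows = policies k, columns = scenarios s); the toy matrix is ours.

```python
import numpy as np

def wald(R):
    # Nature decides later, minimizing our reward: best worst-case policy.
    return int(np.argmax(R.min(axis=1)))

def savage(R):
    # Nature decides later, maximizing our regret: minimax-regret policy.
    regret = R.max(axis=0) - R          # regret of policy k under scenario s
    return int(np.argmin(regret.max(axis=1)))

R = np.array([[10.0, 0.0],
              [ 4.0, 3.0],
              [ 5.0, 1.0]])
print(wald(R), savage(R))  # Wald picks policy 1, Savage picks policy 0
```

Both criteria read the whole matrix once, matching the K × S cost in the table.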
  • 54. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to robust optimization (power systems) Application to power investment problem: Testcase and parameterization. We consider (a big toy problem): 3^10 = 59049 investment policies (k); 3^9 scenarios (s); reward: (k, s) → R(k, s). We use Nash Equilibria, for their principled nature (Nature decides first and privately! that's reasonable, right?) and their low computational cost in large-scale settings; we compute the equilibria thanks to EXP3 (tuned) ... with sparsity, for improving the precision and reducing the number of pure strategies in our recommendation (an unreadable matrix otherwise!) 50 / 76
  • 55. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to robust optimization (power systems) Application to power investment problem: Sparse-Nash algorithm

Algorithm 3 The Sparse-Nash algorithm for solving decision under uncertainty problems.
Input: a family K of possible decisions k (investment policies).
Input: a family S of scenarios s.
Input: a mapping (k, s) → R_{k,s}, providing the rewards.
Run truncated Exp3.P on R; get a probability distribution on K (support = key options) and a probability distribution on S (support = critical scenarios).
Emphasize the policy with highest probability. (An end-to-end sketch follows below.) 51 / 76
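A possible end-to-end sketch of Algorithm 3 under our assumptions: plain Exp3 self-play (a simpler stand-in for the tuned Exp3.P of the talk) on the zero-sum game R with rewards in [0, 1], followed by the truncate rule sketched on the earlier slide.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())     # stable softmax over log-weights
    return e / e.sum()

def sparse_nash(R, T=20000, gamma=0.05, alpha=0.9, rng=np.random.default_rng(0)):
    K, S = R.shape
    lk, ls = np.zeros(K), np.zeros(S)      # log-weights of the two Exp3 learners
    ck, cs = np.zeros(K), np.zeros(S)      # play counts
    for _ in range(T):
        pk = (1 - gamma) * softmax(lk) + gamma / K
        ps = (1 - gamma) * softmax(ls) + gamma / S
        i, j = rng.choice(K, p=pk), rng.choice(S, p=ps)
        ck[i] += 1; cs[j] += 1
        lk[i] += gamma * R[i, j] / (pk[i] * K)         # we maximize R
        ls[j] += gamma * (1 - R[i, j]) / (ps[j] * S)   # Nature gets 1 - R
    # Truncated empirical frequencies: key options and critical scenarios.
    return truncate(ck / T, T, alpha), truncate(cs / T, T, alpha)
```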
  • 56. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to robust optimization (power systems) Application to power investment problem: Results

Average sparsity level over 3^10 = 59049 arms:
α    | T = K       | T = 10K     | T = 50K      | T = 100K    | T = 500K    | T = 1000K
0.1  | 13804 ± 52  | non-sparse  | non-sparse   | non-sparse  | non-sparse  | non-sparse
0.3  | 2810 ± 59   | non-sparse  | non-sparse   | non-sparse  | non-sparse  | non-sparse
0.5  | 396 ± 16    | non-sparse  | non-sparse   | 59049 ± 197 | 49819 ± 195 | non-sparse
0.7  | 43 ± 3      | 58925 ± 27  | 55383 ± 1507 | 46000 ± 278 | 9065 ± 160  | non-sparse
0.9  | 4 ± 0       | 993 ± 64    | 797 ± 42     | 504 ± 25    | 98 ± 5      | 52633 ± 523
0.99 | 1 ± 0       | 2 ± 0       | 3 ± 0        | 2 ± 0       | 2 ± 0       | 7 ± 1

Robust score (worst reward against pure strategies):
α    | T = K     | T = 10K   | T = 50K   | T = 100K  | T = 500K  | T = 1000K
NT   | 4.922e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.1  | 4.948e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.3  | 5.004e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.221e-01 | 4.938e-01
0.5  | 5.059e-01 | 4.928e-01 | 4.956e-01 | 4.991e-01 | 5.242e-01 | 4.938e-01
0.7  | 5.054e-01 | 4.928e-01 | 4.965e-01 | 5.031e-01 | 5.317e-01 | 4.938e-01
0.9  | 4.281e-01 | 5.137e-01 | 5.151e-01 | 5.140e-01 | 5.487e-01 | 4.960e-01
0.99 | 3.634e-01 | 4.357e-01 | 4.612e-01 | 4.683e-01 | 5.242e-01 | 5.390e-01
Pure | 3.505e-01 | 3.946e-01 | 4.287e-01 | 4.489e-01 | 5.143e-01 | 4.837e-01

Table: Average sparsity level and robust score. α is the truncation parameter; T is the budget. 52 / 76
  • 58. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to robust optimization (power systems) Application to power investment problem: summary. Define long-term scenarios (plenty!); build the simulator R(k, s). Classical solution (Savage): min_{k∈K} max_{s∈S} regret(k, s). Our proposal (Nash): automatically select a submatrix. Our proposed tool has the following advantages: Natural extraction of interesting policies and critical scenarios: α = .7 provides stable (and proved) results, but the extracted submatrix becomes easily readable (small enough) only with larger values of α. Faster than the Wald or Savage methodologies. Take-home messages: We get a fast criterion, faster than Wald's or Savage's criteria, with a natural interpretation, and more readable output ⇒ but a stochastic recommendation! 53 / 76
  • 59. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games 54 / 76
  • 60. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Two parts: Seeds matter: **choose** your seeds ! More tricky but worth the effort: position-specific seeds ! (towards a better asymptotic behavior of MCTS ?) 55 / 76
  • 61. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing random seeds: Correlations. Figure: Success rate per seed (seeds ranked) in 5x5 Domineering, with standard deviations; y-axis: success rate. The seed has a significant impact. Fact: the random seed matters ! 56 / 76
  • 62. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing random seeds: State-of-the-art Stochastic algorithms randomly select their pseudo-random seed. We propose to choose the seed(s), and to combine them. State-of-the-art for combining random seeds: [Nagarajan et al., 2015] combines several AIs [Gaudel et al., 2010] uses Nash methods for combining several opening books [Saint-Pierre and Teytaud, 2014] constructs several AIs from a single stochastic one and combines them by the BestSeed and Nash approaches 57 / 76
  • 63. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Trick: present results with one white seed per column and one black seed per row. [Diagram: a K × K matrix M; entry M_{i,j} is the outcome with Black seed i against White seed j; the row player gets M_{i,j}, the column player gets 1 − M_{i,j}.] Figure: One black seed per row, one white seed per column. 58 / 76
  • 64. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Propositions: Nash & BestSeed. Nash: combines rows (more robust; we will see later). BestSeed: just pick the best row / best column. 59 / 76
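As a minimal sketch (our names; M holds Black's outcomes, rows = Black seeds, columns = White seeds), reusing the nash_lp sketch from the earlier LP slide:

```python
import numpy as np

def best_seed_row(M):
    # BestSeed: keep the single row (Black seed) with the best average outcome.
    return int(np.argmax(M.mean(axis=1)))

def nash_rows(M):
    # Nash: a mixed strategy over rows, harder to exploit than any single row.
    p, value = nash_lp(M)   # LP sketch from the 'Methods for computing NE' slide
    return p, value
```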
  • 65. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Better than square matrices: rectangle methods. Remark: when choosing a row, #rows is more critical than #cols; so, rather than #rows = #cols, increase #rows and decrease #cols (same budget!). [Diagram: left, a square Kt × Kt seed matrix; right, rectangular seed matrices with K rows and Kt columns, K >> Kt.] Figure: Left: square matrix of a game; right: rectangles of a game (K >> Kt). 60 / 76
  • 66. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Does it work? Experiments on Domineering. The opponent uses seeds which have never been used during the learning of the portfolio (cross-validation). Figure: Results for Domineering, with the BestSeed (left) and the Nash (right) approach, against the baseline (K = 1) and the exploiter (K > 1; an opponent who "learns" very well). Kt = 900 in all experiments. BestSeed performs well against the original algorithm (K = 1), but poorly against the exploiter (K > 1). Nash outperforms the original algorithm both against K = 1 (all cases) and K > 1 (most cases). 61 / 76
  • 67. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Beyond cross-validation: experiments with transfer in the game of Go. Learning: BestSeed is applied to GnuGo, with MCTS and a budget of 400 simulations. Test: against "classical" GnuGo, i.e. the non-MCTS version of GnuGo.

Opponent | Performance of BestSeed | Performance with randomized seed
GnuGo-classical level 1 | 1. (± 0) | .995 (± 0)
GnuGo-classical level 2 | 1. (± 0) | .995 (± 0)
GnuGo-classical level 3 | 1. (± 0) | .99 (± 0)
GnuGo-classical level 4 | 1. (± 0) | 1. (± 0)
GnuGo-classical level 5 | 1. (± 0) | 1. (± 0)
GnuGo-classical level 6 | 1. (± 0) | 1. (± 0)
GnuGo-classical level 7 | .73 (± .013) | .061 (± .004)
GnuGo-classical level 8 | .73 (± .013) | .106 (± .006)
GnuGo-classical level 9 | .73 (± .013) | .095 (± .006)
GnuGo-classical level 10 | .73 (± .013) | .07 (± .004)

Table: Performance of "BestSeed" and "randomized seed" against "classical" GnuGo. Previous slide: we win against the AI which we have trained against (but with different seeds!). This slide: we improve the winning rate against another AI. 62 / 76
  • 68. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing random seeds: Partial conclusion. Conclusion: Seed optimization (NOT position-specific) can be seen as a simple and effective tool for building an opening book with no development effort, no human expertise, and no database storage. "Rectangles" provide significant improvements. The online computational overhead of the methods is negligible. The boosted AIs significantly outperform the baselines. BestSeed performs well, but can be overfitted ⇒ the strength of Nash. Further work: the use of online bandit algorithms for dynamically choosing K/Kt. Note: The BestSeed and Nash algorithms are not new. The algorithm and analysis of rectangles is new. The analysis of the impact of seeds is new. The applications to Domineering, Atari-Go and Breakthrough are new. 63 / 76
  • 69. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Two parts: Seeds matter: **choose** your seeds ! More tricky but worth the effort: position-specific seeds ! (towards a better asymptotic behavior of MCTS ?) 64 / 76
  • 71. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing position-based random seeds: Tsumego. Tsumego (by Yoji Ojima, Zen's author): Input: a Go position. Question: is this situation a win for White? Output: yes or no. Why so important? At the heart of many game algorithms; in Go, EXPTIME-complete [Robson, 1983]. 65 / 76
  • 73. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Classical algorithms: Monte Carlo (MC) [Bruegmann, 1993, Cazenave, 2006, Cazenave and Borsboom, 2007]; Monte Carlo Tree Search (MCTS) [Bouzy, 2004, Coulom, 2006]; Nested MC [Cazenave, 2009]; voting scheme among MCTS [Gavin et al., ]. ⇒ Here: a weighted voting scheme among MCTS runs. 66 / 76
  • 74. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Evaluation of the game value

Algorithm 4 Evaluation of the game value.
1: Input: current state s
2: Input: a policy πB for Black, depending on a seed in N+
3: Input: a policy πW for White, depending on a seed in N+
4: for i ∈ {1, . . . , K} do
5:   for j ∈ {1, . . . , K} do
6:     M_{i,j} ← outcome of the game starting in s, with πB playing as Black with seed b(i) and πW playing as White with seed w(j)
7:   end for
8: end for
9: Compute weights p for Black and q for White from the matrix M (either BestSeed, Nash, or other)
10: Return p^T M q, the approximate value of the game M 67 / 76
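A sketch of steps 9-10 in Python (our names; nash_lp is the LP sketch from the earlier slide), computing the three paired estimators compared on the following slides:

```python
import numpy as np

def value_estimates(M):
    paired = M.mean()                           # paired MC: plain matrix average
    i = int(np.argmax(M.mean(axis=1)))          # best Black row
    j = int(np.argmin(M.mean(axis=0)))          # best White column (minimizes M)
    best_seed = float(M[i, j])                  # BestSeed: best row / best column
    p, _ = nash_lp(M)                           # Black's Nash weights over rows
    q, _ = nash_lp(1.0 - M.T)                   # White's Nash weights over columns
    nash = float(p @ M @ q)                     # Nash reweighting: p^T M q
    return paired, best_seed, nash
```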
  • 75. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Classical case (MC/MCTS): unpaired Monte Carlo averaging. [Diagram: left, K·K independent seed pairs (b(1), w(1)), ..., (b(K·K), w(K·K)); right, a K × K matrix pairing K Black seeds with K White seeds, the row player getting M_{i,j} and the column player 1 − M_{i,j}.] Figure: Left: unpaired case (classical estimate by averaging); right: paired case: K seeds vs K seeds. 68 / 76
  • 78. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Experiments: Applied methods and setting. Compared methods for approximating v(s). Three methods use K² independent batches of M MCTS simulations, using the matrix of seeds: Nash reweighting = Nash value; BestSeed reweighting = intersection of best row / best column; paired MC estimate = average of the matrix. One unpaired method: the classical MC estimate (the average of K² random MCTS runs). Baseline: a single long MCTS (= state of the art!) → the only one which is not K²-parallel. Parameter setting: GnuGo-MCTS [Bayer et al., 2008]; setting A: 1 000 simulations per move; setting B: 80 000 simulations per move. 69 / 76
  • 79. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Experiments: Average results over 50 Tsumego problems. [Figure: two panels, (a) setting A: 1 000 simulations per move; (b) setting B: 80 000 simulations per move; x-axis: submatrix size N²; y-axis: performance (% correct answers); curves: Nash, Paired, Best, Unpaired, MCTS(1).] MCTS(1): one single MCTS run using all the budget. Setting A (small budget): MCTS(1) outperforms the weighted average of 81 MCTS runs (but we are more parallel!). Setting B (large budget): we outperform MCTS and all the others by far ⇒ consistent with the limited scalability of MCTS for huge numbers of simulations. 70 / 76
  • 82. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Application to games Optimizing position-based random seeds: Partial conclusion. Main conclusion: a novel way of evaluating game values using Nash Equilibria (theoretical validation & experiments on 50 Tsumego problems). The Nash or BestSeed predictor requires far fewer simulations for finding accurate results + is sometimes consistent whereas the original MC is not! We outperformed: the average of MCTS runs sharing the budget; a single MCTS using all the budget. → For M large enough, our weighted averaging of 81 single MCTS runs with M simulations each is better than one MCTS run with 81M simulations :) Take-home messages: We classify positions ("black wins" vs "white wins"). We use a WEIGHTED average of K² MCTS runs of M simulations. Our approach outperforms: all tested voting schemes among K² MCTS estimates of M simulations, and a pure MCTS of K² × M simulations, when M is large and K² = 81. 71 / 76
  • 83. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Conclusion 72 / 76
  • 85. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Adversarial portfolio Conclusion. A work on sparsity, at the core of ZSMG. A parameter-free adversarial bandit, obtained by tuning (no details provided in this talk) + sparsity. Applications of ZSMG: Nash + sparsity → faster and more readable robust decision making. Random seeds = new MCTS variants? Validated as opening book learning (Go, Atari-Go, Domineering, Breakthrough, Draughts, Phantom-Go...). Position-specific seeds validated on Tsumego. 73 / 76
  • 86. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Conclusion 74 / 76
  • 91. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Conclusion Conclusion & Further work. Noisy opt: An algorithm recovering most (but not all: Fabian's rate!) existing results, extended to other surrogate models. ES/DE with resamplings have good rates for linear/quadratic variance and/or robust criteria (UR); for the other cases resamplings are not sufficient for optimal rates ("mutate large, inherit small" + huge population and/or surrogate models...). Portfolio: application to noisy optimization; great benefits with several solvers of a given model. Towards wider applications: portfolios of models? Adversarial portfolio: successful use of sparsity; parameter-free bandits? MCTS and seeds: room for 5 PhDs ... if there is funding for it :-) Most works here → ROBUSTNESS by COMBINATION (robust to solvers, to models, to parameters, to seeds ...) 75 / 76
  • 92. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS Conclusion Thanks for your attention ! Thanks to all the collaborators from Artelys, INRIA, CNRS, Univ. Paris-Saclay, Univ. Paris-Dauphine, Univ. du Littoral, NDHU ... 76 / 76
  • 93. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references I Audibert, J.-Y. and Bubeck, S. (2009). Minimax policies for adversarial and stochastic bandits. In Proceedings of the Annual Conference on Learning Theory (COLT). Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R. E. (1995). Gambling in a rigged casino: the adversarial multi-armed bandit problem. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pages 322–331. IEEE Computer Society Press, Los Alamitos, CA. Auger, A. (2005). Convergence results for the (1, λ)-SA-ES using the theory of φ-irreducible Markov chains. Theoretical Computer Science, 334(1):35–69. Baudiš, P. and Pošík, P. (2014). Online black-box algorithm portfolios for continuous optimization. In Parallel Problem Solving from Nature–PPSN XIII, pages 40–49. Springer. 77 / 76
  • 94. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references II Bayer, A., Bump, D., Daniel, E. B., Denholm, D., Dumonteil, J., Farnebäck, G., Pogonyshev, P., Traber, T., Urvoy, T., and Wallin, I. (2008). GNU Go 3.8 documentation. Technical report, Free Software Foundation. Billingsley, P. (1986). Probability and Measure. John Wiley and Sons. Bouzy, B. (2004). Associating shallow and selective global tree search with Monte Carlo for 9x9 Go. In 4rd Computer and Games Conference, Ramat-Gan. Brenner, J. and Cummings, L. (1972). The Hadamard maximum determinant problem. Amer. Math. Monthly, 79:626–630. Bruegmann, B. (1993). Monte-Carlo Go (unpublished draft http://www.althofer.de/bruegmann-montecarlogo.pdf). 78 / 76
  • 95. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references III Bubeck, S., Munos, R., and Stoltz, G. (2011). Pure exploration in finitely-armed and continuous-armed bandits. Theoretical Computer Science, 412(19):1832–1852. Cazenave, T. (2006). A phantom-go program. In van den Herik, H. J., Hsu, S.-C., Hsu, T.-S., and Donkers, H. H. L. M., editors, Proceedings of Advances in Computer Games, volume 4250 of Lecture Notes in Computer Science, pages 120–125. Springer. Cazenave, T. (2009). Nested monte-carlo search. In Boutilier, C., editor, IJCAI, pages 456–461. Cazenave, T. and Borsboom, J. (2007). Golois wins phantom go tournament. ICGA Journal, 30(3):165–166. 79 / 76
  • 96. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references IV Coulom, R. (2006). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. In P. Ciancarini and H. J. van den Herik, editors, Proceedings of the 5th International Conference on Computers and Games, Turin, Italy, pages 72–83. Cranley, R. and Patterson, T. (1976). Randomization of number theoretic methods for multiple integration. SIAM J. Numer. Anal., 13(6):904–914. Dupač, V. (1957). O Kiefer-Wolfowitzově aproximační methodě. Časopis pro pěstování matematiky, 82(1):47–75. Dvoretzky, A., Kiefer, J., and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Annals of Mathematical Statistics, 27:642–669. Fabian, V. (1967). Stochastic Approximation of Minima with Improved Asymptotic Speed. Annals of Mathematical Statistics, 38:191–200. 80 / 76
  • 97. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references V Gaudel, R., Hoock, J.-B., Pérez, J., Sokolovska, N., and Teytaud, O. (2010). A Principled Method for Exploiting Opening Books. In International Conference on Computers and Games, pages 136–144, Kanazawa, Japan. Gavin, C., Stewart, S., and Drake, P. Result aggregation in root-parallelized computer go. Grigoriadis, M. D. and Khachiyan, L. G. (1995). A sublinear-time randomized approximation algorithm for matrix games. Operations Research Letters, 18(2):53–58. Hadamard, J. (1893). Résolution d'une question relative aux déterminants. Bull. Sci. Math., 17:240–246. Hammersley, J. and Handscomb, D. (1964). Monte Carlo Methods. Methuen & Co. Ltd., London, page 40. 81 / 76
  • 98. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references VI Heidrich-Meisner, V. and Igel, C. (2009). Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search. In ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pages 401–408, New York, NY, USA. ACM. Jebalia, M. and Auger, A. (2008). On multiplicative noise models for stochastic search. In Rudolph, G. et al., editors, Conference on Parallel Problem Solving from Nature (PPSN X), volume 5199, pages 52–61, Berlin, Heidelberg. Springer Verlag. Liu, J., Saint-Pierre, D. L., Teytaud, O., et al. (2014). A mathematically derived number of resamplings for noisy optimization. In Genetic and Evolutionary Computation Conference (GECCO 2014). Mascagni, M. and Chi, H. (2004). On the scrambled Halton sequence. Monte-Carlo Methods Appl., 10(3):435–442. 82 / 76
  • 99. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references VII Mnih, V., Szepesvári, C., and Audibert, J.-Y. (2008). Empirical Bernstein stopping. In ICML '08: Proceedings of the 25th international conference on Machine learning, pages 672–679, New York, NY, USA. ACM. Nagarajan, V., Marcolino, L. S., and Tambe, M. (2015). Every team deserves a second chance: Identifying when things go wrong (student abstract version). In 29th Conference on Artificial Intelligence (AAAI 2015), Texas, USA. Niederreiter, H. (1992). Random Number Generation and Quasi-Monte Carlo Methods. Rechenberg, I. (1973). Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Fromman-Holzboog Verlag, Stuttgart. Robson, J. M. (1983). The complexity of Go. In IFIP Congress, pages 413–417. 83 / 76
  • 100. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references VIII Rolet, P. and Teytaud, O. (2010). Adaptive noisy optimization. In Di Chio, C., Cagnoni, S., Cotta, C., Ebner, M., Ekárt, A., Esparcia-Alcazar, A., Goh, C.-K., Merelo, J., Neri, F., Preuß, M., Togelius, J., and Yannakakis, G., editors, Applications of Evolutionary Computation, volume 6024 of Lecture Notes in Computer Science, pages 592–601. Springer Berlin Heidelberg. Saint-Pierre, D. L. and Teytaud, O. (2014). Nash and the Bandit Approach for Adversarial Portfolios. In CIG 2014 - Computational Intelligence in Games, page 7, Dortmund, Germany. IEEE. Samulowitz, H. and Memisevic, R. (2007). Learning to solve QBF. In Proceedings of the 22nd National Conference on Artificial Intelligence, pages 255–260. AAAI. Shamir, O. (2013). On the complexity of bandit and derivative-free stochastic convex optimization. In COLT 2013 - The 26th Annual Conference on Learning Theory, June 12-14, 2013, Princeton University, NJ, USA, pages 3–24. 84 / 76
  • 101. PORTFOLIO METHODS IN UNCERTAIN CONTEXTS References Some references IX Storn, R. (1996). On the usage of differential evolution for function optimization. In Fuzzy Information Processing Society, 1996. NAFIPS. 1996 Biennial Conference of the North American, pages 519–523. IEEE. von Stengel, B. (2002). Computing equilibria for two-person games. Handbook of Game Theory, 3:1723–1759. Wang, X. and Hickernell, F. (2000). Randomized Halton sequences. Math. Comput. Modelling, 32:887–899. 85 / 76