Keywords and examples of machine learning

Machine learning:
Keywords + Applications
1) Applications of machine learning
- wind power forecasting (important e.g. for PengHu island!)
- rainfall estimation
2) Some key words (you must know what they mean):
- black box / white box
- shrinking horizon
- objective function
- “what you get is what you have”
- model complexity
- cross-validation
- generative model
- quantile, value-at-risk

What you will see
in these slides
- shrinking horizon
- model complexity
- cross-validation
- generative model

I want to produce
electricity

I have:
- water for hydroelectricity
- a nuclear power plant
- wind farms

- gas turbines

I want to produce
electricity

I must ensure, for each time step:

Production of electricity
=
Demand of electricity

Demand(t0), Demand(t1), Demand(t2), Demand(t3) known.

I want to produce
electricity

We get four equations:
Production(t0) = Demand(t0)
Production(t1) = Demand(t1)
Production(t2) = Demand(t2)
Production(t3) = Demand(t3)

Other equation:
Production = hydro-production + nuclear-production
+ wind-farm production + gas production

I want to produce
electricity

H(t0)+W(t0)+N(t0)+G(t0) = Demand(t0)
H(t1)+W(t1)+N(t1)+G(t1) = Demand(t1)
H(t2)+W(t2)+N(t2)+G(t2) = Demand(t2)
H(t3)+W(t3)+N(t3)+G(t3) = Demand(t3)

Stock level for Hydro depends on production
x(1) = x(0)-H(0) x(2) = x(1)-H(1)
x(3) = x(2)-H(2) x(4) = x(3)-H(3)

Also depends on inflows


Stock level for Hydro: x(0); constraint: x(i) >= 0
x(1) = x(0)+I(0)-H(0) x(2) = x(1)+I(1)-H(1)
x(3) = x(2)+I(2)-H(2) x(4) = x(3)+I(3)-H(3)

8 equations
(yes, it increases...)


x(0) + I(0) - H(0) >= 0
x(0) + I(0) - H(0) + I(1) - H(1) >= 0
x(0) + I(0) - H(0) + I(1) - H(1) + I(2) - H(2) >= 0
x(0) + I(0) - H(0) + I(1) - H(1) + I(2) - H(2) + I(3) - H(3) >= 0
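
The stock equations and the constraint x(i) >= 0 can be checked numerically. Here is a minimal sketch in Python; the numbers are made up for illustration, and `stock_levels` and `is_feasible` are hypothetical helper names, not something from the slides.

```python
from itertools import accumulate


def stock_levels(x0, inflows, hydro):
    """Return [x(1), ..., x(T)] with x(i+1) = x(i) + I(i) - H(i)."""
    net = [i - h for i, h in zip(inflows, hydro)]
    # accumulate(..., initial=x0) yields x(0), x(1), ..., x(T); drop x(0).
    return list(accumulate(net, initial=x0))[1:]


def is_feasible(x0, inflows, hydro):
    """A hydro plan is feasible if the stock never becomes negative."""
    return all(x >= 0 for x in stock_levels(x0, inflows, hydro))


# Hypothetical data: initial stock, inflows I(0..3), two candidate plans H(0..3).
x0 = 10.0
inflows = [2.0, 0.0, 1.0, 3.0]
careful_plan = [5.0, 5.0, 2.0, 1.0]
greedy_plan = [5.0, 5.0, 5.0, 1.0]

print(stock_levels(x0, inflows, careful_plan))
print(is_feasible(x0, inflows, careful_plan))
print(is_feasible(x0, inflows, greedy_plan))
```

The greedy plan empties the stock too early, so it violates the third inequality even though each single decision looks harmless.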

Nuclear has constraints as well:

- N(1) in f(N(0))
- N(2) in f(N(1))
- N(3) in f(N(2))

(very simplified; in fact there are stocks, refills...)

Ok! Summary ?
W(0), W(1), W(2), W(3): wind farm production
= cannot be chosen, and
W(1), W(2), W(3) are unknown!

To be chosen:
G(0), G(1), G(2), G(3) gas turbines production

H(0), H(1), H(2), H(3) hydroelectric production
(can be somehow negative)

N(0), N(1), N(2), N(3) nuclear power

Constraints: production plans must satisfy constraints.

E.g.: if unlimited gas turbines production, we might decide
G(0)=demand(0)-W(0), G(1)=demand(1)-W(1),
G(2)=demand(2)-W(2), G(3)=demand(3)-W(3)
==> it is a feasible solution

==> but it is a bad feasible solution

Objective function: not all solutions are equivalent!

Ok! Summary ?

Production cost:

Hcost * (H(0)+H(1)+H(2)+H(3))
+ Ncost * (N(0)+N(1)+N(2)+N(3))
+ Gcost * (G(0)+G(1)+G(2)+G(3))
+ Wcost * (W(0)+W(1)+W(2)+W(3))

Nb: Cost does not only mean $.
Cost means ecological & environmental costs as well.
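
To see why "not all solutions are equivalent", here is a small sketch that evaluates the production cost of two feasible plans. All demands, wind values and unit costs are invented for illustration, and the function names are hypothetical.

```python
def production_cost(plan, unit_cost):
    """Sum over sources of unit cost times total production of that source."""
    return sum(unit_cost[source] * sum(plan[source]) for source in plan)


def meets_demand(plan, demand):
    """Check H(t) + N(t) + G(t) + W(t) = Demand(t) at every time step."""
    return all(
        abs(sum(plan[source][t] for source in plan) - d) < 1e-9
        for t, d in enumerate(demand)
    )


# Hypothetical data for time steps t0..t3 (wind is assumed known here).
demand = [100.0, 120.0, 110.0, 90.0]
wind = [20.0, 30.0, 10.0, 40.0]
unit_cost = {"H": 1.0, "N": 2.0, "G": 10.0, "W": 0.5}  # assumed costs

# Plan 1: gas covers everything that wind does not.
gas_only = {
    "H": [0.0] * 4,
    "N": [0.0] * 4,
    "G": [d - w for d, w in zip(demand, wind)],
    "W": wind,
}

# Plan 2: nuclear and hydro do most of the work, gas fills the rest.
mixed = {
    "H": [10.0, 20.0, 30.0, 0.0],
    "N": [50.0] * 4,
    "G": [20.0, 20.0, 20.0, 0.0],
    "W": wind,
}

for name, plan in (("gas only", gas_only), ("mixed", mixed)):
    print(name, meets_demand(plan, demand), production_cost(plan, unit_cost))
```

Both plans satisfy the demand equations; only the objective function tells them apart.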

Quiz !

So we have:
x0,x1,x2,x3: states at time t0, t1, t2, t3.
x0 is given, x1, x2, x3 depend on our decisions.

Some decisions are chosen at time t0.

The cost depends on all decisions.

Is this a supervised learning problem ?
Is this a reinforcement learning problem ?
Is this a boring problem ?

Ok! Summary ?

So we have equations.
If we know W(1),W(2),W(3),
we can evaluate the production cost.
We want to:
- solve equations
- minimize production cost

Problem: we don't know W(1), W(2), W(3).
How to know ?

Ok! Summary ?

We want to know W(1), W(2), W(3).

Steps:

(1) Weather simulation: we predict the wind
at time steps t1 t2 t3 (as in classical
weather forecast)

(2) From the wind forecast,
predict the power (e.g. “black box” model):
Based on data
E.g. mean-square error

Predicting W(1), W(2), W(3):
Boring problem ?
Supervised learning problem ?
Reinforcement learning problem ?
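
Step (2) can be read as a regression on archived pairs (wind forecast, measured power), fitted by minimizing the mean-square error. The sketch below fits a straight line, which is only an assumption made to keep the example short (real power curves are not linear); the archive values are invented.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b (minimizes the mean-square error)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mean_square_error(model, xs, ys):
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical archive: forecast wind speed (m/s) and measured power (MW).
wind_speed = [3.0, 5.0, 6.0, 8.0, 10.0, 12.0]
power = [0.4, 1.1, 1.6, 2.7, 3.5, 4.6]

model = fit_line(wind_speed, power)
print("MSE on the archive:", mean_square_error(model, wind_speed, power))

# Predict W(1), W(2), W(3) from wind forecasts at t1, t2, t3 (made-up values).
a, b = model
print([a * speed + b for speed in (7.0, 9.0, 4.0)])
```

The model never looks inside the turbine: it only maps inputs to outputs based on data.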

What does “black box” mean ?

Difficulties ?

In many cases, you will see in your life as an engineer that:

- collecting data and models is a big
part of the work

- solving the problem exactly is impossible

- what really matters in an application is to
find where the current codes are
not satisfactory, and not to spend time on
other aspects

Typical questions for
this application

- Many constraints / effects are missing !
(for the real application, we must have far more constraints...)

- Mean square error in the supervised learning for W1,W2,W3 ?
But ... we should penalize cases with W4 small !

- How many time steps in the future should we consider ?

- In case of long term: should we consider a “climate change” bias ?

Some of these points are important, some are negligible,
depending on the system under analysis.

Another beautiful application

This is Paris.
Beautiful town.
With plenty of people
(10 million in Île-de-France).
Producing plenty of fecal
matter ==> dirty water.

Our river in Paris
is the “Seine”.

A French
politician said
he would soon
swim across it.

After all, he never
did it.

For your health,
don't do it.

Nevertheless,
we try
to keep it
as clean
as possible.

Dirty water should be separated from the Seine.
And usually it is.
Something like this:

[Figure: the Seine and the dirty water, kept separate]

Problem: if big rainfalls reach dirty water,
then dirty water might pollute the Seine

No typhoon in France.
But we can have heavy rains/winds in Paris:
- 0.96 dm (96 mm) in 24 hours happened in 1987.
- gusts at 169 km/h in 1999 (very unusual in France)

(yes, in Taiwan it is more impressive,
sometimes it is 16.7 dm in 24 hours and gusts
can reach 250 km/h...)

[Figure: dirty water → Seine!]


Three water networks:

- dirty water: should go to cleaning stations

- clean water: can go to the Seine, but can't be drunk

- drinkable water (France: tap water = drinkable)

Big water network

[Figure: a network of dirty-water stocks and clean-water stocks]

Water vs dirty water

Challenge:
Summer storms.
Not comparable to a Taiwanese typhoon.
But a lot of water.
Can make dirty water become very big.
Can invade clean water.

Your mission:
- Get rid of dirty water
- Protect clean water


State: level in each stock,
valves' status
(open or closed)

At each time step,
rainfalls(i) liters of water reach stock i.
you can open or close valves
==> get a new state.

Typically:
(0, 1, 0, 0, 0, 1, 0, 1, 0.42, 0.2, 0.0, 0.8, 0.3)
(valves) (stock levels)

Plenty of rules:
- if valve 4 opens, then water from stock 1
goes to stock 2 at rate 0.02 m3/s
- if stock[2] > 0.3, then dirty water ==> Seine,
at rate 0.1 m3/s

==> Minimize the quantity of dirty water in clean
stocks at the end of the storm
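
The two rules above can be turned into a toy transition function. This is only a sketch under loud assumptions: stock levels are treated as volumes in m3, lists are indexed from 0, the spill is limited to the excess above the threshold, and the real network has many more rules.

```python
def step(stocks, valves, rainfalls, dt=1.0):
    """One time step of a toy network: returns (new stocks, volume sent to the Seine).

    Assumptions (not from the slides): levels are volumes in m3, dt is in
    seconds, and only the two example rules are modelled.
    """
    new = [s + r for s, r in zip(stocks, rainfalls)]
    to_seine = 0.0

    # Rule 1: if valve 4 is open, water goes from stock 1 to stock 2 at 0.02 m3/s.
    if valves[4]:
        moved = min(new[1], 0.02 * dt)
        new[1] -= moved
        new[2] += moved

    # Rule 2: if stock 2 is above 0.3, dirty water goes to the Seine at 0.1 m3/s
    # (here limited to the excess above 0.3, which is an assumption).
    if new[2] > 0.3:
        spilled = min(new[2] - 0.3, 0.1 * dt)
        new[2] -= spilled
        to_seine += spilled

    return new, to_seine


# State in the style of the slide: 8 valves, then 5 stock levels.
valves = [0, 1, 0, 0, 1, 1, 0, 1]
stocks = [0.42, 0.2, 0.29, 0.8, 0.3]
rainfalls = [0.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical: no rain at this step

print(step(stocks, valves, rainfalls))
```

Chaining calls to `step` over all time steps gives the simulator that an optimizer needs in order to evaluate a sequence of valve decisions.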


Equations:

Stocks(t+1) = complicatedFunction(Stocks(t),
rainfalls(t), valves(t))

- Stocks(t), rainfalls(t): D-dimensional vectors (D=number of stocks)
- valves(t): V-dimensional vector (V=number of valves)

To be decided:
valves(t) for each t

If there are 240 time steps,
we get 240 x V decision variables

Criterion = objective function = quantity of dirty
water reaching the clean network + quantity of
dirty water in the river

Shrinking horizon

Too many time steps!

At each time step, make a decision
using only 30 time steps.

Move this window of 30 time steps.
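
The moving window can be sketched as a loop: plan over the next few steps, apply only the first decision, then move the window. The example below is a toy with one tank and one valve, a window of 3 steps instead of 30 so that brute-force search is possible, and a perfect rain forecast; all of these are assumptions made for illustration.

```python
from itertools import product

CAPACITY = 1.0  # hypothetical tank capacity
RELEASE = 0.5   # hypothetical volume removed per step when the valve is open


def simulate(level, valve_open, rain):
    """Return (new level, overflow) after one time step."""
    level = max(level + rain - (RELEASE if valve_open else 0.0), 0.0)
    overflow = max(level - CAPACITY, 0.0)
    return min(level, CAPACITY), overflow


def plan(level, rains):
    """Brute force over all valve sequences for the window; small windows only."""
    best, best_cost = None, float("inf")
    for valves in product((0, 1), repeat=len(rains)):
        lvl, cost = level, 0.0
        for valve, rain in zip(valves, rains):
            lvl, overflow = simulate(lvl, valve, rain)
            cost += overflow + 0.01 * valve  # small assumed cost for opening
        if cost < best_cost:
            best, best_cost = valves, cost
    return best


def moving_window_control(level, rains, window):
    applied, total_overflow = [], 0.0
    for t in range(len(rains)):
        predicted = rains[t:t + window]  # perfect forecast (assumption)
        valves = plan(level, predicted)
        # Apply only the first decision, then move the window by one step.
        level, overflow = simulate(level, valves[0], rains[t])
        applied.append(valves[0])
        total_overflow += overflow
    return applied, total_overflow


print(moving_window_control(0.8, [0.0, 0.0, 0.9, 0.9], window=3))
```

Each call to `plan` handles at most `window` decisions instead of all 240 x V at once, at the price of ignoring what happens beyond the window.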

Summary ?

Is this:
- an optimization problem ?
- a reinforcement learning problem ?
- a supervised machine learning problem ?

Problem: rainfalls are unknown.

How to predict rainfalls ?

In fact, there are distinct rainfalls:
- R1: a spatial distribution of rainfalls
(one number per time step
per point of the map)
- R2: a list of underground rainfall arrivals (inflows),
per stock (D-dimensional)

Input data:
- weather forecasts ( R1(t) for each t)
- archives of weather forecasts R1(t)
- archives of inflows R2(t)

If your life depended on it, what
would you do ?

We are at time t.
We need a forecaster:
- which takes available data as input
- and outputs R2(t') for t'>=t (why not for t' < t ?)

(R2(t+1),R2(t+2),R2(t+3), .... , R2(t+30))

=?

(R2(t+1),R2(t+2),R2(t+3), .... , R2(t+30))

= f( R1(t) ) ?

(R2(t+1),R2(t+2),R2(t+3), .... , R2(t+30))

= f( R1(t), R1(t-1), R1(t-2), R1(t-3), R1(t-4), ..., R1(t-50) )

(because there are delays)

(R2(t+1),R2(t+2),R2(t+3), .... , R2(t+30))

= f( R1(t), R1(t-1), R1(t-2), R1(t-3), R1(t-4), ..., R1(t-50), R2(t) )

(because “what you get is what you have”)

and then aggregation:
= f( R1(t),
R1(t-1)+R1(t-2),
R1(t-3)+R1(t-4)+R1(t-5)+R1(t-6),
...,
R2(t) )

Why ?

Because fewer parameters.

Rule of thumb: number of parameters
less than number of data points / 20 <=== why ?
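
The aggregation can be sketched as follows. The block sizes 1, 2, 4, ... follow the pattern shown in the formula; continuing the doubling beyond the third block is an assumption, and R1 is reduced to one number per time step to keep the example short.

```python
def aggregated_features(r1_history, r2_now, n_blocks):
    """Build the inputs of f from past rainfall values.

    r1_history[k] is R1(t-k). Blocks of sizes 1, 2, 4, ... are summed, so old
    values are described more coarsely than recent ones.
    """
    features, start, size = [], 0, 1
    for _ in range(n_blocks):
        features.append(sum(r1_history[start:start + size]))
        start += size
        size *= 2  # assumed doubling pattern
    features.append(r2_now)
    return features


# Hypothetical history R1(t), R1(t-1), ..., R1(t-50): 51 values.
history = [float(k % 5) for k in range(51)]
features = aggregated_features(history, r2_now=0.3, n_blocks=6)

print(len(history) + 1, "raw inputs ->", len(features), "aggregated inputs")
```

With a linear f, this replaces 52 coefficients by 7, which makes the rule of thumb much easier to satisfy with a limited archive.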

How to choose between all these models ? Cross-validation.
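
A minimal sketch of cross-validation for model choice: each candidate is fitted on part of the archive and scored on the held-out part, and the candidate with the lowest held-out error wins. The two candidates (a constant and a straight line) and the data are invented for illustration; folds are contiguous blocks, a common choice for time-ordered data.

```python
def fit_constant(xs, ys):
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def cross_validated_mse(xs, ys, fit, k=5):
    """Mean-square error on held-out folds (contiguous blocks of the data)."""
    n = len(xs)
    squared_errors = 0.0
    for fold in range(k):
        held_out = range(fold * n // k, (fold + 1) * n // k)
        train = [i for i in range(n) if i not in held_out]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        squared_errors += sum((predict(xs[i]) - ys[i]) ** 2 for i in held_out)
    return squared_errors / n


# Hypothetical archive: one input feature and the inflow to predict.
xs = [float(i) for i in range(10)]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2, 16.9, 19.1]

candidates = {"constant": fit_constant, "line": fit_line}
scores = {name: cross_validated_mse(xs, ys, fit) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

The same loop works for choosing the number of aggregated blocks: each choice is one more entry in `candidates`.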

Main weakness of this analysis ?

The same as in the previous application.

We predicted R2(t), R2(t+1), ....

Then we maximize cleanness based on these forecasts.

But there are huge uncertainties.

This is often done in the real world.
Often, we do not spend time
on checking that the consequences
are minor.

“Cheap” solutions (do not take too much time):

- predicting a quantile (do you know how ?)
instead of a conditional expectation
and check on simulations.
No change on the optimization algorithm
(we are just pessimistic in the forecasts)

- predicting a conditional expectation +
moments (do you know how ?)
Then, optimize on average
(slight change in the objective function)
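
One standard way to predict a quantile, offered here as a possible answer and not necessarily the one intended in the slides, is to replace the squared error by the pinball loss. The sketch below shows the simplest case, a constant prediction: the value minimizing the total pinball loss is an empirical quantile of the data. The inflow values are invented.

```python
def pinball_loss(prediction, observed, q):
    """Asymmetric loss: under-prediction costs q, over-prediction costs 1 - q."""
    diff = observed - prediction
    return q * diff if diff >= 0 else (q - 1) * diff


def best_constant(observations, q):
    """Constant prediction (among the observed values) with the lowest total loss."""
    return min(
        observations,
        key=lambda c: sum(pinball_loss(c, y, q) for y in observations),
    )


# Hypothetical archive of inflows.
inflows = [4.0, 9.0, 1.0, 7.0, 11.0, 3.0, 6.0, 10.0, 2.0, 8.0, 5.0]

print("median forecast:", best_constant(inflows, 0.5))
print("pessimistic forecast (90% quantile):", best_constant(inflows, 0.9))
```

Feeding the pessimistic forecast to the unchanged optimizer is the "cheap" solution of the first bullet; with a model f instead of a constant, the same loss is minimized over the parameters of f.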

What about an exact solution ?

The exact solution is much harder to implement.

We can use forecasts with moments.

Then, we get an MDP.

Then, this is reinforcement learning.

- simple: forecasting + optimizing
- a bit more complex: pessimistic forecasting + optimizing
- more complex: forecasting with moments + optimizing on average
or optimizing a quantile (“value at risk”)
- advanced: full reinforcement learning model
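
The difference between "optimizing on average" and "optimizing a quantile" can be seen on a toy comparison of two plans evaluated on simulated scenarios. The costs below are invented, and `value_at_risk` uses a simple empirical quantile (one convention among several).

```python
import math


def mean_cost(costs):
    return sum(costs) / len(costs)


def value_at_risk(costs, level):
    """Empirical quantile: smallest cost c such that at least a fraction
    `level` of the scenarios cost c or less."""
    ordered = sorted(costs)
    index = max(math.ceil(level * len(ordered)) - 1, 0)
    return ordered[index]


# Hypothetical costs of two plans over 7 simulated rainfall scenarios.
plan_a = [10.0, 10.0, 10.0, 10.0, 10.0, 60.0, 60.0]  # cheap, sometimes very bad
plan_b = [25.0] * 7                                   # never great, never bad

for name, costs in (("A", plan_a), ("B", plan_b)):
    print(name, "mean:", mean_cost(costs), "VaR 80%:", value_at_risk(costs, 0.8))
```

Plan A wins on average, plan B wins on the quantile: the choice of objective function changes the decision.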

The best choice depends on the precision of your model,
the budget you have.

Some problems involve billions of US $ and have precise models.
Then, each percent of improvement represents more money than
all your professional life. Then, you can (must)
implement something very advanced.

Sometimes, models are very imprecise.
Then, optimizing at 0.001% is meaningless. Improving the model
is more important.

What do you think ?
Did you understand ?

- shrinking horizon
- model complexity
- cross-validation
- generative model

===> olivier.teytaud@inria.fr
