Min Sun (孫民) / Artificial Intelligence from a Computer Vision Perspective: The Next Big Thing
1. Artificial Intelligence: The Next Big Thing from a Computer Vision Perspective
VSLab, Department of Electrical Engineering, National Tsing Hua University (清大電機)
Min Sun (孫民)
2. What's the Next Big Thing?
http://research.microsoft.com/en-us/um/redmond/events/fs2015
3. Goal
"Big data being the source, machine learning being the technique, and AI being the outcome" (Prof. Hsuan-Tien Lin, IEEE BigData 2016)
Many kinds of sources (data) and outcomes (AI tasks) can be trained end-to-end using Deep Learning (DL).
12. DL Fuses AI Subfields
• Vision and Language (http://mscoco.org/)
• Vision and Control (Atari Breakout game & AlphaGo, DeepMind) -> AGI
• Multiple Encoding and Decoding
13. Image Captioning
f( [image] ) = "The man at bat is ready to swing at the pitch."
Vision: Convolutional Neural Network (CNN) (credit: wiki)
Language: Recurrent Neural Network (RNN) (credit: Nature)
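A minimal sketch of that f: a CNN encodes the image into a feature vector, an RNN decodes it into a word sequence. Layer sizes, module names, and the tiny stand-in CNN are illustrative assumptions, not the actual captioning model behind the slide:

```python
# Minimal CNN-encoder -> RNN-decoder captioner (illustrative sizes, not the deck's model).
import torch
import torch.nn as nn

class CaptionNet(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Vision: a tiny CNN stands in for a pretrained image encoder.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Language: an LSTM decodes the image feature into a word sequence.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        feat = self.cnn(images).unsqueeze(1)    # (B, 1, embed_dim)
        words = self.embed(captions)            # (B, T, embed_dim)
        seq = torch.cat([feat, words], dim=1)   # image feature acts as the first "token"
        hidden, _ = self.rnn(seq)
        return self.out(hidden)                 # logits over the vocabulary at each step

# One forward pass on random data, just to show the shapes line up.
model = CaptionNet(vocab_size=1000)
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 1000, (2, 12)))
print(logits.shape)  # torch.Size([2, 13, 1000])
```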
19. DL Fuses AI Subfields
• Vision and Language (http://mscoco.org/)
• Vision and Control (Atari Breakout game & AlphaGo, DeepMind) -> AGI
• Multiple Encoding and Decoding
20. Vision and Control (https://gym.openai.com/)
• Learning to play games with weak supervision: Reinforcement Learning (RL)
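To make "weak supervision" concrete, here is a hypothetical random agent interacting with Breakout through the Gym interface linked above. This assumes the classic pre-0.26 `gym` API (4-tuple return from `step`) and that Gym's Atari extras are installed; the exact environment id varies across versions:

```python
# Random agent on Breakout via the classic OpenAI Gym API (pre-0.26 step signature).
import gym

env = gym.make("Breakout-v0")   # requires gym's Atari extras to be installed
state = env.reset()
total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()           # weak supervision: no labels per frame
    state, reward, done, info = env.step(action)
    total_reward += reward                       # the only learning signal RL gets
env.close()
print("episode reward:", total_reward)
```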
21. Where It All Begins …
"Playing Atari with Deep Reinforcement Learning," by DeepMind, at the NIPS 2013 Deep Learning Workshop
(slides by Yen-Chen Lin)
22. Control: Learning to Act
Playing Breakout amounts to:
• Input: screen images
• Output: actions (do nothing | left | right)
That is supervised classification (a sketch follows below).
(slides by Yen-Chen Lin)
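A sketch of this supervised framing, with all names and shapes illustrative rather than from the deck: screens go in, a 3-way action label comes out, and a cross-entropy loss trains the classifier on expert actions.

```python
# Treating "screen -> action" as plain 3-way classification (behavior cloning).
import torch
import torch.nn as nn

n_actions = 3                            # do nothing | left | right
policy = nn.Sequential(                  # toy stand-in for a real vision network
    nn.Flatten(),
    nn.Linear(84 * 84, 128), nn.ReLU(),
    nn.Linear(128, n_actions),
)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(policy.parameters(), lr=1e-2)

# Fake "expert" data: screens and the actions an expert took on them.
screens = torch.randn(32, 1, 84, 84)
expert_actions = torch.randint(0, n_actions, (32,))

logits = policy(screens)
loss = loss_fn(logits, expert_actions)   # needs an expert to supply the labels
opt.zero_grad()
loss.backward()
opt.step()
print("loss:", loss.item())
```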
23. Supervised Solution
• Training data: recorded expert game sessions
• Target label: the action the expert takes at every step
Problems:
• What if there's no expert?
• This is not how humans learn.
(slides by Yen-Chen Lin)
24. How Humans Learn
• We don't need somebody to tell us a million times which move to choose at each screen
• We just need occasional feedback that we did the right thing
(slides by Yen-Chen Lin)
25. Reinforcement Learning
• Somewhere between supervised and unsupervised learning
• Sparse and time-delayed labels: rewards
Based only on those rewards, the agent has to learn to behave in the environment. A rational agent should optimize total reward.
(slides by Yen-Chen Lin)
27. Markov Decision Process
• State
• Action
• Reward
The probability of the next state s_{i+1} depends only on the current state s_i and action a_i: P(s_{i+1} | s_i, a_i, …, s_0, a_0) = P(s_{i+1} | s_i, a_i).
(slides by Yen-Chen Lin)
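A toy illustration of the Markov property, with hypothetical states and probabilities: the next-state distribution is looked up from (state, action) alone, never from earlier history.

```python
# Toy MDP: next-state probabilities depend only on (state, action), nothing earlier.
import random

# P[(state, action)] -> list of (next_state, probability)
P = {
    ("s0", "left"):  [("s0", 0.8), ("s1", 0.2)],
    ("s0", "right"): [("s1", 1.0)],
    ("s1", "left"):  [("s0", 1.0)],
    ("s1", "right"): [("s1", 0.5), ("s0", 0.5)],
}

def step(state, action):
    """Sample s_{i+1} using only s_i and a_i: the Markov property."""
    next_states, probs = zip(*P[(state, action)])
    return random.choices(next_states, weights=probs)[0]

print(step("s0", "right"))  # always 's1'
```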
28. Episode
One episode of this process (e.g. one game) forms a finite sequence of states, actions and rewards:
s_0, a_0, r_1, s_1, a_1, r_2, …, s_{n-1}, a_{n-1}, r_n, s_n
(slides by Yen-Chen Lin)
29. Example: Breakout
• State: game screen
• Action: 1. do nothing 2. left 3. right
• Reward: game score
(slides by Yen-Chen Lin)
30. Example: Breakout
• State: successive game screens
• Action: 1. do nothing 2. left 3. right
• Reward: game score
(slides by Yen-Chen Lin)
31. Reward
Total reward: R = r_1 + r_2 + … + r_n
Total future reward (from time step t): R_t = r_t + r_{t+1} + … + r_n
• To perform well, we should also take future rewards into account. How do we do that?
(slides by Yen-Chen Lin)
32. Discounted Future Reward
• However, since the environment is stochastic, intuitively one should earn reward as soon as possible.
Total discounted future reward (with discount factor 0 ≤ γ ≤ 1):
R_t = r_t + γ r_{t+1} + γ^2 r_{t+2} + … + γ^(n-t) r_n = r_t + γ R_{t+1}
(slides by Yen-Chen Lin)
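The recursion R_t = r_t + γ R_{t+1} translates directly to a few lines of code; a minimal sketch with a made-up reward sequence:

```python
# Discounted future return via the recursion R_t = r_t + gamma * R_{t+1}.
def discounted_returns(rewards, gamma=0.99):
    returns, running = [], 0.0
    for r in reversed(rewards):   # sweep backwards so R_{t+1} is already known
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]          # put R_0 first again

rewards = [0, 0, 1, 0, 1]         # sparse, time-delayed rewards
print(discounted_returns(rewards))  # gamma=1 would give plain future sums
```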
33. Q Function
• Q(s, a): the maximum discounted future reward when we perform action a in state s, and continue optimally from that point on: Q(s_t, a_t) = max R_{t+1}
It represents the "quality" of a certain action in a given state.
(slides by Yen-Chen Lin)
34. How to Choose an Action?
If we know the Q function, we can act greedily: π(s) = argmax_a Q(s, a)
Here π represents the policy, the rule for how we choose an action in each state.
(slides by Yen-Chen Lin)
35. Q Function Implementation

           action 0   action 1   action 2
state 0       -2         -1          5
state 1        3          2          3
state 2        5          6         -6

(slides by Yen-Chen Lin)
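With the table representation, acting is just a row lookup plus an argmax. A minimal sketch of the slide's table, assuming numpy is available:

```python
# The slide's Q-table: rows are states, columns are actions.
import numpy as np

Q = np.array([[-2, -1,  5],    # state 0
              [ 3,  2,  3],    # state 1
              [ 5,  6, -6]])   # state 2

def policy(state):
    """pi(s) = argmax_a Q(s, a): pick the column with the highest Q-value."""
    return int(np.argmax(Q[state]))

for s in range(3):
    print(f"state {s}: best action = {policy(s)}, Q = {Q[s, policy(s)]}")
# state 0 -> action 2, state 1 -> action 0, state 2 -> action 1
```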
36. If We Use Pixels as State
1. Resize images to 84x84
2. Convert to grayscale with 256 levels
3. Use the last 4 frames to represent state
256^(84x84x4) = 256^28224 ≈ 10^67970 possible game states (since 28224 × log10 256 ≈ 67970).
We can never cover all the cases!
(slides by Yen-Chen Lin)
37. Vision & Control: Deep Q Network
We use a CNN to represent the Q function, which takes:
• Input: the state (4 game screens)
• Output: Q-values of the different actions a (i.e., Q(s, a) for every a)
π(s) = argmax_a Q(s, a), where s is the stack of screens
(slides by Yen-Chen Lin)
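A sketch of such a network, roughly following the layer sizes of DeepMind's Nature DQN (32/64/64 conv filters, 512 hidden units). This shows only the architecture shape; the full method also needs experience replay and a training loop, omitted here:

```python
# CNN that maps a stack of 4 game screens to one Q-value per action (DQN-style).
import torch
import torch.nn as nn

class DQN(nn.Module):
    def __init__(self, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),   # 84x84 -> 20x20
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),  # 20x20 -> 9x9
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),  # 9x9 -> 7x7
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),       # Q(s, a) for every action in one pass
        )

    def forward(self, screens):              # screens: (B, 4, 84, 84), scaled to [0, 1]
        return self.net(screens)

q_net = DQN()
state = torch.rand(1, 4, 84, 84)              # last 4 preprocessed frames
q_values = q_net(state)
action = q_values.argmax(dim=1)               # pi(s) = argmax_a Q(s, a)
print(q_values.shape, action.item())          # torch.Size([1, 3]) and an action index
```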
38. Fusing Multiple Sensors
[figure: a kettle grasped with a medium wrap (thumb + 4 fingers); annotations mark the manipulation region and a side view]
Chan et al., ECCV 2015, from VSLab
44. Take-Home Message
• Encoding Source (data)
  – N-D observation
  – N-D sequence of observations
• Decoding Outcome (AI tasks)
  – N-D single output
  – N-D open-ended sequence as output
• Multiple Encoding and Decoding
• If each module is differentiable / approximately differentiable -> End-to-End Learning
We now have many tools to tackle Artificial General Intelligence. Just try! The worst thing you can do is nothing.
46. Questions
• Can I simply ask my engineers to use open-source deep learning tools to create new products?
Answer: Yes and not really.
Yes, if you want to complete a well-known task, but Google's MLaaS products will almost always beat you.
Not really, if you want to solve your own problem with your own data. You need talent, or you need to make your engineers unafraid of failure.
47. Where Can I Find Talent?
• Most of the talent consists of PhD students or young professionals in the US and EU.
http://www.economist.com/news/business/21695908-silicon-valley-fights-talent-universities-struggle-hold-their
How can we compete?
48. Local Students
• Our students know deep learning is HOT!
[Deep Learning Workshop at Academia Sinica (中研院): 500 participants]
49. Case Study: NTHU@TW Undergraduate
https://github.com/yenchenlin1994/DeepLearningFlappyBird
51. To-Do for Local Students
• We need more students to work on
  – realistic deep learning projects with
  – enough compute resources
• We need some of them to stay in our local industry
Advanced Deep Learning Course at NTHU (academic year 105, i.e. 2016-17):
1. Taught by a group of professors
2. Topics include the latest DNN models, distributed training, and DL for embedded systems
3. Sponsored by MTK and the ITRI Big Data Center (巨資中心)
4. More sponsors are welcome!
52. For Talent Abroad
Get in the talent race!
http://cvpr2016.thecvf.com/exhibit/industry_expo