3. Co-evolution of Brain Science and Artificial Intelligence: Vision
Brain Science / Artificial Intelligence
Multilayer supervised learning (Amari 1967)
Neocognitron (Fukushima 1980)
ConvNet (Krizhevsky, Sutskever, Hinton 2012)
Google Brain (2012)
Place cells (O'Keefe 1976)
Face cells (Bruce, Desimone, Gross 1981; Sugase et al. 1999)
[Excerpt from Sugase et al. 1999] ... F2 (monkey expression) peaks before levelling off. F1 (monkey identity) and F4 (human expression) were encoded at a much lower level than the others. The response latency of this neuron was 53 ms, and the latencies for the information about each category were as follows: G, 45; F1, 157; F2, 93; F3, 125; and F4, 125 ms. Thus, the earliest transient discharge conveyed global information, and the later sustained discharge encoded one category of the fine information best.
Of the 86 face-responsive neurons, 11 neurons (13%) did not encode significant information in any category, 43 (50%) coded either global or fine category information, and 32 (37%) coded both categories. ...
[Excerpt from O'Keefe 1976] Fig. 2. Place fields for all place units except 21342 and those from animal 217. Fig. 3. Place fields for place units from animal 217.
The concentration of fields from the other animals in arm B may have reflected the fact that many of the rats spent their "free time" in this arm. The fact that the initial search for units was conducted there might also have introduced a bias towards units active in that area. In any case, it was clear that the majority of fields were not located in those places which contained the rewards or other ...
Experience-dependent learning (Blakemore & Cooper 1970)
[Excerpt from Hubel & Wiesel 1959, 'Receptive fields in cat striate cortex'] ... found by changing the size, shape and orientation of the stimulus until a clear response was evoked. Often when a region with excitatory or inhibitory responses was established, the neighbouring opposing areas in the receptive field could only be demonstrated indirectly. Such an indirect method is illustrated in Fig. 3B, where two flanking areas are indicated by using a short slit in various positions like the hand of a clock, always including the very centre of the field. The findings thus agree qualitatively with those obtained with a small spot (Fig. 2a).
Fig. 3. Same unit as in Fig. 2. A, responses to shining a rectangular light spot, 1° × 8°; centre of slit superimposed on centre of receptive field; successive stimuli rotated clockwise, as shown to left of figure. B, responses to a 1° × 5° slit oriented in various directions, with one end always covering the centre of the receptive field: note that this central region evoked responses when stimulated alone (Fig. 2a). Stimulus and background intensities as in Fig. 1; stimulus duration 1 sec.
Receptive fields having a central area and opposing flanks represented a common pattern, but several variations were seen. Some fields had long narrow central regions with extensive flanking areas (Figs. 1-3); others had a large central area and concentrated slit-shaped flanks (Figs. 6, 9, 10). In many fields the two flanking regions were asymmetrical, differing in size and shape; in these a given spot gave unequal responses in symmetrically corresponding ...
Feature-detecting cells (Hubel & Wiesel 1959)
Perceptron (Rosenblatt 1962)
4. Co-evolution of Brain Science and Artificial Intelligence: Reinforcement Learning
Brain Science / Artificial Intelligence
[Figure: responses under delay 3 s, 6 s, and 9 s conditions; ** marks significance]
Classical conditioning (Pavlov 1903)
Operant conditioning (Thorndike 1898)
Deep Q network (Mnih et al. 2015)
[Excerpt from Mnih et al. 2015, Nature] ... difficult and engaging for human players. We used the same network architecture, hyperparameter values (see Extended Data Table 1) and learning procedure throughout, taking high-dimensional data (210 × 160 colour video at 60 Hz) as input, to demonstrate that our approach robustly learns successful policies over a variety of games based solely ...
We compared DQN with the best performing methods from the reinforcement learning literature on the 49 games where results were available. In addition to the learned agents, we also report scores for a professional human games tester playing under controlled conditions and a policy that selects actions uniformly at random (Extended Data ...
Figure 1 | Schematic illustration of the convolutional neural network. The details of the architecture are explained in the Methods. The input to the neural network consists of an 84 × 84 × 4 image produced by the preprocessing map φ, followed by three convolutional layers (note: snaking blue line symbolizes sliding of each filter across input image) and two fully connected layers with a single output for each valid action. Each hidden layer is followed by a rectifier nonlinearity (that is, max(0, x)).
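To make the caption's architecture concrete, here is a minimal PyTorch sketch. The layer sequence and input shape follow the caption; the filter counts, kernel sizes, and strides are the values reported in the paper's Methods, and n_actions=18 is just an illustrative choice.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),   # 4x84x84 -> 32x20x20
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),  # -> 64x9x9
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),  # -> 64x7x7
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),                  # fully connected
            nn.Linear(512, n_actions),                              # one Q output per valid action
        )

    def forward(self, x):
        return self.net(x)

q = DQN(n_actions=18)                      # Atari games use up to 18 discrete actions
print(q(torch.zeros(1, 4, 84, 84)).shape)  # torch.Size([1, 18])
```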
TD learning (Sutton et al. 1983)
Reward prediction error responses of dopamine neurons (Schultz et al. 1993, 1997)
Dopamine-dependent synaptic plasticity (Wickens et al. 2000)
[Excerpt from Schultz 1998] ... fails to occur, even in the absence of an immediately preceding stimulus (Fig. 2, bottom). This is observed when animals fail to obtain reward because of erroneous behavior, when liquid flow is stopped by the experimenter despite correct behavior, or when a valve opens audibly without delivering liquid (Hollerman and Schultz 1996; Ljungberg et al. 1991; Schultz et al. 1993). When reward delivery is delayed for 0.5 or 1.0 s, a depression of neuronal activity occurs at the regular time of the reward, and an activation follows the reward at the new time (Hollerman and Schultz 1996). Both responses occur only during a few repetitions until the new time of reward delivery becomes predicted again. By contrast, delivering reward earlier than habitual results in an activation at the new time of reward but fails to induce a depression at the habitual time. This suggests that unusually early reward delivery cancels the reward prediction for the habitual time. Thus dopamine neurons monitor both the occurrence and the time of reward. In the absence of stimuli immediately preceding the omitted reward, the depressions do not constitute a simple neuronal response but reflect an expectation process based on an internal clock tracking the precise time of predicted reward.
Activation by conditioned, reward-predicting stimuli
About 55-70% of dopamine neurons are activated by conditioned visual and auditory stimuli in the various classically or instrumentally conditioned tasks described earlier (Fig. 2, middle and bottom) (Hollerman and Schultz 1996; Ljungberg et al. 1991, 1992; Mirenowicz and Schultz 1994; Schultz 1986; Schultz and Romo 1990; P. Waelti, J. Mirenowicz, and W. Schultz, unpublished data). The first dopamine responses to conditioned light were reported by Miller et al. (1981) in rats treated with haloperidol, which increased the incidence and spontaneous activity of dopamine neurons but resulted in more sustained responses than in undrugged animals. Although responses occur close to behavioral reactions (Nishino et al. 1987), they are unrelated to arm and eye movements themselves, as they occur also ipsilateral to the moving arm and in trials without arm or eye movements (Schultz and Romo 1990). Conditioned stimuli are somewhat less effective than primary rewards in terms of response ...
FIG. 2. Dopamine neurons report rewards according to an error in reward prediction. Top: drop of liquid occurs although no reward is predicted at this time. Occurrence of reward thus constitutes a positive error in the prediction of reward. Dopamine neuron is activated by the unpredicted ...
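The omission, delay, and transfer effects described above are what a temporal-difference (TD) error reproduces. Below is a toy simulation (my own illustration, not an analysis from the paper): a cue at t = 2 predicts a reward at t = 7, values are clamped to zero before the cue because reward is unpredictable there, and the TD error is printed before learning, after learning, and on an omission trial.

```python
import numpy as np

n_steps, cue_t, reward_t = 10, 2, 7
gamma, alpha = 1.0, 0.2
V = np.zeros(n_steps + 1)   # V[t]: predicted future reward from time step t

def run_trial(V, rewarded=True, learn=True):
    delta = np.zeros(n_steps)
    for t in range(n_steps):
        r = 1.0 if (rewarded and t == reward_t) else 0.0
        v_t = V[t] if t >= cue_t else 0.0          # no prediction before the cue
        v_next = V[t + 1] if t + 1 >= cue_t else 0.0
        delta[t] = r + gamma * v_next - v_t        # TD error at time t
        if learn and t >= cue_t:
            V[t] += alpha * delta[t]
    return delta

print(run_trial(V, learn=False).round(2))  # naive: positive error at reward time (t=7)
for _ in range(300):
    run_trial(V)                           # training with predictable reward
print(run_trial(V, learn=False).round(2))  # trained: error has moved to the cue transition
print(run_trial(V, rewarded=False, learn=False).round(2))  # omission: negative dip at t=7
```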
Action value representation in the striatum (Samejima et al. 2005)
Basal ganglia TD learning hypothesis (Barto et al. 1995; Montague et al. 1996)
[Diagram labels: V(s), Q(s,a)]
11. Model-based action planning involves cortico-cerebellar and basal ganglia networks
Alan S. R. Fermin, Takehiko Yoshida, Junichiro Yoshimoto, Makoto Ito, Saori C. Tanaka & Kenji Doya
Humans can select actions by learning, planning, or retrieving motor memories. Reinforcement Learning (RL) associates these processes with three major classes of strategies for action selection: exploratory RL learns state-action values by exploration, model-based RL uses internal models to simulate future states reached by hypothetical actions, and motor-memory RL selects past successful state-action mapping. In order to investigate the neural substrates that implement these strategies, we conducted a functional magnetic resonance imaging (fMRI) experiment while humans performed a ... ventromedial prefrontal cortex and ventral striatum increased activity in the exploratory condition; the dorsolateral prefrontal cortex, dorsomedial striatum, and lateral cerebellum in the model-based condition; and the supplementary motor area, putamen, and anterior cerebellum in the motor-memory condition ... implements the model-based RL action selection strategy.
Using exploration and reward feedback, humans and other animals have a remarkable capacity to learn new motor behaviors without explicit teaching. Throughout most of our lives, however, we depend on explicit or implicit knowledge, based upon past experiences, such as a map of the area or properties of the musculoskeletal system, to enable focused exploration and efficient learning. After repeated practice, a motor behavior becomes stereotyped and can be executed with little mental load. What brain mechanisms enable animals to employ different learning strategies and to select or integrate them in a given situation? In this paper, we take a new behavioral paradigm that captures different stages of motor learning during a single experimental session, and using fMRI we explore brain structures that are specifically involved in implementing different learning strategies.
n Mental simulation in the brain involves the cerebral cortex, basal ganglia, and cerebellum
[Figure panels: Condition 1, Condition 2, Condition 3A, Condition 3B]
14. Why can brain circuits be flexibly activated and connected?
Computational theory
n Learning of gating (Jacobs, Jordan 1991; Singh 1992)
n Selection/integration by prediction error
l MOSAIC architecture (Wolpert, Kawato 1998; Sugimoto et al. 2012)
n Selection/integration by uncertainty
l Bayesian cue integration (Deneve et al. 2001; Daw et al. 2005; see the sketch below)
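As a minimal illustration of the Bayesian cue integration idea cited above (a textbook Gaussian example, not the specific network models of Deneve et al. or Daw et al.): two noisy cues are combined by weighting each with its inverse variance, so the more reliable cue dominates.

```python
def integrate(mu1, var1, mu2, var2):
    """Posterior mean and variance for two independent Gaussian cues."""
    w1, w2 = 1.0 / var1, 1.0 / var2          # precisions
    mu = (w1 * mu1 + w2 * mu2) / (w1 + w2)   # precision-weighted average
    return mu, 1.0 / (w1 + w2)

# The sharper cue (variance 1) pulls the estimate toward itself: (13.2, 0.8)
print(integrate(10.0, 4.0, 14.0, 1.0))
```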
Neural mechanisms
n Gating by the basal ganglia (Eliasmith et al. 2012)
n Thalamus / claustrum (Crick 1984; Crick, Koch 2005)
n Dendritic/axonal inhibition (Wang & Yang 2018)
n Oscillation/spike synchrony (von der Malsburg; Singer)
[Excerpt from 'MOSAIC for Multiple-Reward Environments' (Sugimoto et al.)] Figure 1: Schematic diagram of MOSAIC-MR. Reward modules (top left) predict the immediate reward (r̂_i), and forward modules (bottom left) predict the local dynamics of the environment (ẋ̂_j). Reward and forward modules decompose a complex environment into subenvironments on the basis of prediction errors (r − r̂_i) and (ẋ − ẋ̂_j), respectively. RL modules (displayed at right) output actions (u_k) appropriate for each subenvironment. The learning and control of each RL module are weighted based on the TD error (δ_k), that is, on how well each RL controller predicts the expected discounted future rewards.
... modules is performed on the basis of the prediction errors of the reward and the dynamics, respectively. The RL module, which receives the reward and dynamics predictions (r̂(t), ẋ̂(t)) and the weights of each module in ...
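A minimal sketch of the module weighting described in the caption above: each module's responsibility is a softmax over its (scaled, negative) squared prediction error, and each module's learning and control would be multiplied by that weight. The toy numbers and the error scale sigma are assumptions for illustration, not values from the paper.

```python
import numpy as np

def responsibilities(predictions, observation, sigma=0.5):
    """Softmax weighting of modules by their squared prediction errors."""
    errors = (observation - predictions) ** 2
    logits = -errors / (2.0 * sigma**2)
    w = np.exp(logits - logits.max())    # shift by max for numerical stability
    return w / w.sum()

# Three forward modules predicting the next observation of the environment:
predictions = np.array([0.9, 0.2, -0.5])
observation = 1.0
w = responsibilities(predictions, observation)
print(w.round(3))  # module 0 predicts best, so it gets most of the weight

# Each RL module's update would then be scaled by its weight, e.g.:
# V_k(s) += alpha * w[k] * delta_k
```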
[Figure 4 from Wang & Yang 2018, Current Opinion in Neurobiology: input gating, output gating, synchronous gating, recurrent gating, and striatal/thalamic gating among cortical input/output pathways, basal ganglia, and thalamus]
Various mechanisms for information gating in the brain. Input gating can be achieved by dendrite-targeting interneurons that selectively control inputs to pyramidal dendrites. In the synchronous gating mechanism, communication between two areas depends on the degree of temporal synchrony of neural activity between the source and target areas. The recurrent gating mechanism involves selective integration of inputs based on context-dependent dynamics of the network. Output gating is instantiated by perisoma-targeting interneurons that specifically inhibit pyramidal neurons projecting to one pathway but not others. Gating may also involve subcortical structures, especially the basal ganglia and thalamus.
15. Reinforcement Learning
n Reward prediction: value functions
l V(s) = E[ r(t) + γr(t+1) + γ²r(t+2) + … | s(t)=s ]
l Q(s,a) = E[ r(t) + γr(t+1) + γ²r(t+2) + … | s(t)=s, a(t)=a ]
n Action selection
l Greedy: a = argmax_a Q(s,a)
l Boltzmann: P(a|s) ∝ exp[ βQ(s,a) ]
n Prediction update: TD error
l δ(t) = r(t) + γV(s(t+1)) − V(s(t))
l ΔV(s(t)) = αδ(t)
l ΔQ(s(t),a(t)) = αδ(t)
What circuit mechanisms implement these steps?
What mechanisms control these parameters (γ, β, α)? (A code sketch of the steps above follows.)
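Below is a minimal tabular sketch of the three steps above: value prediction, Boltzmann action selection, and the TD-error updates δ(t), ΔV, ΔQ as written on this slide. The small chain environment (step) is a hypothetical toy, not from the talk.

```python
import numpy as np

n_states, n_actions = 5, 2
gamma, alpha, beta = 0.9, 0.1, 2.0   # discount, learning rate, inverse temperature
V = np.zeros(n_states)               # state values V(s)
Q = np.zeros((n_states, n_actions))  # action values Q(s, a)
rng = np.random.default_rng(0)

def boltzmann(q_row):
    """P(a|s) proportional to exp(beta * Q(s,a)); shift by max for stability."""
    p = np.exp(beta * (q_row - q_row.max()))
    return p / p.sum()

def step(s, a):
    """Hypothetical chain world: a=1 moves right, a=0 stays; reward at the goal."""
    s_next = min(s + a, n_states - 1)
    return s_next, float(s_next == n_states - 1)

for episode in range(200):
    s = 0
    while s < n_states - 1:
        a = rng.choice(n_actions, p=boltzmann(Q[s]))
        s_next, r = step(s, a)
        delta = r + gamma * V[s_next] - V[s]  # TD error: d(t) = r(t) + gV(s(t+1)) - V(s(t))
        V[s] += alpha * delta                 # DV(s(t)) = a d(t)
        Q[s, a] += alpha * delta              # DQ(s(t),a(t)) = a d(t)
        s = s_next

print(V.round(2), Q.round(2))
```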
25. Polymorphism of Behavioral Strategies (Elfwing et al. 2015)
n Food-seeking "Forager" robots and mate-seeking "Tracker" robots
n Evolutionary stability
Figure 2. Correlation between the average mating learning performance (ts/mating) and the average fitness in the final 20 generations in all experiments (R = −0.84, p < 0.0001). The learning performance was estimated as the number of time steps the mating behavior was selected, divided by the number of mating events. The seven types of markers indicate the number of energy sources in the environment for each simulation.
Figure 3. Example trajectories of the learned behaviors for the roamer strategy and the stayer strategy. (a) The roamer ignores the tail-lamp of the mating partner and executes the learned foraging behavior to capture the energy source. (b) The stayer executes the learned waiting behavior and adjusts its position according to the trajectory of the mating partner.
[Figure: (a) learned weights w1 vs. w5 for roamers and stayers; (b) distribution of the mating energy level Ēm as percent of population for roamers and stayers]
[Slide annotations: weight on face input; mating-behavior bias; energy level at which mating behavior is selected; Forager (貪食型) and Tracker (好色型) phenotypes; proportion]
[Figure 5 panels, all plotted against stayer proportion: (a) average number of mating events (roamers, stayers, population); (b) mating-event ratios M̄r→s/M̄r, M̄s→s/M̄s, M̄r/M̄s; (c) average mating energy level; (d) average fitness]
Figure 5. Average number of mating events, average proportion of mating events with stayer mating partners, average energy level at the mating events, and average fitness, as functions of the stayer proportion in the population, for the roamer (green solid lines with circles) and stayer (red solid lines with circles) subpopulations. (a) The dotted lines show the best linear fit for the two subpopulations and the black line shows average values for the population as a whole. (b) The dotted lines show the best linear fit for the two subpopulations and the black line shows the average ratio of the number of roamer mating events to the number of stayer mating events. (c) The dotted lines show the constant approximations as the average values over all phenotype proportions. (d) The dotted lines show the estimated fitness values using Equations 6 and 8.
27. futureoflife.org/ai-principles
Research Issues
1) Research Goal: The goal of AI research should be to create not undirected intelligence, but beneficial
intelligence.
2) Research Funding: Investments in AI should be accompanied by funding for research on ensuring its
beneficial use, including thorny questions in computer science, economics, law, ethics, and social studies,
such as:
How can we make future AI systems highly robust, so that they do what we want without malfunctioning or
getting hacked?
How can we grow our prosperity through automation while maintaining people’s resources and purpose?
How can we update our legal systems to be more fair and efficient, to keep pace with AI, and to manage
the risks associated with AI?
What set of values should AI be aligned with, and what legal and ethical status should it have?
3) Science-Policy Link: There should be constructive and healthy exchange between AI researchers and
policy-makers.
4) Research Culture: A culture of cooperation, trust, and transparency should be fostered among
researchers and developers of AI.
5) Race Avoidance: Teams developing AI systems should actively cooperate to avoid corner-cutting on
safety standards.
Ethics and Values
6) Safety: AI systems should be safe and secure throughout their operational lifetime, and verifiably so
where applicable and feasible.
7) Failure Transparency: If an AI system causes harm, it should be possible to ascertain why.
8) Judicial Transparency: Any involvement by an autonomous system in judicial decision-making should
provide a satisfactory explanation auditable by a competent human authority.
9) Responsibility: Designers and builders of advanced AI systems are stakeholders in the moral implications
of their use, misuse, and actions, with a responsibility and opportunity to shape those implications.
10) Value Alignment: Highly autonomous AI systems should be designed so that their goals and behaviors
can be assured to align with human values throughout their operation.
11) Human Values: AI systems should be designed and operated so as to be compatible with ideals of
human dignity, rights, freedoms, and cultural diversity.
12) Personal Privacy: People should have the right to access, manage and control the data they generate,
given AI systems’ power to analyze and utilize that data.
13) Liberty and Privacy: The application of AI to personal data must not unreasonably curtail people’s real or
perceived liberty.
14) Shared Benefit: AI technologies should benefit and empower as many people as possible.
15) Shared Prosperity: The economic prosperity created by AI should be shared broadly, to benefit all of
humanity.
16) Human Control: Humans should choose how and whether to delegate decisions to AI systems, to
accomplish human-chosen objectives.
17) Non-subversion: The power conferred by control of highly advanced AI systems should respect and
improve, rather than subvert, the social and civic processes on which the health of society depends.
18) AI Arms Race: An arms race in lethal autonomous weapons should be avoided.
Longer-term Issues
19) Capability Caution: There being no consensus, we should avoid strong assumptions regarding upper
limits on future AI capabilities.
20) Importance: Advanced AI could represent a profound change in the history of life on Earth, and should
be planned for and managed with commensurate care and resources.
21) Risks: Risks posed by AI systems, especially catastrophic or existential risks, must be subject to planning
and mitigation efforts commensurate with their expected impact.
22) Recursive Self-Improvement: AI systems designed to recursively self-improve or self-replicate in a
manner that could lead to rapidly increasing quality or quantity must be subject to strict safety and control
measures.
23) Common Good: Superintelligence should only be developed in the service of widely shared ethical ideals,
and for the benefit of all humanity rather than one state or organization.
30. What More Should Artificial Intelligence Learn from the Brain?
Energy efficiency
Data efficiency
l World models and mental simulation
l Modular self-organization
l Meta-learning
Autonomy and sociality
(Doya & Matsuo, Brain and Nerve, 2019; groups.oist.jp/ncu/publications)
[Excerpt from Miyazaki et al. 2018, Nature Communications] ... respectively). The remaining differences were not significant (Supplementary Table 2). Subsequently, we tested the variability of the waiting-time ratio among mice. We compared the obtained mixed model with a model including a fixed effect of reward-delay condition, but not a random effect of MI. To evaluate likelihood ...
Bayesian decision model of waiting. Can these effects of serotonin on waiting, depending on the RP and timing uncertainty, be explained in a coherent way? Here we consider the possibility that serotonin signals the prior probability of reward delivery in a Bayesian model of repeated decisions to wait or to quit. ...
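A sketch of how such a Bayesian wait-or-quit computation might look (my own construction with assumed numbers, not the paper's fitted model): the posterior probability that the current trial will be rewarded is updated as waiting time passes without reward, and a higher prior (the hypothesized serotonin signal) keeps that posterior, and hence waiting, up for longer.

```python
import numpy as np
from scipy.stats import norm

prior_reward = 0.75   # assumed prior P(trial is rewarded): the hypothesized serotonin signal
delays = np.array([2.0, 6.0, 10.0])   # possible reward delays (s), as in the D2-6-10 test
delay_sigma = 1.0                     # assumed timing noise (s)

def posterior_reward(t_waited):
    """P(rewarded | no reward seen by t), by Bayes' rule."""
    # If rewarded, the reward arrives at one of the delays with Gaussian timing
    # noise; probability that it has not yet arrived by time t:
    p_not_yet = np.mean(1 - norm.cdf(t_waited, loc=delays, scale=delay_sigma))
    num = prior_reward * p_not_yet
    return num / (num + (1 - prior_reward))  # omission trials never deliver reward

for t in [0, 4, 8, 12]:
    print(f"t = {t:2d} s: P(reward still coming) = {posterior_reward(t):.3f}")
```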
[Histograms: number of times vs. waiting time (2-40 s) during omission trials in the Delay 6, Delay 4-6-8, Delay 2-6-10, and Delay 10 tests (panels a-d), comparing serotonin activation vs. no activation]
Fig. 5 Optogenetic activation of DRN serotonin neurons enhances waiting for temporally uncertain rewards. a Distribution of waiting time during omission trials in the D6 test. b Distribution of waiting time during omission trials in the D4-6-8 test. c Distribution of waiting time during omission trials in the D2-6-10 test. d Distribution of waiting time during omission trials in the D10 test. Orange circles illustrate the timing and number of food pellets presented in rewarded trials. White circles denote omission trials.
(Nature Communications, DOI: 10.1038/s41467-018-04496-y)
[Figure: decoding of goal distance from layer 2 PPC and PM neurons (PPC: N = 43) under Continuous, Intermittent 1, and Intermittent 2 conditions; panels show normalized mean ΔR/R, normalized likelihood, posterior, and estimated distance (cm) vs. goal distance (cm), with the sound zone marked]
31. Acknowledgments
n Reinforcement learning robots
l Paavo Parmas
l Jiexin Wang (ATR)
l 吉⽥尚⼈ (U Tokyo)
l Stefan Elfwing (ATR)
l 内部英治 (ATR)
l 森本淳 (ATR)
n Action values / striatum
l 伊藤真 (Progress Technologies)
l 吉澤知彦 (Tamagawa Univ)
l Charles Gerfen (NIH)
l 鮫島和⾏ (Tamagawa Univ)
l ⽊村實 (Tamagawa Univ)
l 上⽥康雅 (Kansai Medical Univ)
n Two-photon imaging
l 船⽔章⼤ (CSHL)
l Bernd Kuhn
l Sergey Zobnin
l Yuzhe Li
n Serotonin
l 宮崎勝彦
l 宮崎佳代⼦ (JSPS)
l 濱⽥太陽
l ⽥中謙⼆ (Keio Univ)
l ⼭中章宏 (Nagoya Univ)
n fMRI
l Alan Fermin (Tamagawa Univ)
l 吉⽥岳彦
l ⽥中沙織 (ATR)
l Nicolas Schweighofer (USC)
l 吉本潤⼀郎 (NAIST)
l 清⽔優
l 徳⽥智磯 (NAIST)
l ⼭脇成⼈ (Hiroshima Univ)
l 川⼈光男 (ATR)
n Basal ganglia - thalamus - cortex model
l 庄野修 (HRI)
l 五⼗嵐潤 (RIKEN)
l Jan Moren
l Jean Leinard
l Carlos Gutierrez
l Benoit Girard, Daphne Heraiz (UPMC)
l 橘吉寿 (Kobe Univ)
l 南部篤 (NIPS)
l ⾼⽊周 (U Tokyo)
n Grant-in-Aid for Scientific Research on Innovative Areas "Prediction and Decision Making"
n Grant-in-Aid for Scientific Research on Innovative Areas "Artificial Intelligence and Brain Science"
n Post-K Exploratory Challenge "Whole-Brain Simulation and Artificial Intelligence"
n Project for Elucidating Whole Brain Functional Networks by Innovative Technologies
n Strategic Research Program for Brain Sciences
32. International Symposium on AI and Brain Science
n October 10–12, 2020, Online www.brain-ai.jp/symposium2020
33. Special Issue on Artificial Intelligence and Brain Science
www.journals.elsevier.com/neural-networks/
n Timeline
l Manuscript submission due: January 10, 2021
l First review completed: March 31, 2021
l Revised manuscript due: May 31, 2021
l Final decisions to authors: June 30, 2021
n Guest Editors:
l Kenji Doya, Okinawa Institute of Science and Technology (ncus@oist.jp)
l Karl Friston, University College London
l Masashi Sugiyama, RIKEN AIP and The University of Tokyo IRCN
l Josh Tenenbaum, Massachusetts Institute of Technology