3. Today's focus: question answering
‣ Goal: build a system that understands the meaning of a question and retrieves its answer from a database
[Figure, from Liang et al.'11, "The Big Picture": the question "What is the most populous city in California?" is sent to a database system, which returns "Los Angeles".]
[Figure: "Expensive: logical forms" [Zelle & Mooney, 1996; Zettlemoyer & Collins, 2005; Wong & Mooney, 2007; Kwiatkowski et al., 2010]]
What is the most populous city in California?
⇒ argmax(λx.city(x) ∧ loc(x, CA), λx.pop(x))
How many states border Oregon?
⇒ count(λx.state(x) ∧ border(x, OR))
"They allow us to temporarily sidestep intractable philosophical questions on how to represent meaning in general." (Liang et al.'13)
4. Today's scope
‣ A discussion of meaning representations in the field of question answering
  - Learning will probably not be covered in much detail
  - Mainly an introduction to two meaning representations and the recent progress on each
CCG
[Figure: CCG derivation of "flights from Boston"]
flights ⊢ N : λx.flight(x)
from ⊢ (N\N)/NP : λy.λf.λx.f(x) ∧ from(x, y)
Boston ⊢ NP : bos
from Boston ⇒ N\N : λf.λx.f(x) ∧ from(x, bos)   (>)
flights from Boston ⇒ N : λx.flight(x) ∧ from(x, bos)   (<)
Excerpt shown with the figure: "Given analyses of this form, we introduce new templates that will allow us to recover from missing words, for example if "from" was dropped. ..."
DCS
[Figure: DCS tree for "most populous city in California", with nodes city, loc, CA, population, and argmax]
5. Problem setting: learning meaning representations
‣ Convert natural language into a meaning representation that a computer can understand
  - Logical forms (a programming language)
  - Other representations, e.g. DCS trees
‣ Pipeline: sentence → meaning representation → answer
  - Sentence → meaning representation: ambiguous, hard!
  - Meaning representation → answer: deterministic, easy!
‣ Given a sentence, it suffices to find the correct meaning representation or the answer
  - This is supervised learning
[Figure: "The Big Picture". "What is the most populous city in California?" → argmax(λx.city(x) ∧ loc(x, CA), λx.pop(x)) → Database System → "Los Angeles"; likewise "How many states border Oregon?" → count(λx.state(x) ∧ border(x, OR)). The alternative representation shown is the DCS tree for "most populous city in California".]
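6. How to provide the training data
‣ Pipeline: sentence → meaning representation → answer (the first step is ambiguous and hard; the second is deterministic and easy)
‣ (Sentence, meaning representation) pairs
  - Annotation is expensive
  - Learning is easier
  - Example: "How many states border Oregon?" paired with count(λx.state(x) ∧ border(x, OR))
‣ (Sentence, answer) pairs
  - Non-experts can annotate
  - Learning is harder
  - Example: "How many states border Oregon?" paired with 3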
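7. Two main families of meaning representation
(Tree-grammar-based approaches also exist but are omitted here.)
‣ CCG + logical forms: learned from (sentence, logical form) pairs
  - Zettlemoyer & Collins'05
  - Zettlemoyer & Collins'07
  - Kwiatkowski et al.'10
  - Kwiatkowski et al.'11
  - ...
  - Kwiatkowski et al.'13 (CCG-based, but learned from answers)
‣ DCS: learned from (sentence, answer) pairs
  - Liang et al.'11
  - Berant et al.'13
  - Berant and Liang'14
‣ Beyond QA
  - Artzi & Zettlemoyer'11
  - Artzi & Zettlemoyer'13
  - Matuszek et al.'12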
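9. Checking the problem setting
‣ We want to build a classifier that converts a natural-language sentence into a logical form
  - Is this similar to machine translation? (Some methods do take that view.)
  - The difference is that logical forms have structure
  - We want to obtain the formula by composing functions
  - We want a framework in which the logical form can be computed along the structure of the sentence
  - CCG is the tool used for this
Target: "How many states border Oregon?" ⇒ count(λx.state(x) ∧ border(x, OR))
Composition along the sentence:
How many : λf.λg.count(λx.f(x) ∧ g(x))
states : λx.state(x)
How many states : λg.count(λx.state(x) ∧ g(x))
border Oregon : λx.border(x, OR)
How many states border Oregon? : count(λx.state(x) ∧ border(x, OR))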
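The function composition on this slide can be tried out directly with Python lambdas. The following is only a minimal sketch: the entity set and the border facts are a deliberately tiny, incomplete toy table (not real geography), and the names `ENTITIES`, `STATES`, and `BORDERS` are made up for the illustration.

```python
# Toy database (hypothetical and deliberately incomplete; illustration only).
STATES = {"washington", "idaho", "california", "oregon", "texas"}
BORDERS = {("washington", "oregon"), ("idaho", "oregon"), ("california", "oregon")}
ENTITIES = STATES | {"seattle"}

def count(pred):
    """count(λx.pred(x)): number of entities for which pred holds."""
    return sum(1 for e in ENTITIES if pred(e))

# Word-level meanings as written on the slide.
how_many = lambda f: lambda g: count(lambda x: f(x) and g(x))
states = lambda x: x in STATES
border_oregon = lambda x: (x, "oregon") in BORDERS

# Compose along the sentence: (How many states) (border Oregon)
how_many_states = how_many(states)        # λg.count(λx.state(x) ∧ g(x))
answer = how_many_states(border_oregon)   # count(λx.state(x) ∧ border(x, OR))
print(answer)
```

Each intermediate value corresponds to one node of the derivation, which is why a grammar that composes meanings along the tree is attractive here.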
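10. Combinatory Categorial Grammar
‣ CCG = combinatory rules + categorial grammar
‣ It is one of several grammatical theories that describe sentence structure:
  - Dependency grammar: in "John loves Mary", loves has John as sbj and Mary as obj
  - Context-free grammar (CFG): "John loves Mary" is an S made of an NP and a VP
  - Categorial grammar: John ⊢ NP, loves ⊢ (S\NP)/NP, Mary ⊢ NP; "loves Mary" ⇒ S\NP; "John loves Mary" ⇒ S
‣ It looks similar to CFG, but ...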
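11. Categorial Grammar (CG)
• Atomic categories: S, NP, N
• Complex categories
  – Built with "/" and "\"
  – X/Y: combines with a Y on its right to form an X
  – X\Y: combines with a Y on its left to form an X
• Examples
  – S\NP (e.g. an intransitive verb)
  – (S\NP)/NP (e.g. a transitive verb)
John ⊢ NP, walked ⊢ S\NP; "John walked" ⇒ S
From Yusuke Miyao (2012), 自然言語処理における構文解析と言語理論の関係 ("The relationship between syntactic parsing and linguistic theory in natural language processing")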
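12. Combination rules
‣ There is only a small number of combination rules
  - Forward application (>): X/Y  Y ⇒ X
  - Backward application (<): Y  X\Y ⇒ X
‣ X and Y can be any categories
‣ These few rules are all that the grammar itself specifies
Example: John ⊢ NP, loves ⊢ (S\NP)/NP, Mary ⊢ NP; "loves Mary" ⇒ S\NP; "John loves Mary" ⇒ S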
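The two application rules are small enough to write down as code. The sketch below is one possible encoding, not the representation used by any particular CCG parser: the classes `Atom` and `Func` and the function names are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Atom:
    name: str                      # atomic category such as S or NP

@dataclass(frozen=True)
class Func:
    result: "Cat"                  # X in X/Y or X\Y
    slash: str                     # "/" takes its argument on the right, "\\" on the left
    arg: "Cat"                     # Y in X/Y or X\Y

Cat = Union[Atom, Func]

def forward(left: Cat, right: Cat) -> Optional[Cat]:
    """Forward application (>):  X/Y  Y  =>  X"""
    if isinstance(left, Func) and left.slash == "/" and left.arg == right:
        return left.result
    return None

def backward(left: Cat, right: Cat) -> Optional[Cat]:
    """Backward application (<):  Y  X\\Y  =>  X"""
    if isinstance(right, Func) and right.slash == "\\" and right.arg == left:
        return right.result
    return None

S, NP = Atom("S"), Atom("NP")
VP = Func(S, "\\", NP)             # S\NP
TV = Func(VP, "/", NP)             # (S\NP)/NP

loves_mary = forward(TV, NP)           # loves + Mary
sentence = backward(NP, loves_mary)    # John + (loves Mary)
print(loves_mary == VP, sentence == S)
```

Because X and Y are arbitrary categories, the same two functions cover every lexical category; all language-specific knowledge lives in the lexicon.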
15. Combination rules
‣ CCG derivations are often written in the form of a proof:
John    loves        Mary
NP      (S\NP)/NP    NP
        ----------------- >
              S\NP
------------------------- <
            S
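16. Computing the meaning representation
‣ Advantage of using CCG: the meaning can be computed along the tree structure
  - Each word is given a category together with a semantic representation in the form of a lambda term
  - Each rule also specifies how the logical forms are composed
‣ Forward application (>): X/Y : f   Y : g ⇒ X : f(g)
‣ Backward application (<): Y : g   X\Y : f ⇒ X : f(g)
‣ Lexicon:
  - John ⊢ NP : john
  - loves ⊢ (S\NP)/NP : λx.λy.love(y, x)
  - Mary ⊢ NP : mary
‣ Derivation:
  - loves Mary ⇒ S\NP : λy.love(y, mary)   (>)
  - John loves Mary ⇒ S : love(john, mary)   (<)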
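The semantic side of the two rules is plain function application, which a few lines of Python can mimic. This is a minimal sketch: logical forms are represented as nested tuples, an encoding chosen only for this example, and the syntactic categories appear only in comments.

```python
# Lexicon from the slide (categories in comments, semantics as Python values).
john = "john"                                  # NP : john
mary = "mary"                                  # NP : mary
loves = lambda x: lambda y: ("love", y, x)     # (S\NP)/NP : λx.λy.love(y, x)

def forward_apply(f, g):
    """X/Y : f   Y : g   =>   X : f(g)"""
    return f(g)

def backward_apply(g, f):
    """Y : g   X\\Y : f   =>   X : f(g)"""
    return f(g)

loves_mary = forward_apply(loves, mary)        # S\NP : λy.love(y, mary)
sentence = backward_apply(john, loves_mary)    # S : love(john, mary)
print(sentence)
```

Both rules apply the functor's semantics f to the argument's semantics g; they differ only in which side the argument is found on.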
17. CCG-based: Overview
Zettlemoyer & Collins'05,'09; Kwiatkowski et al.'10,'11
Figure 1: Examples of sentences with their logical forms.
a) What states border Texas
   λx.state(x) ∧ borders(x, texas)
b) What is the largest state
   arg max(λx.state(x), λx.size(x))
c) What states border the state that borders the most states
   λx.state(x) ∧ borders(x, arg max(λy.state(y), λy.count(λz.state(z) ∧ borders(y, z))))
Excerpt: "Additional quantifiers: The expressions involve the additional quantifying terms count, arg max, arg min, and the definite operator ι. An example of a count expression is count(λx.state(x)), which returns the number of entities for which state(x) is true. arg max expressions are of the form arg max(λx.state(x), λx.size(x)). The first argument is a lambda expression denoting some set of entities; the second argument is a function of type ⟨e, r⟩."
‣ Training data: a set of (sentence, logical form) pairs, used for machine learning
‣ Test (evaluation): "How many states border Oregon?" → ???
‣ What is known in advance:
  - The CCG combination rules, e.g.
    Y : g   X\Y : f ⇒ X : f(g)
    X/Y : f   Y/Z : g ⇒ X/Z : λx.f(g(x))
    ...
  - A loose set of candidate categories for each word
‣ The correct tree structure is not given
  - The model parameters are learned from the sentence's logical form alone
18. The correct tree structure is not given
‣ A kind of distant supervision
  - No need to annotate tree structures
  - Harder than ordinary syntactic parsing
  - Related to grammar acquisition?
[Figure: CCG parse of "What states border Texas" (part b of a figure captioned "Two examples of CCG parses")]
What ⊢ (S/(S\NP))/N : λf.λg.λx.f(x) ∧ g(x)
states ⊢ N : λx.state(x)
border ⊢ (S\NP)/NP : λx.λy.borders(y, x)
Texas ⊢ NP : texas
What states ⇒ S/(S\NP) : λg.λx.state(x) ∧ g(x)   (>)
border Texas ⇒ S\NP : λy.borders(y, texas)   (>)
What states border Texas ⇒ S : λx.state(x) ∧ borders(x, texas)   (>)
(2) The functional application rules (with semantics):
a. A/B : f   B : g ⇒ A : f(g)
b. B : g   A\B : f ⇒ A : f(g)
‣ Objective function: [formula not preserved in this transcript]
‣ Learning: Latent Variable Structured Perceptron
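To make the idea of a latent-variable structured perceptron concrete, here is a generic sketch of a single update step. It is not the exact algorithm of any of the cited papers: the candidate derivations are enumerated by hand, and the feature names and the function `perceptron_update` are assumptions made for the illustration. Only the logical form is observed; the derivation that produces it is latent.

```python
from collections import Counter

def score(weights, feats):
    """Linear score w·φ for one derivation."""
    return sum(weights[f] * v for f, v in feats.items())

def perceptron_update(weights, derivations, gold_lf):
    """One latent-variable perceptron step.

    derivations: list of (features: Counter, logical_form: str).
    Returns True if the weights were changed.
    """
    best = max(derivations, key=lambda d: score(weights, d[0]))
    if best[1] == gold_lf:
        return False                      # prediction already correct
    good = [d for d in derivations if d[1] == gold_lf]
    if not good:
        return False                      # gold form unreachable with these candidates
    best_good = max(good, key=lambda d: score(weights, d[0]))
    weights.update(best_good[0])          # promote the best correct derivation
    weights.subtract(best[0])             # demote the wrong prediction
    return True

# Hypothetical candidate derivations for "What states border Texas".
gold = "λx.state(x) ∧ borders(x, texas)"
derivs = [
    (Counter({"states:=N:river": 1, "border:=(S\\NP)/NP:borders": 1}),
     "λx.river(x) ∧ borders(x, texas)"),
    (Counter({"states:=N:state": 1, "border:=(S\\NP)/NP:borders": 1}),
     gold),
]
w = Counter()
changed_first = perceptron_update(w, derivs, gold)
changed_second = perceptron_update(w, derivs, gold)
print(changed_first, changed_second, dict(w))
```

The key difference from an ordinary structured perceptron is the `best_good` step: because no gold tree exists, the highest-scoring derivation that yields the gold logical form stands in for it.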
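19. Relation to grammar acquisition (aside)
‣ Unsupervised parsing: the problem of estimating a model from completely raw sentences
  - Klein & Manning'04, Smith & Eisner'06, Headden III et al.'09, Mareček & Žabokrtský'11, ...
  - Example: the raw sentence "you have another cookie" is mapped to a parse by unsupervised learning
‣ Two goals:
  - Scientific: clarify how babies acquire language
  - Engineering: help analyze languages that have no annotated training data
  - However, babies acquire grammar using many signals other than language (for the scientific goal, the setting is not very realistic)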
20. Relation to grammar acquisition (aside)
‣ The setting in this talk
  - Infer the structure of the sentence (a latent variable) from (sentence, logical form) pairs
  - From the viewpoint of grammar acquisition, this is arguably more realistic than learning from raw sentences alone
‣ A more realistic task: Kwiatkowski et al.'12
  - Learning when several candidate meanings are given for each sentence
  - Which candidate is correct is unknown
Excerpt from Kwiatkowski et al.'12: "... of propositional uncertainty, from a set of contextually afforded meaning candidates, as here:
Utterance: you have another cookie
Candidate meanings:
  have(you, another(x, cookie(x)))
  eat(you, your(x, cake(x)))
  want(i, another(x, cookie(x)))
The task is then to learn, from a sequence of such (utterance, meaning-candidates) pairs, the correct ..."
21. How is it learned?
‣ The model is a log-linear model over tree structures
  - It mainly learns which category each word should be associated with
‣ Plausible word-level categories are extracted from the logical form, e.g. for "What states border Texas" with λx.state(x) ∧ borders(x, texas):
  - S/(S\NP)/(S\NP) : λg.λf.λx.g(x) ∧ f(x)
  - S\NP : λx.state(x)
  - S\NP : λx.borders(x, texas)
  - S/(S\NP) : λf.λx.state(x) ∧ f(x)
  - S/S : λx.x
  - ...
‣ Each (word, category) pair receives a weight, e.g.
  - What - S/S : λx.x                 42
  - What - S\NP : λx.state(x)        -30
  - states - S\NP : λx.state(x)       63
Excerpt: "Rule 2(a) now specifies how the semantics of the category A is compositionally built out of the semantics for A/B and B. Our derivations are then extended to include a compositional semantics. See Figure 2(a) for an example parse. This parse shows that Utah borders Idaho has the syntactic type S and the semantics borders(utah, idaho)."
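A log-linear model over derivations can be sketched in a few lines. The three weights below are the ones shown on the slide; everything else (the two candidate derivations `d1` and `d2`, and the assumption that the only features are counts of lexical entries) is hypothetical and chosen to keep the example small.

```python
import math

# Lexical-entry weights read off the slide; keys are (word, "category : semantics").
WEIGHTS = {
    ("What", "S/S : λx.x"): 42.0,
    ("What", "S\\NP : λx.state(x)"): -30.0,
    ("states", "S\\NP : λx.state(x)"): 63.0,
}

def derivation_score(entries):
    """w·φ(d) when φ simply counts the lexical entries used by derivation d."""
    return sum(WEIGHTS.get(e, 0.0) for e in entries)

def log_linear(derivations):
    """p(d) ∝ exp(w·φ(d)), computed with a max-shift for numerical stability."""
    scores = {name: derivation_score(es) for name, es in derivations.items()}
    m = max(scores.values())
    exps = {name: math.exp(s - m) for name, s in scores.items()}
    z = sum(exps.values())
    return {name: e / z for name, e in exps.items()}

# Two hypothetical derivations that differ in their lexical choices.
derivations = {
    "d1": [("What", "S/S : λx.x"), ("states", "S\\NP : λx.state(x)")],
    "d2": [("What", "S\\NP : λx.state(x)")],
}
probs = log_linear(derivations)
print(probs)
```

With weights this far apart, nearly all probability mass goes to the derivation that uses the high-weight entries, which is how learned lexical weights steer the parser.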
22. Evolution of the methods
‣ Zettlemoyer & Collins'05
  - First to learn a CCG from pairs of sentences and logical forms
  - The categories of some function words are fixed, e.g. every ⊢ (S/(S\NP))/N : λf.λg.∀x.f(x) → g(x)
‣ Kwiatkowski et al.'10
  - Learns the categories of all words (so languages other than English can also be learned)
  - Uses IBM Model 1 first to obtain a good initialization
‣ Kwiatkowski et al.'11
  - Factors the category parameters to reduce sparsity
[Figure, from Artzi et al.'13: "Parameter Initialization". Compute co-occurrence (IBM Model 1) between words and logical constants; the initial score for a new lexical entry is the average over pairwise weights. Example: I want a flight to Boston ⊢ S : λx.flight(x) ∧ to(x, BOS)]
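The initialization idea (IBM Model 1 between words and logical constants, then averaging pairwise weights) can be sketched as follows. This is a textbook Model 1 EM loop without a NULL word, run on three made-up training pairs; the function names `ibm_model1` and `initial_score` and the toy data are assumptions, not taken from the cited systems.

```python
from collections import defaultdict

def ibm_model1(pairs, iterations=10):
    """pairs: list of (words, constants). Returns t[(constant, word)] ≈ p(constant | word)."""
    words = {w for ws, _ in pairs for w in ws}
    consts = {c for _, cs in pairs for c in cs}
    t = {(c, w): 1.0 / len(consts) for c in consts for w in words}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for ws, cs in pairs:                       # E-step: fractional alignments
            for c in cs:
                norm = sum(t[(c, w)] for w in ws)
                for w in ws:
                    frac = t[(c, w)] / norm
                    count[(c, w)] += frac
                    total[w] += frac
        for (c, w) in t:                           # M-step: renormalize per word
            t[(c, w)] = count[(c, w)] / total[w] if total[w] else 0.0
    return t

def initial_score(words, constants, t):
    """Initial score of a lexical entry: average over pairwise weights."""
    pairs = [(c, w) for c in constants for w in words]
    return sum(t[p] for p in pairs) / len(pairs)

# Hypothetical (sentence words, logical constants) pairs.
data = [
    (["flight", "to", "boston"], ["flight", "to", "BOS"]),
    (["flight", "to", "denver"], ["flight", "to", "DEN"]),
    (["boston"], ["BOS"]),
]
t = ibm_model1(data)
print(round(initial_score(["boston"], ["BOS"], t), 3))
```

Entries that pair a word with a constant it frequently co-occurs with start with a higher score, which gives the later structured learning a useful head start.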
23. Two main families of meaning representation
(Tree-grammar-based approaches also exist but are omitted here.)
‣ CCG + logical forms, learned from (sentence, logical form) pairs: Zettlemoyer & Collins'05, '07; Kwiatkowski et al.'10, '11; ...
‣ DCS, learned from (sentence, answer) pairs: Liang et al.'11; Berant et al.'13; Berant and Liang'14
‣ CCG-based but learned from answers: Kwiatkowski et al.'13
‣ Beyond QA: Artzi & Zettlemoyer'11, '13; Matuszek et al.'12
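24. Learning from pairs of sentences and answers
‣ So far, the CCG derivation was modeled as the latent variable
‣ In DCS, the logical representation itself is treated as the latent variable
[Figure: "Graphical Model". Utterance x ("capital of California?"), parameters θ, latent DCS tree z (capital, CA), database w, answer y ("Sacramento")]
Semantic parsing: p(z | x, θ) (probabilistic)
Interpretation: p(y | z, w) (deterministic)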
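The two factors named on this slide combine into a likelihood of the answer in which the tree z is summed out. As a sketch, with the denotation brackets ⟦z⟧_w as notation assumed here for "the result of executing z on database w":

```latex
p(y \mid x; \theta, w) \;=\; \sum_{z} p(y \mid z, w)\, p(z \mid x, \theta),
\qquad
p(y \mid z, w) \;=\; \mathbb{1}\!\left[\, [\![ z ]\!]_w = y \,\right]
```

Because the interpretation step is deterministic, the sum keeps only those candidate trees whose execution yields the observed answer, so learning amounts to raising the probability of such trees.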
25. DCS: Dependency-based Compositional Semantics
Example: "city in California"
[Figure: "Basic DCS Trees"]
DCS tree                 Constraints
city                     c ∈ city
 | 1-1                   c1 = ℓ1
loc                      ℓ ∈ loc
 | 2-1                   ℓ2 = s1
CA                       s ∈ CA
Database:
city: San Francisco, Chicago, Boston, ...
loc: (Mount Shasta, California), (San Francisco, California), (Boston, Massachusetts), ...
CA: California
‣ A subtree denotes a set
  - e.g. the loc subtree denotes the elements of loc whose second column is California
26. DCS: Dependency-based Compositional Semantics
Example: "city in California" (same DCS tree, constraints, and database as on the previous slide)
(b) Lambda calculus formula (excerpt from "Learning Dependency-Based Compositional Semantics"; the extra major(m) conjunct suggests it was drawn for "major city in California"):
λc ∃m ∃ℓ ∃s . city(c) ∧ major(m) ∧ loc(ℓ) ∧ CA(s) ∧ c1 = m1 ∧ c1 = ℓ1 ∧ ℓ2 = s1
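Reading a basic DCS tree as a set of join constraints over database tables can be sketched in a few lines. The tables below contain only the rows visible on the slide; the tuple encoding of trees and the function `denotation` are assumptions introduced for this example and cover only the basic (join-only) fragment.

```python
# Rows visible on the slide; each predicate is a set of tuples.
DB = {
    "city": {("San Francisco",), ("Chicago",), ("Boston",)},
    "loc": {("Mount Shasta", "California"),
            ("San Francisco", "California"),
            ("Boston", "Massachusetts")},
    "CA": {("California",)},
}

def denotation(tree):
    """tree = (predicate, [(parent_col, child_col, subtree), ...]), columns 1-based.

    Returns the tuples of `predicate` for which every edge constraint
    parent[parent_col] == child[child_col] holds for some tuple of the subtree.
    """
    pred, edges = tree
    result = set(DB[pred])
    for j, k, child in edges:
        child_vals = {t[k - 1] for t in denotation(child)}
        result = {t for t in result if t[j - 1] in child_vals}
    return result

# city --(1,1)--> loc --(2,1)--> CA, i.e. "city in California"
loc_subtree = ("loc", [(2, 1, ("CA", []))])
tree = ("city", [(1, 1, loc_subtree)])
print(denotation(loc_subtree))
print(denotation(tree))
```

The intermediate result for `loc_subtree` is exactly the set described on the slide: the rows of loc whose second column is California.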
27. Characteristics of DCS
‣ With logical forms, there is a large gap between natural language and the meaning representation
‣ DCS is quite close to the dependency structure of the sentence
[Figure: "Challenges". Computational: how to efficiently search exponential space? "What is the most populous city in California?" → argmax(λx.city(x) ∧ loc(x, CA), λx.population(x)) → "Los Angeles"]
[Figure: Dependency-Based Compositional Semantics (DCS). The DCS tree for "most populous city in California" (city, loc, CA, population, argmax) shown next to the dependency tree over most, populous, city, in, California]
29. Mark-Execute
[Figure: "Divergence between Syntactic and Semantic Scope". Syntax and semantics of "most populous city in California", with the semantics argmax(λx.city(x) ∧ loc(x, CA), λx.population(x))]
[Figure: "Solution: Mark-Execute" (superlatives). Mark at syntactic scope, execute at semantic scope, shown on the DCS tree for "most populous city in California" (city, loc, CA, population, argmax)]
30. Universal quantification and scope ambiguity
[Figure: "Solution: Mark-Execute" for "Some river traverses every city." Mark at syntactic scope, execute at semantic scope; the DCS tree has traverse at the root, with river (some) and city (every) below it]
‣ Quantification (narrow), surface scope: ∃x.(river(x) ∧ ∀y.(city(y) → traverse(x, y)))
‣ Quantification (wide), inverse scope: ∀y.(city(y) → ∃x.(river(x) ∧ traverse(x, y)))
‣ This is reportedly similar to the shift/reset operations on continuations
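The two readings can come apart on the same data, which is easy to check on a small model. The rivers, cities, and traverse facts below are invented for the illustration; only the two formulas come from the slide.

```python
# Hypothetical model: two rivers, two cities, each river crossing one city.
rivers = {"r1", "r2"}
cities = {"c1", "c2"}
traverse = {("r1", "c1"), ("r2", "c2")}

def surface_scope():
    """∃x.(river(x) ∧ ∀y.(city(y) → traverse(x, y)))"""
    return any(all((r, c) in traverse for c in cities) for r in rivers)

def inverse_scope():
    """∀y.(city(y) → ∃x.(river(x) ∧ traverse(x, y)))"""
    return all(any((r, c) in traverse for r in rivers) for c in cities)

print(surface_scope(), inverse_scope())
```

In this model no single river crosses every city, so the surface-scope reading is false, while every city is crossed by some river, so the inverse-scope reading is true.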
31. How is it learned?
‣ Basically the same as in the CCG case
  - Nothing is assumed about the structure of the DCS tree
  - The dependency structure of the sentence is not used
‣ Function words and a few other words are given their correct predicates by hand
[Figure: "Words to Predicates (Lexical Semantics)" for "What is the most populous city in CA ?"]
Lexical triggers:
1. String match: CA ⇒ CA
2. Function words (20 words): most ⇒ argmax
3. Nouns/adjectives: city ⇒ city, state, river, population
[Figure: basic DCS tree for "city in CA" with its constraints and the resulting set of cities; computation by dynamic programming]
32. How is it learned?
‣ Basically the same as in the CCG case
  - Exhaustive search with dynamic programming is not possible
  - The k-best trees are extracted with beam search, and the parameters are updated with SGD
[Figure: "Predicates to DCS Trees (Compositional Semantics)". C_{i,j} = set of DCS trees for span [i, j]; for "most populous city in California", trees in C_{i,k} and C_{k,j} are combined to build C_{i,j}]
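The span-based construction with a beam can be sketched as a small chart parser. This is a heavily simplified illustration, not the procedure of the cited system: trees are nested (head, child) pairs without edge labels, the scoring function and its weights are invented, and the SGD update is left out entirely.

```python
import heapq

def root(tree):
    """Root predicate of a tree; a tree is a predicate name or a (head, child) pair."""
    return tree if isinstance(tree, str) else root(tree[0])

def beam_parse(candidates, edge_score, beam=2):
    """candidates[i]: predicates triggered by word i.
    Returns chart[(i, j)]: at most `beam` (score, tree) items for span [i, j)."""
    n = len(candidates)
    chart = {(i, i + 1): [(0.0, p) for p in candidates[i]][:beam] for i in range(n)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            items = []
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                items += left + right              # a span may skip words
                for sl, tl in left:
                    for sr, tr in right:
                        # Either side may become the head of the combined tree.
                        for head, child in ((tl, tr), (tr, tl)):
                            s = sl + sr + edge_score(root(head), root(child))
                            items.append((s, (head, child)))
            items = list(dict.fromkeys(items))     # drop duplicates, keep order
            chart[(i, j)] = heapq.nlargest(beam, items, key=lambda it: it[0])
    return chart

# Hypothetical edge weights: reward city->loc and loc->CA, penalize everything else.
EDGE_WEIGHTS = {("city", "loc"): 1.0, ("loc", "CA"): 1.0}

def edge_score(head, child):
    return EDGE_WEIGHTS.get((head, child), -1.0)

chart = beam_parse([["city"], ["loc"], ["CA"]], edge_score)
best_score, best_tree = chart[(0, 3)][0]
print(best_score, best_tree)
```

Keeping only the top `beam` items per span is what makes the search approximate: a tree pruned from a small span can never be recovered in a larger one.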
33. Experiments: GEO data
‣ Contains somewhat complex expressions (conjunctions, superlatives, negation, etc.)
‣ The vocabulary is small
  - Word types: 280
  - Predicates: 48
‣ Training data: a set of sentences paired with logical forms or answers (600 sentences)
Examples:
what states does the ohio river run through
(lambda $0 e (and (state:t $0) (loc:t ohio_river:r $0)))
what states surround kentucky
(lambda $0 e (and (state:t $0) (next_to:t $0 kentucky:s)))
what is the capital of states that have cities named durham
(lambda $0 e (and (capital:t $0) (exists $1 (and (state:t $1) (exists $2 (and (city:t $2) (named:t $2 durham:n) (loc:t $2 $1))) (loc:t $0 $1)))))
which is the highest peak not in alaska
(argmax $0 (and (mountain:t $0) (not (loc:t $0 alaska:s))) (elevation:i $0))
34. Comparison
[Figure: "Experiment 2". On Geo, 600 training examples, 280 test examples]
System   Description                                       Test accuracy
zc05     CCG [Zettlemoyer & Collins, 2005]                 79.3%
zc07     relaxed CCG [Zettlemoyer & Collins, 2007]         86.1%
kzgs10   CCG w/unification [Kwiatkowski et al., 2010]      88.9%
dcs      our system                                        88.6%
dcs+     our system                                        91.1%
[The entries of the table's "Lexicon" and "Logical forms" columns are not recoverable.]
36. Toward web-scale QA
Berant et al.'13, Kwiatkowski et al.'13, Berant and Liang'14
‣ So far the data has been relatively clean (and the vocabulary small)
‣ Can the system be scaled up on top of a web-scale database?
[Figure: example questions from two datasets]
• Generate questions based on Freebase facts, e.g. "What was the cover price of the X-men Issue 1?"
• WebQuestions [our work]: 5,810 examples, 4,525 w…; generate questions from Google ⇒ less formu…
  - What character did Natalie Portman play in Star Wars?
  - What kind of money to take to Bahamas?
  - What did Edward Jenner do for a living?
37. Freebase knowledge graph
Berant et al.'13
‣ 41M entities (nodes)
‣ 19K properties (edge labels)
‣ 596M assertions (edges)
‣ It can be queried with SPARQL
[Figure: a fragment of the Freebase knowledge graph around BarackObama, with nodes such as Person, Politician, 1961.08.04, Honolulu, Hawaii, UnitedStates, City, USState, Chicago, MichelleObama, Female, 1992.10.03, and event nodes (Event3, Event8, Event21), and edge labels Type, Profession, DateOfBirth, PlaceOfBirth, ContainedBy, PlacesLived, Location, Marriage, Spouse, Gender, StartDate]
38. What makes it hard?
‣ There are many predicates, and mismatches with the natural-language wording arise
  - Unlike GEO, it is not possible to enumerate all predicates during learning
‣ The predicates that should be used are domain dependent
[Figure, from Berant et al.'13: alignment for "What languages do people in Brazil use"; candidate predicates include Type.Country, Profession.Lawyer, PeopleBornHere, InventorOf, Type.HumanLanguage, Type.ProgrammingLanguage, and candidate entities include Brazil and BrazilFootballTeam]
Excerpt: "...tions (Chen and Mooney, 2011; Artzi and Zettlemoyer, 2013b), and generating programs (Kushman and Barzilay, 2013). In each case, the parser uses a predefined set of logical constants, or an ontology, to construct meaning representations. In practice, the choice of ontology significantly impacts learning. For example, consider the following questions (Q) and candidate meaning representations (MR):
Q1: What is the population of Seattle?
Q2: How many people live in Seattle?
MR1: λx.population(Seattle, x)
MR2: count(λx.person(x) ∧ live(x, Seattle))
A semantic parser might aim to construct MR1 for ..."
‣ Freebase accepts only this form (MR1)
39. The DCS-based approach
Berant et al.'13, Berant and Liang'14
‣ A DCS with restricted functionality (basic λ-DCS) is used
  - Mark-Execute and similar machinery have quietly disappeared
  - Perhaps because Freebase queries only ask for facts, so quantification and the like need not (and cannot) be expressed?
  - Choosing predicates has become harder, but deriving the structure has become easier?
[Figure 2: An example of a derivation d of the utterance "Where was Obama born?" and its sub-derivations, each labeled with composition rule (in blue) and logical form (in red). The derivation d skips the words "was" and "?".]
where ⇒ Type.Location   (lexicon)
Obama ⇒ BarackObama   (lexicon)
born ⇒ PeopleBornHere   (lexicon)
Obama born ⇒ PeopleBornHere.BarackObama   (join)
Where was Obama born? ⇒ Type.Location ⊓ PeopleBornHere.BarackObama   (intersection)
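The join and intersection operations in this derivation can be executed against a handful of triples. The sketch below uses a made-up miniature knowledge base; the triple layout (subject, property, object) and the helper names `entity`, `join`, and `intersect` are assumptions for the example, not an actual Freebase or SPARQL interface.

```python
# Hypothetical miniature knowledge base of (subject, property, object) triples.
TRIPLES = {
    ("Honolulu", "Type", "Location"),
    ("Chicago", "Type", "Location"),
    ("Honolulu", "PeopleBornHere", "BarackObama"),
    ("BarackObama", "Type", "Person"),
}

def entity(e):
    """A unary logical form denoting the singleton set {e}."""
    return {e}

def join(prop, values):
    """prop.values: all x such that (x, prop, v) holds for some v in values."""
    return {s for (s, p, o) in TRIPLES if p == prop and o in values}

def intersect(a, b):
    """a ⊓ b: set intersection of two unary denotations."""
    return a & b

# Type.Location ⊓ PeopleBornHere.BarackObama
born_where = join("PeopleBornHere", entity("BarackObama"))
answer = intersect(join("Type", entity("Location")), born_where)
print(answer)
```

With only these two operators the structure of a logical form is shallow, which is consistent with the remark that the difficulty shifts from building structure to choosing the right predicates.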
40. Learning from answers with CCG as well
Kwiatkowski et al.'13
‣ A domain-independent logical form is built first
  - The CCG lexicon is not learned (it is supplied by hand to some extent)
  - In the domain-independent parse, string labels signify source words, not semantic constants
‣ Two-step semantic parsing for "How many people live in Seattle":
  1. Domain independent parse: λx.eq(x, count(λy.∃ev.people(y) ∧ live(y, ev) ∧ in(ev, seattle)))
  2. Ontology match:
     - Structure match: λx.how_many_people_live_in(seattle, x)
     - Constant matches: λx.population(seattle, x)
‣ Everything up to and including the logical form is learned as a latent variable
42. DCS and CCG beyond QA
‣ CCG is starting to be used in a wide range of settings
  - Problems where a logical form (a program) is learned for each input
  - Building dialogue systems from conversation logs (Artzi and Zettlemoyer'11)
  - Guiding robots (Artzi and Zettlemoyer'13)
[Figure, from Artzi et al.'13: "Modeling Instructions". On a 5×5 grid with the current position marked, the instruction "go to the chair" maps to λa.move(a) ∧ to(a, ιx.chair(x)); events can be modified by adverbials]
‣ The goal is to obtain this correspondence
  - However, the logical form is not given directly
  - Executing it reveals whether it was correct
43. DCS and CCG beyond QA
‣ Executing a DCS meaning representation depends on the existence of a database
  - Meaning is expressed through Cartesian products of sets over the database
‣ Tian, Miyao and Matsuzaki'14 (ACL)
  - Applies the DCS framework to recognizing textual entailment
  - Shows how DCS can be used as a meaning representation even without a database (abstract denotation)
‣ Is CCG easier to apply to new problems because of its longer history?
  - DCS is simpler and has a high affinity with sentence structure
  - Open question: which representation is better for which problem, by how much, and why?
[Figures, from Tian et al.'14: the DCS tree of "students read books" with databases of student, book, and read; DCS trees of "Mary loves every …" and "Tom has a dog" (T) and of "Tom has an animal that Mary loves" (H)]
44. Reference (1)
‣ Yoav Artzi and Luke S Zettlemoyer (2011). Bootstrapping semantic parsers from conversations. In EMNLP.
‣ Yoav Artzi and Luke S Zettlemoyer (2013). Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions. In TACL.
‣ Yoav Artzi, Nicholas FitzGerald, and Luke Zettlemoyer (2013). Semantic Parsing with Combinatory Categorial Grammars. In ACL tutorial.
‣ Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang (2013). Semantic Parsing on Freebase from Question-Answer Pairs. In EMNLP.
‣ Jonathan Berant and Percy Liang (2014). Semantic parsing via paraphrasing. In ACL.
‣ Tom Kwiatkowski, Luke S Zettlemoyer, Sharon Goldwater, and Mark Steedman (2010). Inducing probabilistic CCG grammars from logical form with higher-order unification. In EMNLP.
‣ Tom Kwiatkowski, Luke S Zettlemoyer, Sharon Goldwater, and Mark Steedman (2011). Lexical generalization in CCG grammar induction for semantic parsing. In EMNLP.
45. Reference (2)
‣ Tom Kwiatkowski, Sharon Goldwater, Luke S Zettlemoyer, and Mark Steedman (2012). A probabilistic model of syntactic and semantic acquisition from child-directed utterances and their meanings. In EACL.
‣ Tom Kwiatkowski, E Choi, Y Artzi, and Luke S Zettlemoyer (2013). Scaling semantic parsers with on-the-fly ontology matching. In EMNLP.
‣ Percy Liang, Michael I Jordan, and Dan Klein (2011). Learning dependency-based compositional semantics. In ACL.
‣ Percy Liang, Michael I Jordan, and Dan Klein (2013). Learning dependency-based compositional semantics. In Computational Linguistics.
‣ Cynthia Matuszek, Nicholas FitzGerald, Luke S Zettlemoyer, Liefeng Bo, and Dieter Fox (2012). A Joint Model of Language and Perception for Grounded Attribute Learning. In ICML.
‣ Ran Tian, Yusuke Miyao, and Takuya Matsuzaki (2014). Logical Inference on Dependency-based Compositional Semantics. In ACL.
46. Reference (3)
‣ Luke S Zettlemoyer and Michael Collins (2005). Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars. In UAI.
‣ Luke S Zettlemoyer and Michael Collins (2007). Online learning of relaxed CCG grammars for parsing to logical form. In EMNLP.