1. Word-Semantic Lattices for Spoken Language Understanding
Jan Švec, Luboš Šmídl, Tomáš Valenta, Adam Chýlek, Pavel Ircing
NTIS - New Technologies for Information Society, Faculty of Applied Sciences
University of West Bohemia, Pilsen, Czech Republic
{honzas,smidl,valentat,chylek,ircing}@ntis.zcu.cz
April 23, 2015
Švec J., Šmídl L., Valenta T., Chýlek A., Ircing P. Word-Semantic Lattices for Spoken Language Understanding
2. Motivation
Key concept
Integration of expert knowledge in a statistical SLU
In this paper
SLU for spoken dialog systems
Expert knowledge – used to detect semantic entities
Semantic entities – local, well-defined entities such as date,
time, person’s name
Task
Use a word lattice → detect semantic entities → replace such entities in the lattice → train a statistical SLU
3. Semantic Entities
Local, well-defined entities (mostly related to real objects)
Expert-defined knowledge
Several types z
station, time, date, train type, name, resource name
Described by a non-recursive context-free grammar Gz,
converted to a weighted finite-state transducer Tz
Interpretation – CFG tags describing the meaning
Gz example (ABNF form)
$number = ($d | ten {10} | twenty {20} [$d]);
$d = (one {1} | two {2} | three {3});
$time = (ten {10} past {p} three {3} |
         ten {10} thirty {30});
$year = last {2014} year;
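The grammar above can be unrolled into the phrases it accepts together with their interpretation tags. A minimal Python sketch (the rule encoding and the `expand` helper are illustrative, not the paper's WFST conversion; the optional `[$d]` is unrolled into two alternatives):

```python
# Sketch: expanding the non-recursive ABNF-style grammar Gz from the
# slide into (word sequence, interpretation tags) pairs.
from itertools import product

# Each rule maps to a list of alternatives; each alternative is a list
# of (item, tag) pairs, where an item starting with "$" is a rule ref.
RULES = {
    "$d": [[("one", "1")], [("two", "2")], [("three", "3")]],
    "$number": [[("$d", None)],
                [("ten", "10")],
                [("twenty", "20")],
                [("twenty", "20"), ("$d", None)]],  # optional [$d] unrolled
    "$time": [[("ten", "10"), ("past", "p"), ("three", "3")],
              [("ten", "10"), ("thirty", "30")]],
    "$year": [[("last", "2014"), ("year", None)]],
}

def expand(rule):
    """Yield (words, tags) for every phrase the rule generates."""
    for alternative in RULES[rule]:
        parts = []
        for item, tag in alternative:
            if item.startswith("$"):                      # rule reference
                parts.append(list(expand(item)))
            else:                                         # terminal word
                parts.append([([item], [tag] if tag else [])])
        for combo in product(*parts):
            words = [w for ws, _ in combo for w in ws]
            tags = [t for _, ts in combo for t in ts]
            yield words, tags

for words, tags in expand("$time"):
    print(" ".join(words), "->", "time:" + ":".join(tags))
```

Running this prints `ten past three -> time:10:p:3` and `ten thirty -> time:10:30`, matching the interpretations used on the following slides.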
4. Semantic Entity Detection
Illustration
y_i              w_R            posterior
last year        year:2014      0.14
twenty three     number:20:3    0.06
twenty           number:20      0.06
ten              number:10      0.80
three            number:3       0.86
ten past three   time:10:p:3    0.80
5. Semantic Entity Detection
Algorithm
Input: lattice U
Output:
unambiguously assigned SEs
1. Convert Gz → Tz
2. Union Z = ⋃_z Tz
3. Create factor automaton F(U)
4. Compose R = F(U) ∘ Z
5. Generate all paths from R
6. Solve integer linear programming
SE types: year, number, time
$number = ($d | ten {10} | twenty {20} [$d]);
$d = (one {1} | two {2} | three {3});
$time = (ten {10} past {p} three {3} |
         ten {10} thirty {30});
$year = last {2014} year;
y_i              w_R            posterior
last year        year:2014      0.14
twenty three     number:20:3    0.06
twenty           number:20      0.06
ten              number:10      0.80
three            number:3       0.86
ten past three   time:10:p:3    0.80
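Step 6 of the algorithm selects an unambiguous subset of the detected SE hypotheses. A minimal sketch, solving the selection by brute force instead of an ILP solver (candidate names and transition ids are illustrative; the paper formulates this as integer linear programming):

```python
# Sketch: "maximum unambiguous coverage" -- pick SE candidates whose
# lattice transitions are pairwise disjoint (each transition gets at
# most one SE), maximizing the number of covered transitions.
from itertools import combinations

# Each detected SE candidate covers a set of lattice transition ids.
candidates = {
    "time:10:p:3": {2, 3, 4},   # ten past three
    "time:10:30":  {2, 5},      # ten thirty
    "number:10":   {2},         # ten
    "number:3":    {4},         # three
}

def max_unambiguous_coverage(cands):
    """Exhaustively search subsets; keep the disjoint one covering
    the most transitions (first-found wins on ties)."""
    best, best_cov = [], 0
    items = list(cands.items())
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            union, total = set(), 0
            for _, trans in subset:
                union |= trans
                total += len(trans)
            # disjoint iff no transition counted twice
            if total == len(union) and total > best_cov:
                best, best_cov = [name for name, _ in subset], total
    return best, best_cov

selected, covered = max_unambiguous_coverage(candidates)
print(selected, covered)
```

Here `time:10:p:3` covers three transitions on its own, and no disjoint combination covers more, so it is selected. A real lattice would use an ILP solver rather than this exponential search.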
6. Semantic Entity Detection
Comments
Maximum unambiguous coverage ILP
constraint: each transition in the lattice is assigned at most one SE
criterion: the number of transitions with an SE assigned is maximized
Expert describes just the semantic entities
Words not belonging to any SE are ignored (thanks to the factor automaton F(U))
SE posteriors are provided
Different lexical realisations of an SE with the same
interpretation are merged
Švec J., Ircing P., Šmídl L., "Semantic entity detection from multiple ASR
hypotheses within the WFST framework," ASRU 2013
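The merging of different lexical realisations can be illustrated with a toy computation (a simplified stand-in for the WFST operations the paper uses; the phrases and posterior values below are made up):

```python
# Sketch: merging SE hypotheses that share the same interpretation,
# summing their posteriors across lexical realisations.
from collections import defaultdict

# (interpretation, posterior) pairs for detected SE hypotheses;
# "quarter past ten" and "ten fifteen" share the interpretation.
hypotheses = [
    ("time:10:15", 0.30),   # "quarter past ten"
    ("time:10:15", 0.25),   # "ten fifteen"
    ("time:10:30", 0.20),   # "ten thirty"
]

merged = defaultdict(float)
for interpretation, posterior in hypotheses:
    merged[interpretation] += posterior

print(dict(merged))
```

After merging, `time:10:15` carries the combined posterior 0.55 regardless of which wording the recognizer produced.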
7. Word-Semantic Lattices
Word lattice
Set of semantic entities
Merge and train statistical SLU
Expert knowledge helps to reduce the sparsity of training data.
Example: ten past three and ten thirty are both represented as time
Building W-SE lattices
After performing SED, replace words (in a lattice) with symbol(s)
derived from SE interpretation.
8. Word-Semantic Lattices
lattice UT (by-product of SED)
time:10:p:3 (2, 3, 4)
time:10:30 (2, 5)
Using a mapping transducer M:
invert(M) ∘ UT, followed by WFST optimization → resulting W-SE lattice
Map sequences of transitions belonging to an SE to SE-derived
symbol(s); keep other transitions unchanged.
9. Word-Semantic Lattices
Construction of the mapping transducer M
Given a set of SEs with their corresponding UT transitions and the set
of all UT transitions, different derivations of M can be used:
type  – time:10:p:3 → (time)
typen – time:10:p:3 → (time1, time2, time3)
split – time:10:p:3 → (time, 10, p, 3)
full  – time:10:p:3 → (time:10:p:3)
Example:
SE (time): ten past three, transitions 2, 3, 4,
mapped to time:10:p:3 (full derivation)
other SE: time:10:30
transitions not bearing any SE: 1, 6, 7, 8
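The four derivations can be captured in a few lines of Python (an illustrative sketch; the `derive_symbols` helper is not from the paper, and for typen the number of indexed symbols is taken as the number of covered transitions, matching the three-symbol example above):

```python
# Sketch: deriving the replacement symbols for a semantic entity
# under the four derivation types from the slide.
def derive_symbols(interpretation, derivation, n_transitions):
    """Return the symbols that replace the SE's word transitions
    in the W-SE lattice for the given derivation type."""
    entity_type, *values = interpretation.split(":")
    if derivation == "type":     # SE type only
        return [entity_type]
    if derivation == "typen":    # one indexed symbol per covered transition
        return [f"{entity_type}{i}" for i in range(1, n_transitions + 1)]
    if derivation == "split":    # type followed by interpretation values
        return [entity_type] + values
    if derivation == "full":     # whole interpretation as a single symbol
        return [interpretation]
    raise ValueError(f"unknown derivation: {derivation}")

for d in ("type", "typen", "split", "full"):
    print(d, derive_symbols("time:10:p:3", d, 3))
```

This reproduces the mappings listed above, e.g. `split` yields `['time', '10', 'p', '3']`.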
10. Word-Semantic Lattices
Example
(Figure: UT lattice; mapping transducer, full derivation; W-SE lattice, full derivation; W-SE lattice, type derivation)
11. Hierarchical Discriminative Model
Statistical SLU model based on SVMs
Allows processing of lattices using rational kernel theory
Švec J., Šmídl L., Ircing P., "Hierarchical Discriminative Model for Spoken
Language Understanding," ICASSP 2013
(Figure: HDM architecture – an input lattice u is compared to training lattices u1 … uk via the rational kernel K(u, ui) = w[ui ∘ T ∘ T⁻¹ ∘ u]; hidden-layer and output-layer SVMs produce rule posteriors P(A→β|u), from which the most probable semantic tree is decoded. The example tree over concepts such as DEP, TIME, TO, STATION (from a set including DEP-TIME, TO-STATION, ARR, GREETING) has node posteriors 0.75, 0.56, 0.71, 0.92 and path probability P(π|u) = 0.274.)
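The path probability in the figure is, as I read the slide, the product of the per-node posteriors along the decoded tree path (the pairing of values to nodes is inferred from the figure):

```python
# Sketch: reproducing P(pi|u) = 0.274 from the HDM figure as the
# product of the per-node SVM posteriors shown on the slide.
from math import prod

posteriors = [0.75, 0.56, 0.71, 0.92]  # node posteriors from the figure
p_path = prod(posteriors)
print(round(p_path, 3))  # 0.274
```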
12. Experimental Setup
Two Czech semantic corpora:
HHTT – Human-Human Train Timetable – inquiries about train
connections; station, time, date, train type
TIA – Intelligent Telephone Assistant – meeting planning, resource
sharing, conference calls; date, time, name, resource name
                       HHTT            TIA
# different concepts   28              20
# train sentences      5240            6425
# devel. sentences     570             519
# test sentences       1439            1256
ASR vocabulary size    13886           42615
ASR Acc (Oracle Acc)   75.0% (84.6%)   77.9% (87.0%)
SLU performance (semantic trees): cAcc = (N − S − D − I) / N
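The concept accuracy metric is a direct computation (the counts below are illustrative, not from the corpora):

```python
# Sketch: concept accuracy as defined on the slide,
# cAcc = (N - S - D - I) / N.
def concept_accuracy(n_ref, substitutions, deletions, insertions):
    """n_ref reference concepts; S substituted, D deleted, I inserted."""
    return (n_ref - substitutions - deletions - insertions) / n_ref

print(concept_accuracy(100, 10, 8, 6))  # 0.76
```

Like word accuracy, cAcc penalizes insertions and can therefore be negative.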
13. Experimental Results
Structure of HDM input   HHTT cAcc   TIA cAcc
words 1-best             73.96       81.96
word lattice             74.90       83.55
W-SE type                75.88       83.90
W-SE typen               76.67       83.87
W-SE split               75.77       84.40
W-SE full                69.36       82.24
SE removed               57.97       74.59
SE only                  59.54       51.32
W-SE 1-best              75.18       82.56
14. Experimental Results
Detailed results for specific semantic concepts
word lattice W-SE lattice
HHTT concept F P R F P R
TIME 93.0 93.0 93.0 93.5 93.1 94.0
TRAINTYPE 88.6 95.1 83.0 87.3 92.1 83.0
STATION 79.7 96.1 68.1 84.4 96.4 75.1
word lattice W-SE lattice
TIA concept F P R F P R
TIME 93.9 94.7 93.0 94.0 94.6 93.4
NAME 92.3 97.0 88.1 93.7 97.1 90.5
RES 80.8 87.1 75.3 86.8 93.0 81.5
15. Conclusion
Effective use of expert knowledge in a statistical SLU framework
Improves SLU performance by increasing recall
W-SE lattices with different levels of detail
Acknowledgment
This research was supported by the Grant Agency of the Czech
Republic (GAČR), project No. GBP103/12/G084.