1. Word-Semantic Lattices for Spoken Language Understanding
Jan Švec, Luboš Šmídl, Tomáš Valenta, Adam Chýlek, Pavel Ircing
NTIS - New Technologies for Information Society, Faculty of Applied Sciences
University of West Bohemia, Pilsen, Czech Republic
{honzas,smidl,valentat,chylek,ircing}@ntis.zcu.cz
April 23, 2015
Švec J., Šmídl L., Valenta T., Chýlek A., Ircing P. Word-Semantic Lattices for Spoken Language Understanding
2. Motivation
Key concept
Integration of expert knowledge in a statistical SLU
In this paper
SLU for spoken dialog systems
Expert knowledge – used to detect semantic entities
Semantic entities – local, well-defined entities such as date,
time, person’s name
Task
Use a word lattice → detect semantic entities → replace such entities in the lattice → train a statistical SLU
3. Semantic Entities
Local, well-defined entities (mostly related to real objects)
Expert-defined knowledge
Several types z
station, time, date, train type, name, resource name
Described by a non-recursive context-free grammar Gz,
converted to a weighted finite-state transducer Tz
Interpretation – CFG tags describing the meaning
Gz example (ABNF form)
$number = ($d | ten {10} | twenty {20} [$d]);
$d = (one {1} | two {2} | three {3});
$time = (ten {10} past {p} three {3} |
         ten {10} thirty {30});
$year = last {2014} year;
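The grammar above can be unrolled into the phrases it accepts together with their interpretation tags. A minimal Python sketch (the rule encoding and the `expand` helper are illustrative, not the paper's WFST conversion; the optional `[$d]` is unrolled into two alternatives):

```python
# Sketch: expanding the non-recursive ABNF-style grammar Gz from the
# slide into (word sequence, interpretation tags) pairs.
from itertools import product

# Each rule maps to a list of alternatives; each alternative is a list
# of (item, tag) pairs, where an item starting with "$" is a rule ref.
RULES = {
    "$d": [[("one", "1")], [("two", "2")], [("three", "3")]],
    "$number": [[("$d", None)],
                [("ten", "10")],
                [("twenty", "20")],
                [("twenty", "20"), ("$d", None)]],  # optional [$d] unrolled
    "$time": [[("ten", "10"), ("past", "p"), ("three", "3")],
              [("ten", "10"), ("thirty", "30")]],
    "$year": [[("last", "2014"), ("year", None)]],
}

def expand(rule):
    """Yield (words, tags) for every phrase the rule generates."""
    for alternative in RULES[rule]:
        parts = []
        for item, tag in alternative:
            if item.startswith("$"):                      # rule reference
                parts.append(list(expand(item)))
            else:                                         # terminal word
                parts.append([([item], [tag] if tag else [])])
        for combo in product(*parts):
            words = [w for ws, _ in combo for w in ws]
            tags = [t for _, ts in combo for t in ts]
            yield words, tags

for words, tags in expand("$time"):
    print(" ".join(words), "->", "time:" + ":".join(tags))
```

Running this prints `ten past three -> time:10:p:3` and `ten thirty -> time:10:30`, matching the interpretations used on the following slides.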
4. Semantic Entity Detection
Illustration
y_i              w_R            posterior
last year        year:2014      0.14
twenty three     number:20:3    0.06
twenty           number:20      0.06
ten              number:10      0.80
three            number:3       0.86
ten past three   time:10:p:3    0.80
5. Semantic Entity Detection
Algorithm
Input: lattice U
Output:
unambiguously assigned SEs
1. Convert Gz → Tz
2. Union Z = ⋃_z Tz
3. Create factor automaton F(U)
4. Compose R = F(U) ∘ Z
5. Generate all paths from R
6. Solve integer linear programming
SE types: year, number, time
$number = ($d | ten {10} | twenty {20} [$d]);
$d = (one {1} | two {2} | three {3});
$time = (ten {10} past {p} three {3} |
         ten {10} thirty {30});
$year = last {2014} year;
y_i              w_R            posterior
last year        year:2014      0.14
twenty three     number:20:3    0.06
twenty           number:20      0.06
ten              number:10      0.80
three            number:3       0.86
ten past three   time:10:p:3    0.80
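Step 6 of the algorithm selects an unambiguous subset of the detected SE hypotheses. A minimal sketch, solving the selection by brute force instead of an ILP solver (candidate names and transition ids are illustrative; the paper formulates this as integer linear programming):

```python
# Sketch: "maximum unambiguous coverage" -- pick SE candidates whose
# lattice transitions are pairwise disjoint (each transition gets at
# most one SE), maximizing the number of covered transitions.
from itertools import combinations

# Each detected SE candidate covers a set of lattice transition ids.
candidates = {
    "time:10:p:3": {2, 3, 4},   # ten past three
    "time:10:30":  {2, 5},      # ten thirty
    "number:10":   {2},         # ten
    "number:3":    {4},         # three
}

def max_unambiguous_coverage(cands):
    """Exhaustively search subsets; keep the disjoint one covering
    the most transitions (first-found wins on ties)."""
    best, best_cov = [], 0
    items = list(cands.items())
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            union, total = set(), 0
            for _, trans in subset:
                union |= trans
                total += len(trans)
            # disjoint iff no transition counted twice
            if total == len(union) and total > best_cov:
                best, best_cov = [name for name, _ in subset], total
    return best, best_cov

selected, covered = max_unambiguous_coverage(candidates)
print(selected, covered)
```

Here `time:10:p:3` covers three transitions on its own, and no disjoint combination covers more, so it is selected. A real lattice would use an ILP solver rather than this exponential search.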
6. Semantic Entity Detection
Comments
Maximum unambiguous coverage ILP
constraint: each transition in the lattice is assigned at most one SE
criterion: the number of transitions with an SE assigned is maximized
Expert describes just the semantic entities
Words not belonging to any SE are ignored (thanks to the factor automaton F(U))
SE posteriors are provided
Different lexical realisations of an SE with the same
interpretation are merged
Švec J., Ircing P., Šmídl L., "Semantic entity detection from multiple ASR
hypotheses within the WFST framework," ASRU 2013
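The merging of different lexical realisations can be illustrated with a toy computation (a simplified stand-in for the WFST operations the paper uses; the phrases and posterior values below are made up):

```python
# Sketch: merging SE hypotheses that share the same interpretation,
# summing their posteriors across lexical realisations.
from collections import defaultdict

# (interpretation, posterior) pairs for detected SE hypotheses;
# "quarter past ten" and "ten fifteen" share the interpretation.
hypotheses = [
    ("time:10:15", 0.30),   # "quarter past ten"
    ("time:10:15", 0.25),   # "ten fifteen"
    ("time:10:30", 0.20),   # "ten thirty"
]

merged = defaultdict(float)
for interpretation, posterior in hypotheses:
    merged[interpretation] += posterior

print(dict(merged))
```

After merging, `time:10:15` carries the combined posterior 0.55 regardless of which wording the recognizer produced.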
7. Word-Semantic Lattices
Word lattice
Set of semantic entities
Merge and train statistical SLU
Expert knowledge helps to reduce the sparsity of training data.
Example: ten past three and ten thirty are both represented as time
Building W-SE lattices
After performing SED, replace words (in a lattice) with symbol(s)
derived from SE interpretation.
8. Word-Semantic Lattices
lattice UT (by-product of SED)
time:10:p:3 (2, 3, 4)
time:10:30 (2, 5)
Using a mapping transducer M:
invert(M) ∘ UT, followed by WFST optimization → resulting W-SE lattice
Map sequences of transitions belonging to an SE to SE-derived
symbol(s); keep other transitions unchanged.
9. Word-Semantic Lattices
Construction of the mapping transducer M
Given a set of SEs with their corresponding UT transitions and the set
of all UT transitions, different derivations of M can be used:
type  – time:10:p:3 → (time)
typen – time:10:p:3 → (time1, time2, time3)
split – time:10:p:3 → (time, 10, p, 3)
full  – time:10:p:3 → (time:10:p:3)
Example:
SE (time): ten past three, transitions 2, 3, 4,
mapped to time:10:p:3 (full derivation)
other SE: time:10:30
transitions not bearing any SE: 1, 6, 7, 8
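The four derivations can be captured in a few lines of Python (an illustrative sketch; the `derive_symbols` helper is not from the paper, and for typen the number of indexed symbols is taken as the number of covered transitions, matching the three-symbol example above):

```python
# Sketch: deriving the replacement symbols for a semantic entity
# under the four derivation types from the slide.
def derive_symbols(interpretation, derivation, n_transitions):
    """Return the symbols that replace the SE's word transitions
    in the W-SE lattice for the given derivation type."""
    entity_type, *values = interpretation.split(":")
    if derivation == "type":     # SE type only
        return [entity_type]
    if derivation == "typen":    # one indexed symbol per covered transition
        return [f"{entity_type}{i}" for i in range(1, n_transitions + 1)]
    if derivation == "split":    # type followed by interpretation values
        return [entity_type] + values
    if derivation == "full":     # whole interpretation as a single symbol
        return [interpretation]
    raise ValueError(f"unknown derivation: {derivation}")

for d in ("type", "typen", "split", "full"):
    print(d, derive_symbols("time:10:p:3", d, 3))
```

This reproduces the mappings listed above, e.g. `split` yields `['time', '10', 'p', '3']`.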
10. Word-Semantic Lattices
Example
(Figure: UT lattice; mapping transducer, full derivation; W-SE lattice, full derivation; W-SE lattice, type derivation)
11. Hierarchical Discriminative Model
Statistical SLU model based on SVMs
Allows processing of lattices using rational kernel theory
Švec J., Šmídl L., Ircing P., "Hierarchical Discriminative Model for Spoken
Language Understanding," ICASSP 2013
(Figure: HDM architecture – an input lattice u is compared to training lattices u1 … uk via the rational kernel K(u, ui) = w[ui ∘ T ∘ T⁻¹ ∘ u]; hidden-layer and output-layer SVMs produce rule posteriors P(A→β|u), from which the most probable semantic tree is decoded. The example tree over concepts such as DEP, TIME, TO, STATION (from a set including DEP-TIME, TO-STATION, ARR, GREETING) has node posteriors 0.75, 0.56, 0.71, 0.92 and path probability P(π|u) = 0.274.)
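The path probability in the figure is, as I read the slide, the product of the per-node posteriors along the decoded tree path (the pairing of values to nodes is inferred from the figure):

```python
# Sketch: reproducing P(pi|u) = 0.274 from the HDM figure as the
# product of the per-node SVM posteriors shown on the slide.
from math import prod

posteriors = [0.75, 0.56, 0.71, 0.92]  # node posteriors from the figure
p_path = prod(posteriors)
print(round(p_path, 3))  # 0.274
```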
12. Experimental Setup
Two Czech semantic corpora:
HHTT – Human-Human Train Timetable – inquiries about train
connections; station, time, date, train type
TIA – Intelligent Telephone Assistant – meeting planning, resource
sharing, conference calls; date, time, name, resource name
                       HHTT            TIA
# different concepts   28              20
# train sentences      5240            6425
# devel. sentences     570             519
# test sentences       1439            1256
ASR vocabulary size    13886           42615
ASR Acc (Oracle Acc)   75.0% (84.6%)   77.9% (87.0%)
SLU performance (semantic trees): cAcc = (N − S − D − I) / N
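The concept accuracy metric is a direct computation (the counts below are illustrative, not from the corpora):

```python
# Sketch: concept accuracy as defined on the slide,
# cAcc = (N - S - D - I) / N.
def concept_accuracy(n_ref, substitutions, deletions, insertions):
    """n_ref reference concepts; S substituted, D deleted, I inserted."""
    return (n_ref - substitutions - deletions - insertions) / n_ref

print(concept_accuracy(100, 10, 8, 6))  # 0.76
```

Like word accuracy, cAcc penalizes insertions and can therefore be negative.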
13. Experimental Results
Structure of HDM input   HHTT cAcc   TIA cAcc
words 1-best             73.96       81.96
word lattice             74.90       83.55
W-SE type                75.88       83.90
W-SE typen               76.67       83.87
W-SE split               75.77       84.40
W-SE full                69.36       82.24
SE removed               57.97       74.59
SE only                  59.54       51.32
W-SE 1-best              75.18       82.56
14. Experimental Results
Detailed results for specific semantic concepts
word lattice W-SE lattice
HHTT concept F P R F P R
TIME 93.0 93.0 93.0 93.5 93.1 94.0
TRAINTYPE 88.6 95.1 83.0 87.3 92.1 83.0
STATION 79.7 96.1 68.1 84.4 96.4 75.1
word lattice W-SE lattice
TIA concept F P R F P R
TIME 93.9 94.7 93.0 94.0 94.6 93.4
NAME 92.3 97.0 88.1 93.7 97.1 90.5
RES 80.8 87.1 75.3 86.8 93.0 81.5
15. Conclusion
Effective use of expert knowledge in a statistical SLU framework
Improves SLU performance by increasing recall
W-SE lattices with different levels of detail
Acknowledgment
This research was supported by the Grant Agency of the Czech
Republic (GAČR), project No. GBP103/12/G084.