This document discusses grammar coverage analysis for automatic speech recognition systems. It describes two complementary techniques: sentence generation and sentence pattern exploration. Sentence generation uses tools to automatically generate test sentences from a grammar, while sentence pattern exploration allows interactive expansion of grammar rules to derive test sentences. The document provides best practices for comprehensive grammar testing, including avoiding redundant sentences and ensuring all semantic tags and patterns are tested. It also demonstrates how to use sentence generation and exploration tools to debug grammars.
1. The Art and Science of Grammar Coverage Analysis
Dominique Boucher
Nu Echo Inc.
dominique.boucher@nuecho.com
SpeechTEK 2009
New York, USA
2. The grammar development process
[Flowchart]
– Enter / get initial set of sentences in coverage set
– Write initial grammar to cover utterances
– Run coverage tests
– Problems? yes: Fix grammar
– Problems? no: Generate sentences, enrich coverage set
Objective: to obtain
(a) A complete coverage set
(b) A grammar that covers the coverage set and produces the correct semantic result
3. The importance of coverage analysis
Design and development
• Ensure grammars conform to their specification
• Semantic tags testing
Maintenance and optimization
• Grammars evolve over the life of an application
• Provides an effective tool for testing that a grammar isn’t accidentally broken by a grammar change
Conversion projects
• Ensure proper conversion between grammar formats
4. Grammar coverage challenges
Provide exhaustive coverage of all sentence patterns…
… with the smallest possible set of sentences.
Otherwise:
– Analysis of generated sentences will be time-consuming; and
– Errors will go undetected
6. Technique #1: Sentence generation
Tools usually provided by the ASR engine SDK
– Operate on the source or compiled grammar
Commonly used generation strategies
– Exhaustive generation
– Generation of a fixed number of random sentences
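The two strategies above can be illustrated with a minimal Python sketch. It does not reproduce any ASR engine's SDK tool. The grammar, the "$rule" reference convention, and the function names expand_all and expand_random are assumptions made for illustration only.

```python
import itertools
import random

# Hypothetical toy grammar: each rule maps to a list of alternatives,
# and each alternative is a sequence of words or "$rule" references.
GRAMMAR = {
    "$order": [["i", "want", "$size", "$drink"], ["$size", "$drink", "please"]],
    "$size": [["a", "small"], ["a", "large"]],
    "$drink": [["coffee"], ["tea"]],
}


def expand_all(symbol, grammar):
    """Exhaustive generation: yield every word sequence derivable from symbol.

    This only terminates for grammars without recursive rules.
    """
    if symbol not in grammar:
        yield [symbol]  # a plain word
        return
    for alternative in grammar[symbol]:
        # Expand each symbol of the alternative, then combine the results.
        parts = [list(expand_all(s, grammar)) for s in alternative]
        for combo in itertools.product(*parts):
            yield [word for part in combo for word in part]


def expand_random(symbol, grammar, rng):
    """Random generation: pick one alternative at random at every rule."""
    if symbol not in grammar:
        return [symbol]
    alternative = rng.choice(grammar[symbol])
    return [w for s in alternative for w in expand_random(s, grammar, rng)]


all_sentences = [" ".join(ws) for ws in expand_all("$order", GRAMMAR)]

rng = random.Random(0)  # seeded so that a run can be repeated
random_sentences = [
    " ".join(expand_random("$order", GRAMMAR, rng)) for _ in range(3)
]
```

Here all_sentences holds every sentence of the toy grammar, while random_sentences holds a fixed number of samples that may contain duplicates and may miss some patterns entirely.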
7. Technical difficulties
Some grammars generate an infinite number of sentences
– Exhaustive generation not possible
Semantic tags not all tested
– Errors may remain undetected until application run time
All interesting cases may not be covered by the generated sentences
Uninteresting patterns are generated over and over
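To see why exhaustive generation breaks down, consider a recursive rule. The sketch below is one possible workaround, not a feature of any particular tool: it bounds the depth of nested rule expansions. The grammar and the name expand_bounded are hypothetical.

```python
import itertools

# Hypothetical recursive grammar: a digit string of any length.
GRAMMAR = {
    "$digits": [["$digit"], ["$digit", "$digits"]],
    "$digit": [["one"], ["two"]],
}


def expand_bounded(symbol, grammar, depth):
    """Yield every word sequence derivable from symbol using at most
    `depth` nested rule expansions."""
    if symbol not in grammar:
        yield [symbol]  # a plain word costs no depth
        return
    if depth == 0:
        return  # budget exhausted: prune this derivation
    for alternative in grammar[symbol]:
        parts = [list(expand_bounded(s, grammar, depth - 1)) for s in alternative]
        for combo in itertools.product(*parts):
            yield [word for part in combo for word in part]


sentences = [" ".join(ws) for ws in expand_bounded("$digits", GRAMMAR, 3)]

# The number of sentences keeps growing with the depth bound,
# so no finite bound yields "all" sentences of this grammar.
counts = [len(list(expand_bounded("$digits", GRAMMAR, d))) for d in range(2, 6)]
```

With this toy grammar, counts roughly doubles at every step, which also shows how quickly the same uninteresting pattern (one more digit) comes to dominate the output.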
8. Sentence generation best practices
Avoid generating redundant sentences
– Powerful sentence generation tools make this possible
– Too many sentences increase the risk of errors going undetected
Carefully examine generated sentences
– If a sentence doesn’t look right, it probably shouldn’t be in there (although, to be sure, look at the parse tree)
Make sure the coverage test is as complete as possible
– Should include all semantic tags and all sentence patterns
– Full coverage is best (whenever possible)
9. Sentence generation revisited
Individual rule configuration
More effective strategies
– Tags coverage
– All grammar paths
– Pick from @examples
– Use fixed sentence
Generation can be started from any set of sentence patterns
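As an illustration of the "tags coverage" idea, the following sketch keeps only enough sentences to exercise every semantic tag at least once. It is an assumption about how such a strategy could work, not a description of the tool shown on the slide. The grammar layout, the tag labels, and the function names are hypothetical, and the greedy selection is a common set-cover heuristic that is not guaranteed to find the smallest possible set.

```python
import itertools

# Hypothetical toy grammar: each alternative is (symbols, tag), where tag
# is a label standing in for a semantic tag, or None when there is none.
GRAMMAR = {
    "$order": [(["$size", "$drink"], None), (["$drink"], "size=default")],
    "$size": [(["small"], "size=small"), (["large"], "size=large")],
    "$drink": [(["coffee"], "drink=coffee"), (["tea"], "drink=tea")],
}


def expand_with_tags(symbol, grammar):
    """Yield (words, tags) for every derivation of symbol, where tags is
    the set of tags attached to the alternatives used along the way."""
    if symbol not in grammar:
        yield [symbol], frozenset()
        return
    for symbols, tag in grammar[symbol]:
        parts = [list(expand_with_tags(s, grammar)) for s in symbols]
        for combo in itertools.product(*parts):
            words = [w for part_words, _ in combo for w in part_words]
            tags = frozenset().union(*(part_tags for _, part_tags in combo))
            if tag is not None:
                tags = tags | {tag}
            yield words, tags


def select_for_tag_coverage(candidates):
    """Greedy selection: repeatedly keep the sentence that exercises the
    largest number of still-untested tags."""
    untested = set().union(*(tags for _, tags in candidates))
    selected = []
    while untested:
        words, tags = max(candidates, key=lambda c: len(c[1] & untested))
        selected.append(" ".join(words))
        untested -= tags
    return selected


candidates = list(expand_with_tags("$order", GRAMMAR))
coverage_set = select_for_tag_coverage(candidates)
```

On this toy grammar the selection keeps three of the six possible sentences while still touching all five tags, which is the kind of reduction that makes manual review of generated sentences practical.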
11. Technique #2: Exploring sentence patterns
Interactive expansion of grammar rules
Derive sentence patterns
– Useful to generate sentences for a specific pattern
Derive complete sentences
– Can be debugged, etc.
Ideal to understand the structure of a grammar
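The expansion steps described above can be sketched in a few lines of Python. In an interactive tool the user chooses which rule reference to expand. Here that choice is passed as an index. The grammar and the names expand_step and complete are hypothetical and do not reflect any specific tool.

```python
# Hypothetical toy grammar: each rule maps to a list of alternatives,
# and each alternative is a sequence of words or "$rule" references.
GRAMMAR = {
    "$order": [["i", "want", "$size", "$drink"], ["$size", "$drink", "please"]],
    "$size": [["a", "small"], ["a", "large"]],
    "$drink": [["coffee"], ["tea"]],
}


def expand_step(pattern, index, grammar):
    """Expand the rule reference at `index`, returning one new sentence
    pattern per alternative of that rule."""
    symbol = pattern[index]
    if symbol not in grammar:
        raise ValueError(f"{symbol!r} is not a rule reference")
    return [pattern[:index] + alt + pattern[index + 1:] for alt in grammar[symbol]]


def complete(pattern, grammar):
    """Derive every complete sentence matching a pattern by expanding its
    leftmost rule reference until none remain (non-recursive grammars only)."""
    for i, symbol in enumerate(pattern):
        if symbol in grammar:
            for expanded in expand_step(pattern, i, grammar):
                yield from complete(expanded, grammar)
            return
    yield " ".join(pattern)


# Step 1: expand the root rule into its sentence patterns.
patterns = expand_step(["$order"], 0, GRAMMAR)
# Step 2: in the first pattern, expand only "$size" (position 2).
narrowed = expand_step(patterns[0], 2, GRAMMAR)
# Step 3: derive the complete sentences for one specific pattern.
sentences = list(complete(narrowed[1], GRAMMAR))
```

After step 2, narrowed holds the patterns "i want a small $drink" and "i want a large $drink". Step 3 turns the second of these into complete sentences that can then be added to the coverage set or debugged individually.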