2. o Short answer: YES!
o Most failed voice implementations can be traced to the voice
recognizer software that “listens to” and interprets what the worker
says.
o Ignoring the quality of the voice recognition software can lead to
a failed implementation.
3. Dozens of recognizers on the market today
are designed for a “controlled” environment (little
to no background noise).
Warehouses are the most challenging places for
a voice recognizer to work.
Warehouses also employ a diverse workforce
with different native languages and accents.
If the recognizer makes a mistake, the worker must
repeat the response and productivity suffers.
4. o Voxware offers 99.9% recognition accuracy with its
voice technology.
o The Voxware Integrated Speech Engine (VISE) is designed
to operate in very noisy settings without
compromising accuracy. VISE has been refined over
25 years.
o Voxware ensures consistently high recognition rates
regardless of the mobile device being used, the language, or
environmental circumstances.
5. o In an apparel warehouse near Atlanta, a Voxware voice
solution accurately recognizes workers who speak five different
languages: English, Spanish, Bosnian, Vietnamese, and Somali.
6. Speaker-dependent recognizers are “trained” to recognize
the way a specific user says a vocabulary of words.
Speaker-independent recognizers do not require training.
Speaker-independent recognizers are widely used for
customer service applications (e.g., airline reservations).
7. o VISE leads users through a training session and creates a voice
profile for each person that is specific to that person’s way of
speaking (e.g., accent).
o Speaker-dependent recognition accounts for an increase in
accuracy from 95% to 99.9%. This difference is huge in terms of
ROI from a voice implementation.
o Training is time-consuming, but research shows that the time gained by
skipping training is lost in the first week of production use because
of mis-recognitions. This amounts to $20,000 of wasted worker time in a
medium-sized distribution center (DC).
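
A quick back-of-the-envelope calculation shows why the gap between 95% and 99.9% matters. The sketch below is an illustration only: the figure of 1,000 spoken responses per shift is an assumed number, not one supplied by Voxware, and the function name is hypothetical.

```python
def expected_misrecognitions(accuracy: float, responses: int) -> float:
    """Expected number of responses a worker has to repeat."""
    return (1.0 - accuracy) * responses


# Assumption for illustration only: responses spoken by one worker per shift.
RESPONSES_PER_SHIFT = 1000

without_training = expected_misrecognitions(0.95, RESPONSES_PER_SHIFT)
with_training = expected_misrecognitions(0.999, RESPONSES_PER_SHIFT)

# Roughly 50 repeats per shift at 95% versus roughly 1 at 99.9%.
print(f"95.0% accuracy: about {without_training:.0f} repeats")
print(f"99.9% accuracy: about {with_training:.0f} repeats")
```

Under that assumption, the error rate falls from 5% to 0.1%, so the worker repeats about fifty times fewer responses.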
8. Many voice recognizers try to block out background
noise with “noise-reducing” microphones.
This does not ensure recognition accuracy. Why? DCs
have too much fluctuating noise for a “noise-reducing”
microphone to handle.
9. VISE “listens” for background noise and eliminates it,
which allows the recognizer to process only what the
worker said.
According to Voxware, VISE has run in some of the
loudest operations imaginable (sawmills, airport
runways) and VISE still recognizes what users say with
near 100% accuracy.
10. VISE is optimized to recognize phrases as opposed to
individual words.
Continuous recognition enhances productivity because
workers can combine into one response what would ordinarily
take two to three interactions using a discrete word
recognizer.
For example: “Check 457 Grab 6 Put to Alpha”. Other
systems would have to break this up into as many as three
interactions.
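
The idea of one spoken phrase carrying several pieces of information can be sketched in a few lines. This is not how VISE works internally; it is a minimal illustration that operates on an already-recognized transcript, and the field names (`check`, `quantity`, `destination`) are assumed interpretations of the example phrase.

```python
import re
from typing import Optional

# Hypothetical phrase pattern modeled on the slide's example response.
PHRASE = re.compile(
    r"^check\s+(\d+)\s+grab\s+(\d+)\s+put\s+to\s+(\w+)$",
    re.IGNORECASE,
)


def parse_response(transcript: str) -> Optional[dict]:
    """Split one combined response into its three parts, or return None."""
    match = PHRASE.match(transcript.strip())
    if match is None:
        return None
    check, quantity, destination = match.groups()
    return {
        "check": check,              # kept as text so leading zeros survive
        "quantity": int(quantity),
        "destination": destination,
    }


print(parse_response("Check 457 Grab 6 Put to Alpha"))
```

A discrete word recognizer would instead need a separate prompt and response for each of the three fields.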
11. VISE always “knows” what it is listening for.
VISE ignores idle chitchat and waits for the expected
response. This is called “out of vocabulary rejection”.
Recognizers that do not have “out of vocabulary rejection” will
interpret everything they hear as input, including overhead
paging that is loud enough to be picked up.
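
As a rough illustration of the concept, the sketch below accepts a response only when it matches what the current prompt expects. It is a text-level simplification, not Voxware's method: a real recognizer makes this decision on the audio signal, and the vocabulary shown here is hypothetical.

```python
from typing import Optional, Set


def accept_response(transcript: str, expected: Set[str]) -> Optional[str]:
    """Return the normalized response if it is expected, otherwise None."""
    normalized = " ".join(transcript.lower().split())
    return normalized if normalized in expected else None


# Hypothetical vocabulary for a confirmation prompt.
EXPECTED_AT_CONFIRM = {"yes", "no", "repeat"}

for heard in ["  Yes ", "did you see the game last night"]:
    result = accept_response(heard, EXPECTED_AT_CONFIRM)
    print(repr(heard), "->", "accepted" if result else "ignored")
```

Without a check of this kind, the second utterance would be passed to the application as if it were an answer.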
12. Anyone who knows VoiceXML (used to develop voice
applications) could create an application to interact with
VISE.
Since VISE is open and standards-based, Voxware can
use a different VoiceXML recognizer if one is found that
could deliver better performance than VISE.
13. Voxware’s software is hardware independent. Customers
are able to port their voice applications to new devices
without the need to rewrite any code.
Voxware works with hardware manufacturers to help
them produce units with the requisite audio performance
and quality.