Under the supervision of
MR. RAKESH KUMAR
DEPARTMENT OF INFORMATION TECHNOLOGY
RAJKIYA ENGINEERING COLLEGE, AMBEDKAR NAGAR (UP)-224122
What is a voice browser?
Why a voice browser?
W3C Speech Interface Framework.
Speech Recognition Grammar Specification (SRGS)
Semantic Interpretation for Speech Recognition(SISR)
Pronunciation Lexicon Specification (PLS)
Advantages and disadvantages
A voice browser is a software application that presents an
interactive voice user interface to the user in a manner analogous to
the functioning of a web browser.
Dialog documents interpreted by a voice browser are often encoded
in standards-based markup languages, such as VoiceXML.
A voice browser presents information aurally, using pre-recorded
audio file playback or text-to-speech synthesis software.
A voice browser obtains information using speech recognition and
keypad entry, such as DTMF detection.
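As an illustration of keypad entry, the sketch below shows one common way to detect a DTMF key: each key is a pair of tones (one row frequency, one column frequency), and the Goertzel recurrence measures the signal power at each candidate frequency. The function names, the 8000 Hz sample rate, and the 50 ms tone length are assumptions made for this example; a production detector would also need thresholds and noise handling.

```python
import math

# Standard DTMF row and column frequencies in Hz.
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def make_tone(key, sample_rate=8000, duration=0.05):
    """Synthesize the two-tone signal for one key (used here only as test input)."""
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if c != -1:
            f_row, f_col = ROW_FREQS[r], COL_FREQS[c]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(sample_rate * duration)
    return [
        math.sin(2 * math.pi * f_row * i / sample_rate)
        + math.sin(2 * math.pi * f_col * i / sample_rate)
        for i in range(n)
    ]


def goertzel_power(samples, freq, sample_rate):
    """Estimate the signal power near `freq` with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row tone and the strongest column tone."""
    row = max(range(4), key=lambda r: goertzel_power(samples, ROW_FREQS[r], sample_rate))
    col = max(range(4), key=lambda c: goertzel_power(samples, COL_FREQS[c], sample_rate))
    return KEYPAD[row][col]
```

For example, `detect_key(make_tone("5"))` recovers the key from the synthesized tone pair.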
WHAT IS A VOICE BROWSER?
Use of the hands during browsing might prove inconvenient.
Voice input is a natural solution for such hands-busy situations.
Even in standard browser applications, using voice input is
simply more fun than the alternatives.
Voice input provides direct "see and say" access to links,
eliminating the wrist strain associated with holding the mouse,
often for hours at a time.
This is especially helpful for people with disabilities.
Why a Voice Browser?
Far more people today have access to a telephone than have
access to a computer with an Internet connection.
Many of us already have, or soon will have, a mobile phone within
reach wherever we go.
Voice interaction can escape the physical limitations on keypads
and displays as mobile devices become ever smaller.
Disadvantages of existing methods: WAP (Cellular phones, Palm
1. Access Speed
2. Limited or fragmented availability
4. Lack of user habit
Differences Between Graphical & Voice
Graphical browsing is more
passive due to the persistence of
the visual information.
Graphical Browsers are
Voice browsing is more active
since the user has to issue
whereas Voice Browsers are
W3C Speech Interface Framework
The World Wide Web Consortium (W3C) develops interoperable
technologies (specifications, guidelines, software, and tools) to
lead the Web to its full potential as a forum for information,
commerce, communication, and collective understanding.
VoiceXML (VXML) is a digital document standard for
specifying interactive media and voice dialogs between humans
and computers.
The VoiceXML document format is based on Extensible
Markup Language (XML).
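To make the XML basis concrete, here is a minimal sketch that builds a one-prompt VoiceXML document with Python's standard library. The helper name `build_greeting_dialog` and the form id `greeting` are made up for this example; the `vxml`, `form`, `block`, and `prompt` elements and the namespace are those of VoiceXML 2.x.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_greeting_dialog(prompt_text):
    """Return a small VoiceXML document that speaks one prompt."""
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "greeting"})  # hypothetical form id
    block = ET.SubElement(form, "block")
    prompt = ET.SubElement(block, "prompt")
    prompt.text = prompt_text  # rendered by the browser via text-to-speech
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_greeting_dialog("Welcome to the voice portal."))
```

A voice browser that loads such a document would read the prompt aloud, much as a graphical browser renders an HTML page.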
A speech recognition grammar is a set of word patterns, and tells a
speech recognition system what to expect a human to say.
SRGS specifies two alternate but equivalent syntaxes, one based on
XML, and one using augmented BNF format. In practice, the XML
syntax is used more frequently.
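The idea of a grammar as a set of word patterns can be sketched in a few lines. The toy grammar below is not SRGS syntax; it is a hypothetical Python stand-in in which each slot lists the alternatives allowed at that position, roughly what a sequence of one-of choices expresses in a real grammar.

```python
from itertools import product

# Hypothetical toy grammar: one list of alternatives per position.
# An empty string marks an optional word.
CALL_GRAMMAR = [
    ["", "please"],
    ["call", "dial"],
    ["home", "the office"],
]


def expand(grammar):
    """Return every phrase the grammar accepts."""
    phrases = set()
    for choice in product(*grammar):
        phrases.add(" ".join(word for word in choice if word))
    return phrases


def accepts(grammar, utterance):
    """Check whether an utterance is one of the expected phrases."""
    normalized = " ".join(utterance.lower().split())
    return normalized in expand(grammar)
```

Here `accepts(CALL_GRAMMAR, "Please call home")` holds, while an out-of-grammar phrase such as "call mom" is rejected, which is how a grammar narrows what the recognizer has to listen for.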
Speech Recognition Grammar Specification
Semantic Interpretation for Speech Recognition (SISR) defines
the syntax and semantics of annotations to grammar rules in the
Speech Recognition Grammar Specification (SRGS).
It allows voice browsers, via ECMAScript, to semantically interpret
complex grammars and provide the information back to the
application.
Coders commonly use ECMAScript for client-side scripting on the
World Wide Web, and it is increasingly being used for writing
server-side applications.
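The role of semantic interpretation can be illustrated with a small sketch. Real SISR annotations are ECMAScript tags attached to SRGS rules; the Python dictionaries and the `interpret` function below are hypothetical stand-ins that only show the effect: recognized words are turned into a structured result that the application can use.

```python
# Hypothetical tag tables: each recognized word maps to a semantic value,
# playing the role of a tag attached to a grammar rule.
SIZE_TAGS = {"small": "S", "medium": "M", "large": "L"}
ITEM_TAGS = {"coffee": "COFFEE", "tea": "TEA"}


def interpret(utterance):
    """Turn a recognized utterance into a structured result, or None."""
    result = {}
    for word in utterance.lower().split():
        if word in SIZE_TAGS:
            result["size"] = SIZE_TAGS[word]
        elif word in ITEM_TAGS:
            result["item"] = ITEM_TAGS[word]
    # Without an item there is nothing for the application to act on.
    return result if "item" in result else None
```

With these tables, the utterance "a large coffee please" yields a result with size "L" and item "COFFEE", regardless of the filler words around them.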
Semantic Interpretation for Speech Recognition (SISR)
The Pronunciation Lexicon Specification (PLS) is a W3C
Recommendation which is designed to enable interoperable
specification of pronunciation information for both speech
recognition and speech synthesis engines within voice browsing
applications.
Pronunciations are grouped together into a PLS document which
may be referenced from other markup languages.
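As a sketch of what such a document looks like and how an engine might consume it, the example below parses a small PLS lexicon into a lookup table. The sample entries and the `load_lexicon` helper are invented for illustration; the `lexicon`, `lexeme`, `grapheme`, and `phoneme` elements and the namespace follow PLS 1.0.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

# Illustrative lexicon; the words and IPA strings are sample data.
SAMPLE_LEXICON = """
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>tomato</grapheme>
    <phoneme>təˈmeɪtoʊ</phoneme>
    <phoneme>təˈmɑːtoʊ</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>pecan</grapheme>
    <phoneme>pɪˈkɑːn</phoneme>
  </lexeme>
</lexicon>
"""


def load_lexicon(pls_text):
    """Map each grapheme to the list of pronunciations given for it."""
    root = ET.fromstring(pls_text.strip())
    table = {}
    for lexeme in root.findall(f"{{{PLS_NS}}}lexeme"):
        phonemes = [p.text for p in lexeme.findall(f"{{{PLS_NS}}}phoneme")]
        for grapheme in lexeme.findall(f"{{{PLS_NS}}}grapheme"):
            table.setdefault(grapheme.text, []).extend(phonemes)
    return table
```

The same table could serve both directions mentioned above: a synthesizer looks up how to pronounce a written word, and a recognizer learns which pronunciations to accept for it.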
CCXML (Call Control eXtensible Markup Language) is designed to
inform the voice browser how to handle the telephony control of
the voice channel.
VoiceXML and CCXML are wholly separate, and neither requires the
other to be implemented; however, they have been designed with
interoperability in mind.
Working of Voice Browser
Accessing business information:
1. The corporate "front desk", which asks callers who or what they want.
2. Automated telephone ordering service.
3. Airline arrival and departure information.
4. Home banking services.
Accessing public information:
1. Community information such as weather, traffic condition,
school closures, directions and events.
2. Local, national and international news.
3. National and international stock market information.
4. Business and e-commerce transactions.
Accessing personal information:
1. Voice mail.
2. Calendars, address and telephone lists.
3. Personal horoscope.
4. Personal newsletter.
5. To-do lists, shopping lists, and calorie counters.
Advantages of Voice Browser
Voice is a very natural user interface, which speeds up browsing.
Lower space requirements.
A portable voice browser can also be implemented.
Practical interface for blind users.
Users can browse the web while keeping their hands and eyes free for
other tasks.
Disadvantages of voice browser
This is useful if only a restricted volume of phrases and sentences
It requires large storage.
If voice browsers are meant to replace human operator dialog,
they must be fast in response.
Speech Recognition / Interpretation / Synthesis depend on
When a user requests a certain document, several related
documents can be downloaded for easier access.