voice browser
Presented By
CH. BINDU BHARGAVI, I.V.S SIRISHA
III CSE, ISTS ENGINEERING COLLEGE
Voice Browser
• Abstract
• Introduction
• Key technologies
• System Design
• Components
• Applications
• Advantages
• Future Scope
• Conclusion
OVERVIEW
A voice browser is a new way of interacting with modern
computer systems. It is an alternative to traditional
text-based web browsers.
A voice browser uses speech recognition technology to
understand what the user is saying, and it responds aurally
by converting text stored in the computer into speech.
ABSTRACT
A voice browser is a device that interprets a (voice) markup language and
is capable of generating voice output and/or interpreting voice input and
possibly other input/output modalities. The information that the system uses
is dynamic and is fetched from elsewhere, for example from a web server.
From an end-user's perspective, the aim is to provide a service similar
to what graphical browsers of HTML and related technologies do today,
but on devices that are not equipped with full browsers or even the
screens to support them.
INTRODUCTION
KEY TECHNOLOGIES
• Speech Recognition
• Speech Synthesis
Speech Recognition
• Voice input → VoXML file → Text
Speech Synthesis
• Text → VoXML file → Output (Prerecorded)
VoiceXML
VoiceXML is a dialog markup language designed for
telephony applications, where users are restricted to voice
and DTMF (touch tone) input.
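To make the idea of a dialog markup document concrete, the following is a minimal sketch that generates a small VoiceXML form with Python's standard library. The element names (vxml, form, field, prompt, option) come from VoiceXML 2.0; the function name build_vxml, the field name, and the menu entries are assumptions chosen only for illustration.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_vxml(prompt_text, field_name, options):
    """Return a small VoiceXML document as a string.

    options is a list of (dtmf_key, spoken_label) pairs, so each choice
    can be selected either by voice or by a touch-tone key.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form")
    field = ET.SubElement(form, "field", {"name": field_name})
    prompt = ET.SubElement(field, "prompt")
    prompt.text = prompt_text
    for dtmf_key, label in options:
        option = ET.SubElement(field, "option", {"dtmf": dtmf_key})
        option.text = label
    return ET.tostring(vxml, encoding="unicode")


# Hypothetical menu, used only as an example.
document = build_vxml(
    "Say weather or news, or press 1 or 2.",
    "service",
    [("1", "weather"), ("2", "news")],
)
print(document)
```

A voice browser would fetch a document like this from a web server, speak the prompt, and then wait for speech or DTMF input that matches one of the options.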
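[Figure: a web server delivers text.html and text.vxml over the Internet to the browser. The user provides touch tone and speech input, which is handled by speech recognition using DTMF grammars, speech grammars, stochastic language models and semantic interpretation.]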
Speech grammar
Speech grammars allow authors to specify rules
covering the sequences of words that users are expected to say
in a particular context.
These contextual clues allow the recognition engine to focus on
likely utterances, improving the chances of a correct match.
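The idea of a grammar restricting what the recognizer accepts can be sketched as follows. This is a toy illustration, not the W3C grammar format: the GRAMMAR structure, the matches function, and the example vocabulary are all assumptions. Each slot lists the words allowed at one position, and optional slots may be skipped.

```python
# Hypothetical grammar: [please] (call | dial) (alice | bob)
GRAMMAR = [
    {"words": {"please"}, "optional": True},
    {"words": {"call", "dial"}, "optional": False},
    {"words": {"alice", "bob"}, "optional": False},
]


def matches(utterance, grammar=GRAMMAR):
    """Return True if the utterance is a word sequence the grammar allows."""
    words = utterance.lower().split()

    def walk(word_index, slot_index):
        # All slots consumed: succeed only if all words were consumed too.
        if slot_index == len(grammar):
            return word_index == len(words)
        slot = grammar[slot_index]
        # Try to consume the current word with the current slot.
        if (word_index < len(words)
                and words[word_index] in slot["words"]
                and walk(word_index + 1, slot_index + 1)):
            return True
        # Otherwise skip the slot, which is allowed only if it is optional.
        return slot["optional"] and walk(word_index, slot_index + 1)

    return walk(0, 0)


print(matches("please call Alice"))
print(matches("call charlie"))
```

Because only a handful of word sequences are valid in this context, a recognizer using such a grammar has far fewer candidates to choose between than one listening for arbitrary speech.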
Stochastic (N-Gram) Language Models
Speech grammars are of little use for open-ended prompts,
e.g. "How can I help you?"
The solution is to use stochastic language models. Such
models specify the probability that one word occurs following
certain others. The probabilities are computed from a
collection of utterances gathered from many users.
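As a rough sketch of how such probabilities could be estimated, the code below builds a bigram (N = 2) model by counting word pairs. The three training utterances are invented for the example, and the function names train_bigrams and probability are assumptions; a real system would train on a much larger collection and apply smoothing for unseen word pairs.

```python
from collections import Counter, defaultdict


def train_bigrams(utterances):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for utterance in utterances:
        # <s> and </s> mark the start and end of an utterance.
        words = ["<s>"] + utterance.lower().split() + ["</s>"]
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts


def probability(counts, prev, cur):
    """Estimate P(cur | prev) as a relative frequency (no smoothing)."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


# Invented utterances standing in for data collected from many users.
model = train_bigrams([
    "i want to check my balance",
    "i want to pay a bill",
    "i need to check my balance",
])

print(probability(model, "i", "want"))      # "want" follows "i" in 2 of 3 cases
print(probability(model, "my", "balance"))  # "balance" always follows "my"
```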
DTMF Grammars
• Touch tone input is often used as an alternative to speech
recognition.
• Especially useful in noisy conditions or when the social
context makes it awkward to speak.
• The W3C DTMF grammar format allows authors to specify
the expected sequence of digits, and to bind them to the
appropriate results.
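Binding expected digit sequences to results can be illustrated with a small sketch. This is not the W3C DTMF grammar format itself; the menu entries, the four-digit PIN rule, and the function names are assumptions made for the example.

```python
import re

# Hypothetical menu: each expected key press is bound to a result.
DTMF_MENU = {
    "1": "balance",
    "2": "payments",
    "0": "operator",
}

# Hypothetical rule: exactly four digits terminated by the # key.
PIN_PATTERN = re.compile(r"(\d{4})#")


def interpret_menu_key(keys, menu=DTMF_MENU):
    """Return the result bound to the key sequence, or None if unexpected."""
    return menu.get(keys)


def parse_pin(keys):
    """Return the four PIN digits if the input matches the rule, else None."""
    match = PIN_PATTERN.fullmatch(keys)
    return match.group(1) if match else None


print(interpret_menu_key("2"))
print(parse_pin("1234#"))
```

Input that does not match any expected sequence returns None here, which is where a dialog would typically re-prompt the caller.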
Semantic Interpretation
• The recognition process matches an utterance to a speech
grammar, building a parse tree as a byproduct.
• There are two approaches to harvesting semantic results from
the parse tree:
1. Annotating grammar rules with semantic interpretation
tags
2. Representing the results in XML
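Both approaches can be sketched together in a toy example: grammar words carry semantic tags, the tags of the matched words are collected into a result, and the result is then serialized as XML. The rule table, the function names, and the XML element names (result, action, contact) are assumptions for illustration and do not follow any particular W3C format.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated rules: each recognized word maps to a semantic value.
RULES = {
    "action": {"call": "CALL", "dial": "CALL", "message": "SMS"},
    "contact": {"alice": "alice", "bob": "bob"},
}


def interpret(utterance, rules=RULES):
    """Collect the semantic values attached to the words that were matched."""
    result = {}
    for word in utterance.lower().split():
        for slot, table in rules.items():
            if word in table:
                result[slot] = table[word]
    return result


def to_xml(result):
    """Represent the semantic result as a small XML document."""
    root = ET.Element("result")
    for slot, value in result.items():
        ET.SubElement(root, slot).text = value
    return ET.tostring(root, encoding="unicode")


semantics = interpret("please dial Alice")
print(semantics)
print(to_xml(semantics))
```

Note that "call" and "dial" yield the same semantic value, so the application only has to handle the meaning, not every wording the grammar accepts.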
Call Control
• Fine-grained control of speech (signal processing) resources and
telephony resources in a VoiceXML telephony platform.
• Will enable application developers to use markup to perform call
screening, whisper call waiting, call transfer, and more.
• Can be used to transfer a user from one voice browser to another
on a completely different machine.
Applications
Applications can be divided into the following categories:
◦ Web Browsing
◦ Limited information Access
◦ Spoken Dialog Systems
◦ Cinema and theater booking services
Future
• Voice browsing will become visual (multi-modal).
• Can be integrated into an operating system.
• Can be integrated into every application.
Conclusion
• Browser technology is changing very fast these days and we
are moving from the visual paradigm to the voice paradigm.
• The voice browser is the technology for entering this paradigm.
• A voice browser is a device that interprets voice input and
generates voice output.