voice browser

Presented By

CH. BINDU BHARGAVI, I.V.S SIRISHA

III CSE ISTS ENGINEERING COLLEGE

Voice Browser

• Abstract

• Introduction

• Key technologies

• System Design

• Components

• Applications

• Advantages

• Future Scope

• Conclusion

OVERVIEW

A voice browser is a new way of interacting with modern computer systems. It is an alternative to traditional text-based web browsers.

A voice browser uses speech recognition technology to understand what the user is saying, and it responds aurally by converting the text messages stored in the computer into voice messages.

ABSTRACT

A voice browser is a device that interprets a (voice) markup language and is capable of generating voice output and/or interpreting voice input, and possibly other input/output modalities. The information that the system uses is dynamic and comes from somewhere on the Internet.

From an end user's perspective, the motive is to provide a service similar to what graphical browsers of HTML and related technologies do today, but on devices that are not equipped with full browsers or even the screens to support them.

INTRODUCTION

KEY TECHNOLOGIES

• Speech Recognition

• Speech Synthesis

Speech Recognition

• Voice input → VoXML file → Text

Speech Synthesis

• Text → VoXML file → Output (Prerecorded)

VoiceXML

VoiceXML is a dialog markup language designed for telephony applications, where users are restricted to voice and DTMF (touch tone) input.
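
To make the idea of a dialog markup language concrete, here is a minimal sketch in Python that reads a small VoiceXML document and lists the prompt attached to each input field. The document itself is hypothetical: the form id, the field names, and the prompt wording are invented for illustration and do not come from the slides. Only the element names (vxml, form, field, prompt) and the namespace follow VoiceXML 2.0.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical VoiceXML 2.0 document; ids, names, and prompts are invented.
SAMPLE_VXML = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="booking">
    <field name="city">
      <prompt>Which city are you in?</prompt>
    </field>
    <field name="seats" type="digits">
      <prompt>How many seats? You may speak or use the keypad.</prompt>
    </field>
  </form>
</vxml>
"""


def list_prompts(vxml_text):
    """Return (field name, prompt text) pairs in document order."""
    root = ET.fromstring(vxml_text.strip())
    pairs = []
    for field in root.iter(f"{{{VXML_NS}}}field"):
        prompt = field.find(f"{{{VXML_NS}}}prompt")
        has_text = prompt is not None and prompt.text
        # Collapse internal whitespace so the prompt reads as one line.
        text = " ".join(prompt.text.split()) if has_text else ""
        pairs.append((field.get("name"), text))
    return pairs


if __name__ == "__main__":
    for name, text in list_prompts(SAMPLE_VXML):
        print(f"{name}: {text}")
```

A real voice browser would speak each prompt through speech synthesis and wait for speech or DTMF input; this sketch only shows how the dialog structure can be read from the markup.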

[Figure: system design diagram. Labels: text.html, text.vxml, web server, Internet, browser, user (speech and touch tone), speech recognition, DTMF grammars, speech grammars, stochastic language models, semantic interpretation.]

Speech grammar

Speech grammars allow authors to specify rules covering the sequences of words that users are expected to say in a particular context.

These contextual clues allow the recognition engine to focus on likely utterances, improving the chances of a correct match.
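
As a rough illustration of how a grammar restricts what the recognizer will accept, the sketch below defines a toy grammar in Python and checks whether an utterance is covered by it. The rule names ($order, $count, $item) and the vocabulary are hypothetical, and the dictionary layout is a stand-in for illustration, not the W3C grammar format.

```python
# Toy grammar: each rule maps to a list of alternatives, and each
# alternative is a sequence of words or references to other rules.
# Rule names and vocabulary are invented for illustration.
GRAMMAR = {
    "$order": [["i", "want", "$count", "$item"], ["$count", "$item", "please"]],
    "$count": [["one"], ["two"], ["three"]],
    "$item": [["ticket"], ["tickets"]],
}


def match_from(symbol, words, start, grammar):
    """Return the set of positions where a match of symbol at start can end."""
    if symbol not in grammar:  # plain word: must match the next token exactly
        if start < len(words) and words[start] == symbol:
            return {start + 1}
        return set()
    ends = set()
    for alternative in grammar[symbol]:
        positions = {start}
        for part in alternative:
            positions = {
                end
                for pos in positions
                for end in match_from(part, words, pos, grammar)
            }
            if not positions:
                break
        ends |= positions
    return ends


def accepts(utterance, grammar=GRAMMAR, root="$order"):
    """True if the whole utterance is covered by the root rule."""
    words = utterance.lower().split()
    return len(words) in match_from(root, words, 0, grammar)


if __name__ == "__main__":
    print(accepts("I want two tickets"))
    print(accepts("tickets two want"))
```

Because only a handful of word sequences are valid in this context, a recognizer can discard every hypothesis the grammar rejects, which is the focusing effect described above.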


Stochastic (N-Gram) Language Models

Speech grammars are not useful for open-ended prompts, e.g. "How can I help you?"

The solution is to use stochastic language models. Such models specify the probability that one word occurs following certain others. The probabilities are computed from a collection of utterances gathered from many users.
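
The following sketch shows, under simplifying assumptions, how such probabilities could be estimated for the two-word (bigram) case: count how often each word follows another in a set of utterances and divide by the number of times the first word occurs. The three sample utterances are invented, and no smoothing is applied, so unseen word pairs get probability zero.

```python
from collections import Counter, defaultdict

# Invented sample utterances standing in for data collected from many users.
UTTERANCES = [
    "i want to book a ticket",
    "i want to check my balance",
    "i need to book a flight",
]


def train_bigrams(utterances):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for utterance in utterances:
        # <s> and </s> mark the start and end of an utterance.
        words = ["<s>"] + utterance.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def probability(counts, prev, word):
    """Estimate P(word | prev) as a relative frequency (no smoothing)."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[word] / total if total else 0.0


if __name__ == "__main__":
    model = train_bigrams(UTTERANCES)
    print(probability(model, "i", "want"))
    print(probability(model, "to", "book"))
```

In this sample, "want" follows "i" in two of the three utterances, so the estimate for that pair is 2/3. A production model would use far more data, longer contexts, and smoothing for unseen sequences.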


DTMF Grammars

• Touch tone input is often used as an alternative to speech recognition.

• It is especially useful in noisy conditions or when the social context makes it awkward to speak.

• The W3C DTMF grammar format allows authors to specify the expected sequence of digits, and to bind them to the appropriate results.
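
The idea of binding expected digit sequences to results can be sketched in Python with regular expressions. This is an analogue for illustration only, not the W3C DTMF grammar format itself; the menu layout, the action names, and the booking_id field are hypothetical.

```python
import re

# Each rule pairs an expected key sequence with a function that builds the
# result bound to it. The menu and result names are invented for illustration.
DTMF_RULES = [
    (re.compile(r"1"), lambda m: {"action": "book"}),
    (re.compile(r"2"), lambda m: {"action": "cancel"}),
    # 9, then a four-digit booking number, then the # key.
    (re.compile(r"9(\d{4})#"),
     lambda m: {"action": "lookup", "booking_id": m.group(1)}),
]


def interpret_dtmf(keys):
    """Return the result bound to the key sequence, or None if none matches."""
    for pattern, build_result in DTMF_RULES:
        match = pattern.fullmatch(keys)
        if match:
            return build_result(match)
    return None


if __name__ == "__main__":
    print(interpret_dtmf("1"))
    print(interpret_dtmf("91234#"))
    print(interpret_dtmf("7"))
```

A key sequence that matches no rule returns None, which an application could treat as a cue to repeat the menu.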

Semantic Interpretation

• The recognition process matches an utterance to a speech grammar, building a parse tree as a byproduct.

• There are two approaches to harvesting semantic results from the parse tree:

1. Annotating grammar rules with semantic interpretation tags

2. Representing the results in XML
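
Both approaches can be illustrated together with a small Python sketch: each accepted word carries a semantic tag (the value the application wants), and the collected result is serialized as XML. This is a simplification that matches words one at a time instead of building a real parse tree, and the slot names (size, drink), the tag values, and the result element are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Each rule lists the words it accepts together with a semantic tag, i.e.
# the value to report when that word is heard. All names are invented.
TAGGED_RULES = {
    "size": {"small": "S", "medium": "M", "large": "L"},
    "drink": {"coffee": "coffee", "tea": "tea", "cola": "cola", "coke": "cola"},
}


def interpret(utterance):
    """Collect the semantic tag of every word that a rule accepts."""
    result = {}
    for word in utterance.lower().split():
        for slot, tags in TAGGED_RULES.items():
            if word in tags:
                result[slot] = tags[word]
    return result


def to_xml(result):
    """Represent the semantic result as a small XML document."""
    root = ET.Element("result")
    for slot, value in result.items():
        ET.SubElement(root, slot).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    meaning = interpret("I would like a large coke")
    print(meaning)
    print(to_xml(meaning))
```

Note how "coke" and "cola" map to the same tag: the application receives a normalized value and does not need to know which word the user actually said.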


Call Control

• Provides fine-grained control of speech (signal processing) resources and telephony resources in a VoiceXML telephony platform.

• Will enable application developers to use markup to perform call screening, whisper call waiting, call transfer, and more.

• Can be used to transfer a user from one voice browser to another on a completely different machine.

Applications

Applications can be divided into the following categories:

◦ Web browsing

◦ Limited information access

◦ Spoken dialog systems

◦ Cinema and theater booking services


Future

• Voice browsing will become visual (multimodal).

• It can be integrated into an OS.

• It can be integrated into every application.


Conclusion

• Browser technology is changing very fast these days, and we are moving from the visual paradigm to the voice paradigm.

• The voice browser is the technology for entering this paradigm.

• A voice browser is a device that interprets voice input and generates voice output.

