
Post on 25-Mar-2020

TIC en Redes Móviles (ICT in Mobile Networks)

Voice Portals

1. Introduction

2. VoiceXML

What do we mean by a Voice Portal?

Why is the phone important?

• Phones are everywhere
• Far more phones than PCs
• More wireless phones than PCs
• Wireless phones are highly portable
• They support interesting location-based services
• You can't use a PC while driving

What is VoiceXML?

A language for specifying voice dialogs.
• Voice dialogs use audio prompts and text-to-speech (TTS) for output
• Touch-tone keys (DTMF) and automatic speech recognition (ASR) for input

Who is developing VoiceXML?

• W3C Voice Browser working group
• VoiceXML Forum http://www.voicexml.org

Why is VoiceXML important?

• A standard language enables portability.
• High-level domain-specific languages simplify application development.
• It gives access to the existing web infrastructure by phone.
• It allows a clean separation of service logic from user interaction.
• It can consolidate voice and web applications, opening the telephony platform to the web.

What does a VoiceXML document look like?

<?xml version="1.0"?>
<vxml version="1.0">
  <form id="hello">
    <block>
      <audio src="http://www.gtc.cps.unizar.es/~eduardo/wav/alerta.wav"/>
      <audio src="file://C:\Windows\Media\ringin.wav"/>
      Hello World!
    </block>
  </form>
</vxml>
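To see what an interpreter would find in the "Hello World!" document, the sketch below parses it with Python's standard xml.etree.ElementTree module and lists the audio sources and the text to be spoken. It is only an illustration of the document structure, not a VoiceXML interpreter; the names HELLO_VXML and describe_block are made up for this example.

```python
import xml.etree.ElementTree as ET

# The "Hello World!" document from the slide (quotes normalised).
HELLO_VXML = """<?xml version="1.0"?>
<vxml version="1.0">
  <form id="hello">
    <block>
      <audio src="http://www.gtc.cps.unizar.es/~eduardo/wav/alerta.wav"/>
      <audio src="file://C:\\Windows\\Media\\ringin.wav"/>
      Hello World!
    </block>
  </form>
</vxml>"""


def describe_block(vxml_text):
    """Return (audio sources, spoken text) for the first <block> of the first form."""
    root = ET.fromstring(vxml_text)
    block = root.find("./form/block")
    sources = [audio.get("src") for audio in block.findall("audio")]
    # Text that follows an <audio/> child is stored in that child's .tail.
    pieces = [block.text or ""] + [child.tail or "" for child in block]
    spoken = " ".join(" ".join(pieces).split())
    return sources, spoken


if __name__ == "__main__":
    audio_sources, text = describe_block(HELLO_VXML)
    print(audio_sources)
    print(text)
```

The block holds two audio files to play followed by plain text, which a platform would hand to its TTS engine.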

What about SALT?

Speech Application Language Tags
First release: v1.0, July 2002
Copyright from Cisco Systems, Comverse, Intel, Microsoft, Philips and SpeechWorks
SALT Forum www.saltforum.org

Speech Application Language Tags (SALT) 1.0 is an extension of HTML and other markup languages (cHTML, XHTML, WML, etc.) which adds a speech and telephony interface to web applications and services, for both voice-only (e.g. telephone) and multimodal browsers.

Design Principles

1. Clean integration of speech with web pages
2. Speech interface as a separate layer
3. Powerful programming model (DOM execution model)
4. Range of devices

The elements of SALT

There are three main top-level elements in SALT:

<listen …> configures the speech recognizer, executes recognitions and handles speech input events

<prompt …> configures the speech synthesizer and plays out prompts

<dtmf …> configures and controls DTMF in telephony applications

SALT also features ways to configure and manipulate telephony call control through both script and markup.

An Example

<!-- HTML -->
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
<body onload="RunAsk()">
  <form id="travelForm">
    <input name="txtBoxOriginCity" type="text" />
    <input name="txtBoxDestCity" type="text" />
  </form>

  <!-- Speech Application Language Tags -->
  <salt:prompt id="askOriginCity"> Where would you like to leave from? </salt:prompt>
  <salt:prompt id="askDestCity"> Where would you like to go to? </salt:prompt>
  <salt:prompt id="sayDidntUnderstand" onComplete="RunAsk()"> Sorry, I didn't understand. </salt:prompt>
  <salt:listen id="recoOriginCity" onReco="procOriginCity()" onNoReco="sayDidntUnderstand.Start()">
    <salt:grammar src="city.xml" />
  </salt:listen>
  <salt:listen id="recoDestCity" onReco="procDestCity()" onNoReco="sayDidntUnderstand.Start()">
    <salt:grammar src="city.xml" />
  </salt:listen>

  <!-- script -->
  <script>
    // Ask for whichever city is still missing.
    function RunAsk() {
      if (travelForm.txtBoxOriginCity.value == "") {
        askOriginCity.Start();
        recoOriginCity.Start();
      } else if (travelForm.txtBoxDestCity.value == "") {
        askDestCity.Start();
        recoDestCity.Start();
      }
    }
    // Store the recognized origin city, then ask for the next value.
    function procOriginCity() {
      travelForm.txtBoxOriginCity.value = recoOriginCity.text;
      RunAsk();
    }
    // Store the recognized destination city and submit the form.
    function procDestCity() {
      travelForm.txtBoxDestCity.value = recoDestCity.text;
      travelForm.submit();
    }
  </script>
</body>
</html>

What are the technologies behind Voice Portals?

Voice eXtensible Markup Language

VoiceXML

Introduction

• VoiceXML is a derivative of the eXtensible Markup Language (XML).
• XML is a universal format for structuring information.
• Designed for creating audio dialogs that feature:
  – synthesized speech
  – digitized audio
  – recognition of spoken and DTMF key input
  – recording of spoken input
  – telephony
  – mixed-initiative conversations

Architectural Model

[Figure: Document Server, VoiceXML Interpreter Context, VoiceXML Interpreter and Implementation Platform, with Request and Document arrows between the document server and the interpreter context]

A document server processes requests from the VoiceXML interpreter through the VoiceXML interpreter context. The server produces VoiceXML documents which are processed by the VoiceXML interpreter.

The implementation platform is controlled by the VoiceXML interpreter context and by the VoiceXML interpreter.

Implementation Platform Requirements

• Document acquisition. The interpreter context is expected to acquire documents for the VoiceXML interpreter to act on. In some cases the document request is generated by the interpretation of a VoiceXML document, while other requests are generated by the interpreter context in response to events (such as an incoming phone call).
• Audio output. The platform can provide audio output using audio files or using text-to-speech (TTS).
• Audio input. The platform must be able to:
  – Report characters entered by a user.
  – Receive speech recognition data.
  – Record audio from the user.

Concepts

A VoiceXML document can be seen as a finite state machine. The user is always in one state (dialog) at a time.

[Figure: Dialog 1, Dialog 2, Dialog 3 and Dialog 4 linked by transitions, ending in Exit]

Each dialog determines the next dialog to transition to. Execution is terminated when a dialog does not specify a successor or has an element that exits the conversation.
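The state-machine view can be sketched in a few lines of Python. This is a minimal illustration, assuming a made-up transition table (DIALOGS) in which each dialog names its successor and None means "no successor"; a real document decides the successor from user input.

```python
# Hypothetical dialog graph: each dialog names its successor,
# or None when it does not specify one.
DIALOGS = {
    "dialog1": "dialog2",
    "dialog2": "dialog3",
    "dialog3": "dialog4",
    "dialog4": None,  # no successor: execution terminates here
}


def run_dialogs(dialogs, start, max_steps=100):
    """Follow transitions from `start` and return the dialogs visited in order."""
    visited = []
    current = start
    # max_steps guards against a table that loops forever.
    while current is not None and len(visited) < max_steps:
        visited.append(current)
        current = dialogs.get(current)
    return visited


if __name__ == "__main__":
    print(run_dialogs(DIALOGS, "dialog1"))
```

At every step exactly one dialog is active, which is the property the slide describes.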

Dialogs

• There are two kinds of dialogs:
  – Forms: define an interaction that collects values from the user.
  – Menus: give the user a choice of options. Transitions to another dialog are based on that choice.

Form

Every form contains one or more form items, which are elements within a form that describe some kind of user interaction related to filling in the form.

Forms are interpreted by the form interpretation algorithm (FIA). The FIA has a main loop that selects a form item and then visits it.
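The select-then-visit loop of the FIA can be approximated as follows. This is a heavily simplified sketch, not the algorithm as specified: it ignores guard conditions, prompt counters and events, and the function name run_form and its simulated `answers` argument are assumptions made for the example.

```python
def run_form(items, answers):
    """Visit form items in document order until none remains unfilled.

    `items` is a list of (name, kind) pairs, where kind is "block" or "field".
    `answers` maps field names to simulated caller input.
    Returns the order of visits and the final item variables.
    """
    variables = {name: None for name, _ in items}
    visits = []
    while True:
        # Select phase: the first item whose variable is still undefined.
        pending = [(n, k) for n, k in items if variables[n] is None]
        if not pending:
            break
        name, kind = pending[0]
        visits.append(name)
        # Visit phase, reduced to its bare minimum.
        if kind == "block":
            variables[name] = True  # a block is executed once
        else:
            variables[name] = answers.get(name)
            if variables[name] is None:
                break  # no simulated input left: stop the sketch
    return visits, variables


if __name__ == "__main__":
    form = [("intro", "block"), ("state", "field"), ("city", "field")]
    print(run_form(form, {"state": "Texas", "city": "Austin"}))
```

Each pass through the loop selects one item, so a form with three items is completed in three visits when every field receives an answer.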

Form items

• Blocks: contain executable code. They are often used for presenting information to the user. If text is placed within a block, it is treated as a command to play audio to the user.

<form>
  <block>Hello, world!</block>
</form>

This document indicates that the VoiceXML interpreter should play the audio "Hello, world!" to the user.

Form items

In the example the audio is played using the text-to-speech (TTS) engine, so the audio synthesizer must be activated.

It is possible to improve the speech signal by playing an audio file (in wav format) instead of using the synthesizer:

<audio src="ui/welcome.wav"/>

This audio element plays the wav file located at the relative URL "ui/welcome.wav".

Form items

• Fields: are used to get input from the user. A field item has several important components: name, prompt, grammar, event and <filled> element.

– Name: identifies the variable that is associated with the user input you collect.

– Prompt: within a prompt element you can control the audio output and wait for the user input.

Form items

For instance:

<field name="state">
  <prompt>What state?</prompt>
  ........
</field>

The interpreter has to activate the synthesizer with the input "What state?" but also has to activate the recognizer to get the user input. The result will be placed in the variable "state".

Form items

– Grammar: defines the set of valid expressions that a user can say when interacting with a voice application. Each interactive dialog in an application references one or more grammars using one or more grammar elements.

Two types of grammars can be defined using grammar elements: external and inline grammars.

Form items

External Grammar

<grammar src="state.gram" type="application/x-jsgf"/>

Inline Grammar

<grammar type="application/x-jsgf">
  ...
</grammar>

ABNF – Augmented Backus-Naur Form

Provides a way of describing regular grammars.

This type of grammar must be properly defined in order to cover all the speaking options.

Features of ABNF:

Notation     Meaning
word         Terminal
$rule        Rule name (non-terminal)
[x]          Optional words
(...)        Grouping
x {tag}      Comments to x
x | y | z    x or y or z
x <m->       m or more occurrences of x
x <n>        n occurrences of x
x <m-n>      Between m and n occurrences of x
/W/ x        Multiplying factor in the likelihood domain

ABNF

- Example 1

#ABNF 1.0 ISO8859-1;
root $welcome = hello | (good evening) | hi ;

[Figure: word graph with the alternatives hello, hi and good evening]

- Example 2

$action = please (open | close | delete) ;

[Figure: word graph with please followed by open, close or delete]

- Example 3

$action = [the | a] (window | file | menu) ;

[Figure: word graph with optional the or a followed by window, file or menu]

ABNF

- Example 4

public $basicCmd = $action $object ;
$action = open | close | delete | move ;
$object = [the | a] (window | file | menu);

Possibilities:

- Open the window

- Close a file

- Open menu
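To check which sentences the Example 4 grammar covers, its three parts can be expanded by hand in Python. This is a sketch that hard-codes the rule contents rather than parsing ABNF; the empty string stands for the omitted optional article, and the names ACTIONS, ARTICLES, OBJECTS and basic_commands are made up for the example.

```python
from itertools import product

ACTIONS = ["open", "close", "delete", "move"]   # $action
ARTICLES = ["", "the", "a"]                      # [the | a]; "" means omitted
OBJECTS = ["window", "file", "menu"]             # (window | file | menu)


def basic_commands():
    """Return every phrase accepted by $basicCmd = $action $object."""
    phrases = []
    for action, article, obj in product(ACTIONS, ARTICLES, OBJECTS):
        words = [w for w in (action, article, obj) if w]
        phrases.append(" ".join(words))
    return phrases


if __name__ == "__main__":
    commands = basic_commands()
    print(len(commands))
    print(commands[:5])
```

The grammar accepts 4 x 3 x 3 = 36 phrases, including the three possibilities listed on the slide.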

Form items

– Event: events are thrown by the platform when the user does not respond, doesn't respond intelligibly, requests help, etc., and are handled by a catch element.

<catch event="help">
  Please speak the city for which you want the weather.
</catch>

If the user says "help" the synthesizer will be activated with the sentence "Please speak the city for which you want the weather".

Form items

The default catch events and their behaviors are shown in this table:

Event Type              Audio Provided   Action
cancel                  no               don't reprompt
error                   yes              exit interpreter
exit                    no               exit interpreter
help                    yes              reprompt
noinput                 no               reprompt
nomatch                 yes              reprompt
telephone.disconnect    no               exit interpreter
all others              yes              exit interpreter
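The table can be read as a lookup from event name to default behaviour. The sketch below encodes it as a Python dictionary; treating a dotted event such as "error.badfetch" as matching its prefix "error" is an assumption of this example, as are the names DEFAULT_CATCH, ALL_OTHERS and default_handler.

```python
# (audio provided, action) for each default catch handler in the table.
DEFAULT_CATCH = {
    "cancel": (False, "don't reprompt"),
    "error": (True, "exit interpreter"),
    "exit": (False, "exit interpreter"),
    "help": (True, "reprompt"),
    "noinput": (False, "reprompt"),
    "nomatch": (True, "reprompt"),
    "telephone.disconnect": (False, "exit interpreter"),
}
ALL_OTHERS = (True, "exit interpreter")


def default_handler(event):
    """Return (audio provided, action) for an event name."""
    name = event
    while name:
        if name in DEFAULT_CATCH:
            return DEFAULT_CATCH[name]
        # Assumption: fall back to the prefix before the last dot.
        name = name.rpartition(".")[0]
    return ALL_OTHERS


if __name__ == "__main__":
    for event in ("help", "noinput", "telephone.disconnect", "error.badfetch"):
        print(event, default_handler(event))
```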

Form items

– <filled> element: specifies an action to perform after the field is filled in by user input.

<field name="city">
  <grammar src="http://www.ship-it.example/cities.gram"/>
  <prompt>What is the city?</prompt>
  <filled>
    <if cond="city == 'New Orleans'">
      <prompt> New Orleans service is being repaired.</prompt>
    </if>
  </filled>
</field>

Form items

• Goto: used to go to another dialog in the same or a different document. The "next" attribute sets the URI to transition to.

<filled>
  <if cond="city == 'New Orleans'">
    <goto next="http://city.example/cities" />
  </if>
</filled>

Form items

• Submit: similar to goto, as it also goes to a new dialog, but unlike goto it lets you send the values of a list of variables to the document server with the "namelist" attribute.

<block>
  <submit next="/servlet/weather" namelist="city state"/>
</block>
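What a submit with a namelist amounts to can be sketched as building a request URL from the listed variables. The example assumes the values travel as an HTTP GET query string and uses Python's standard urllib.parse.urlencode; the function name build_submit_url and the sample variable values are hypothetical.

```python
from urllib.parse import urlencode


def build_submit_url(next_uri, namelist, variables):
    """Append the variables named in `namelist` to `next_uri` as a query string."""
    names = namelist.split()  # namelist is a space-separated list of names
    query = urlencode([(name, variables[name]) for name in names])
    return f"{next_uri}?{query}"


if __name__ == "__main__":
    dialog_variables = {"city": "New Orleans", "state": "Louisiana", "pin": "1357"}
    print(build_submit_url("/servlet/weather", "city state", dialog_variables))
```

Only the variables named in the namelist are sent; here "pin" stays in the dialog.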

Form items

• Script: used to embed scripting-language (ECMAScript) code.

<form>
  <var name="hours"/>
  <var name="minutes"/>
  <var name="seconds"/>
  <block>
    <script>
      var d = new Date();
      hours = d.getHours();
      minutes = d.getMinutes();
      seconds = d.getSeconds();
    </script>
  </block>
  <field name="hear_another" type="boolean">
    <prompt>
      The time is <value expr="hours"/> hours,
      <value expr="minutes"/> minutes, and
      <value expr="seconds"/> seconds.
    </prompt>
    <prompt> Do you want to hear another time? </prompt>
  </field>
</form>

Menu

Presents a list of choices to the user and transitions to the chosen information.

Menu items:
• Prompt: used to present the list of choices to the user.
• Choice: specifies a speech or DTMF grammar fragment and an event or a location to transition to when a recognition match occurs. The choice element works in conjunction with the prompt element, which prompts the user for specific input in the form of a list of choices.

Menu items

<menu>
  <prompt>Welcome. Say one of: sports, weather, news.</prompt>
  <choice next="http://www.sports.example/start.vxml"> sports </choice>
  <choice next="http://www.weather.example/intro.vxml"> weather </choice>
  <choice next="http://www.news.example/start.vxml"> news </choice>
</menu>
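The behaviour of this menu can be mimicked with a simple lookup from the recognized word to the URI of the next document. This is an illustrative sketch only: real recognition works on audio against a grammar, and the names CHOICES and select_choice are made up here.

```python
# Choice text -> URI to transition to, taken from the menu on the slide.
CHOICES = {
    "sports": "http://www.sports.example/start.vxml",
    "weather": "http://www.weather.example/intro.vxml",
    "news": "http://www.news.example/start.vxml",
}


def select_choice(utterance):
    """Return the URI for a recognized utterance, or None for a nomatch."""
    return CHOICES.get(utterance.strip().lower())


if __name__ == "__main__":
    print(select_choice("Weather"))
    print(select_choice("stocks"))
```

A result of None corresponds to the nomatch event, whose default handler reprompts the caller.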

Menu items

• Property: used to set values that affect platform behavior, such as the recognition process, timeouts, caching policy, etc. If DTMF input is required in a menu, a property element must be included:

<menu>
  <property name="inputmodes" value="dtmf"/>
  <prompt>
    For sports press 1, for weather press 2, for news press 3.
  </prompt>
  <choice dtmf="1" next="http://sports.example/start.vxml"/>
  <choice dtmf="2" next="http://weather.example/intro.vxml"/>
  <choice dtmf="3" next="http://news.example/start.vxml"/>
</menu>

Menu items

• Enumerate: is a description of the choices available to the user. If it has no content, it lists all the choices in the menu:

<menu>
  <prompt>Welcome. Say one of: <enumerate/> </prompt>
  <choice next="http://www.sports.example/start.vxml"> sports </choice>
  <choice next="http://www.weather.example/intro.vxml"> weather </choice>
  <choice next="http://www.news.example/start.vxml"> news </choice>
</menu>

Menu items

If enumerate has content, it is used as a template with two special variables: _prompt (the choice's prompt) and _dtmf (the choice's assigned DTMF sequence). Setting dtmf="true" on the menu assigns the implicit DTMF sequences 1, 2, 3 to choices that have none:

<menu dtmf="true">
  <prompt>
    Welcome.
    <enumerate>
      For <value expr="_prompt"/>, press <value expr="_dtmf"/>.
    </enumerate>
  </prompt>
  <choice next="http://www.sports.example/start.vxml"> sports </choice>
  <choice next="http://www.weather.example/intro.vxml"> weather </choice>
  <choice next="http://www.news.example/start.vxml"> news </choice>
</menu>
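The effect of an enumerate template can be illustrated by expanding it once per choice. The sketch assumes that the choices receive the DTMF sequences 1, 2, 3 in document order; the function name enumerate_prompt and the {prompt}/{dtmf} placeholders are inventions of this example standing in for _prompt and _dtmf.

```python
def enumerate_prompt(choices, template="For {prompt}, press {dtmf}."):
    """Expand the template once per choice and join the results."""
    parts = [
        template.format(prompt=choice, dtmf=number)
        for number, choice in enumerate(choices, start=1)
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print("Welcome. " + enumerate_prompt(["sports", "weather", "news"]))
```

For the menu above this produces "For sports, press 1. For weather, press 2. For news, press 3."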

VoiceXML Example

Student Information Voice System

[Figure: call flow. Login (identification number, confirmation, PIN) leads to Main_menu, which offers Add, Drop, Enrolled, Description and Exit]

studinfo.vxml

<?xml version="1.0"?>
<vxml version="1.0">
  <form id="welcome">
    <block>
      <audio src="http://www.gtc.cps.unizar.es/~eduardo/wav/alerta.wav"/>
      <audio src="file://C:\Windows\Media\ringin.wav"/>
      Thank you for calling the Student Information Voice System
      <goto next="login.vxml" />
    </block>
  </form>
</vxml>

Login.vxml

<?xml version="1.0" ?>
<vxml version="1.0">
  <property name="bargein" value="false" />
  <var name="login_number"/>

  <!-- Ask for the identification number (1 or 2) -->
  <form id="idn_prompt">
    <!-- idn_number field: the variable that gets filled with the user's input -->
    <field name="idn_number">
      <!-- Defines allowed user input with DTMF -->
      <dtmf> 1 <or/> 2 </dtmf>
      <!-- Prompt that requests and listens for user input -->
      <prompt>
        Please enter your Identification Number using the Touch Tone keys on your phone
      </prompt>
      <!-- If the number that the user entered does not match what is allowed in the grammar above, this event is thrown -->
      <nomatch>
        I'm sorry, that was not a valid entry
        <reprompt />
      </nomatch>
      <!-- If the user does not provide input, this event is thrown -->
      <noinput>
        I'm sorry, I did not hear you
        <reprompt />
      </noinput>

      <!-- When a match has been found -->
      <filled>
        <!-- When idn_number equals '1' or '2', go to the confirm form -->
        <if cond="idn_number == '1' || idn_number == '2'">
          <assign name="login_number" expr="idn_number"/>
          <goto next="#confirm" />
        </if>
      </filled>
    </field>
  </form>

  <!-- confirm form -->
  <form id="confirm">
    <!-- Reads back to the user what his/her input was -->
    <block>
      You have entered the following ID number, <value expr="login_number"/>
    </block>
    <!-- confirm_idn field. Asks user to confirm the identification number entered with a YES or NO -->
    <field name="confirm_idn">
      <!-- Allowed input. Caller may speak YES/NO or enter 1/2 on the keypad -->
      <grammar type="application/x-gsl">[yes no]</grammar>
      <dtmf> 1 <or/> 2 </dtmf>
      <prompt count="1">
        Please, if this is correct, say yes or press 1. If this is not correct, say no or press 2.
      </prompt>

      <prompt count="3">
        Please, use the touch tone keys on your phone. If the ID number is correct, press 1. If this is not correct, press 2.
      </prompt>
      <noinput>
        I'm sorry, I didn't hear you
        <goto next="#confirm" />
      </noinput>
      <nomatch>
        That is not a valid answer
        <goto next="#confirm" />
      </nomatch>
      <filled>
        <!-- If caller said "yes" or pressed '1', go to the enter_pin form -->
        <if cond="confirm_idn == 'yes' || confirm_idn == '1'">
          <goto next="#enter_pin" />
        </if>
        <!-- If caller said "no" or pressed '2', return to idn_prompt -->
        <if cond="confirm_idn == 'no' || confirm_idn == '2'">
          <goto next="#idn_prompt" />
        </if>
      </filled>
    </field>
  </form>

  <form id="enter_pin">
    <field name="pin_number">
      <!-- Allowed user input. Caller may speak or enter '1357' or '1234' - NOTHING ELSE! -->
      <grammar type="application/x-gsl">[( 1 3 5 7 ) (1 2 3 4)]</grammar>
      <dtmf> 1 3 5 7 <or/> 1 2 3 4 </dtmf>
      <prompt>
        Please enter your pin number
      </prompt>
      <nomatch count="1">
        I'm sorry, that entry was not valid
        <reprompt />
      </nomatch>
      <nomatch count="2">
        <prompt> I'm sorry, that entry was not valid. This is your <emp>last</emp> chance. </prompt>
        <reprompt />
      </nomatch>
      <nomatch count="3">
        Entrance denied.
        <exit />
      </nomatch>

      <noinput count="1">
        <reprompt />
      </noinput>
      <noinput count="2">
        I could not hear you. Goodbye.
        <exit/>
      </noinput>
      <filled>
        <!-- If caller speaks/enters '1357' or '1234' as their PIN number, go to main_menu.vxml -->
        <if cond="pin_number == '1357' || pin_number == '1234'">
          <goto next="main_menu.vxml" />
        </if>
      </filled>
    </field>
  </form>
</vxml>
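The count attributes in Login.vxml make the responses escalate with each failed attempt. One way to picture the selection is sketched below, under the assumption that the handler chosen is the one with the highest count not exceeding the number of times the event has occurred; the names NOMATCH_HANDLERS and pick_by_count are hypothetical.

```python
# nomatch handlers of the pin_number field, keyed by their count attribute.
NOMATCH_HANDLERS = {
    1: "I'm sorry, that entry was not valid",
    2: "I'm sorry, that entry was not valid. This is your last chance.",
    3: "Entrance denied.",
}


def pick_by_count(handlers, occurrence):
    """Return the handler with the highest count <= occurrence, or None."""
    eligible = [count for count in handlers if count <= occurrence]
    if not eligible:
        return None
    return handlers[max(eligible)]


if __name__ == "__main__":
    for attempt in (1, 2, 3):
        print(attempt, pick_by_count(NOMATCH_HANDLERS, attempt))
```

Under the same assumption, the confirm field's prompts with count 1 and count 3 mean the first prompt is used on the first two attempts and the touch-tone-only prompt from the third attempt on.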