
The HOPS Project

Enabling an Intelligent Natural Language Based Hub for the Deployment of Advanced Semantically Enriched Multi-channel Mass-scale Online Public Services

http://www.hops-fp6.org/


Project description

Duration: three years (2004-2006)
Funding: Sixth Framework Programme of the European Commission
Objective: make it easier for European citizens to reach their closest public administrations
- Development of an intelligent platform that lets users access information and services
- By telephone and over the internet

Technologies

• Voice portals (VoiceXML)
• Natural Language Processing
• Semantic Web

Languages

• Spanish, Catalan, Italian and English

Participants

Barcelona City Council

Intelligent software company

Hardware and software company

Applications company

Participants (II)

Turin City Council

Public consortium of the Piedmont region

Piedmont research centre

R&D division of Telecom Italia

University of Turin, Natural Language group of the AI department

Participants (III)

London Borough of Camden

Runtime Collective (company)

University of Amsterdam, Faculty of Science, department of computer science applied to the social sciences

Motivation

Inhabitants: 900,000
Web access: 125,000 per month
Telephone calls: 1.5 million per year

Inhabitants: 200,000
More than 120 different languages
Web access: uses a service taxonomy that is standard in the United Kingdom

Inhabitants: 1.5 million (4.3 million in the metropolitan area)
Web access: 40,000 per day
- 63 administrative services online
Telephone calls: more than 4.3 million per year
- Average waiting time of 3 seconds

System architecture

The components are integrated through FADA

The FADA network is not centralized

The voice module

• Text-to-speech synthesizer (TTS)
  – Speech Synthesis Markup Language
• Speech recognizer (ASR)
  – Speech Recognition Grammar Specification
• VoiceXML interpreter
  – Standard VoiceXML compiler
• Voice server controller
  – Controls the interaction between the TTS, the ASR and the VoiceXML interpreter

The voice module

• Speech recognition grammars
  – Developed in W3C SRGS-XML format
  – Use the "garbage" mechanism
  – Use "grammar objects" (built-in grammars)

The voice module

  – Grammar concepts are mapped to the parameters used by the dialogue controller
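As a rough illustration of an SRGS-XML grammar that relies on the garbage mechanism, the sketch below embeds a small grammar in Python and checks that it is well-formed. The rule names, the vocabulary and the `eventType` slot are hypothetical and are not taken from the HOPS grammars; `<ruleref special="GARBAGE"/>` is the special rule that the W3C SRGS specification defines for skipping speech the grammar does not model.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

# Hypothetical grammar: filler speech around the event type is absorbed by GARBAGE.
GRAMMAR = """<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-GB" root="request" tag-format="semantics/1.0">
  <rule id="request" scope="public">
    <ruleref special="GARBAGE"/>
    <ruleref uri="#eventType"/>
    <ruleref special="GARBAGE"/>
  </rule>
  <rule id="eventType">
    <one-of>
      <item>concert <tag>out.eventType = "concert";</tag></item>
      <item>exhibition <tag>out.eventType = "exhibition";</tag></item>
    </one-of>
  </rule>
</grammar>
"""


def garbage_refs(grammar_xml: str) -> int:
    """Count the GARBAGE rule references in an SRGS-XML grammar."""
    root = ET.fromstring(grammar_xml.encode("utf-8"))
    refs = root.iter(f"{{{SRGS_NS}}}ruleref")
    return sum(1 for ref in refs if ref.get("special") == "GARBAGE")
```

The `<tag>` elements show one way a recognized phrase could be mapped to a parameter name that a dialogue controller understands.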

Voice controllers

• Voice Server
  – Modular platform for voice browsing
• Audio Web Server
  – Interface that connects Voice Server with Application Manager

The text module of prototype 1

A left-corner parser that processes the text typed by the user
• Uses top-down filtering
• Performs syntactic and semantic analysis in parallel
• Uses knowledge of the dialogue context
• Implemented in Prolog

The text module of prototype 1 (II)

• Uses domain-specific knowledge
  – Represented in the grammar and the lexicon
• Uses knowledge of the context
  – The parameter whose value the user has been asked for
• Incorporates a "garbage" mechanism

The text module of prototype 1 (III)

• Grammars and lexicons
  - Context-free grammars that incorporate semantic information
  - General grammar rules and lexical entries shared by all applications
  - Grammar rules and lexical entries specific to each application

The text module of prototype 1 (IV)

• Semantic interpretation is based on lambda calculus
• The result of semantic interpretation is a set of attributes and their values
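To make the idea of lambda-based interpretation ending in attribute-value pairs more concrete, here is a loose Python sketch (the module itself is in Prolog). It composes word meanings one by one rather than following a grammar, and the lexical entries and attribute names are hypothetical.

```python
from typing import Callable, Dict

Frame = Dict[str, str]

# Hypothetical lexicon: each word denotes a function over a partial frame,
# i.e. a lambda term that adds one attribute-value pair to its argument.
LEXICON: Dict[str, Callable[[Frame], Frame]] = {
    "classical": lambda frame: {**frame, "genre": "classical"},
    "concerts": lambda frame: {**frame, "type": "concert"},
    "tomorrow": lambda frame: {**frame, "date": "tomorrow"},
}


def interpret(utterance: str) -> Frame:
    """Compose the word meanings left to right; unknown words are skipped."""
    frame: Frame = {}
    for word in utterance.lower().split():
        meaning = LEXICON.get(word)
        if meaning is not None:
            frame = meaning(frame)  # function application
    return frame
```

Skipping words that have no lexical entry plays a role similar to the "garbage" mechanism: input outside the modelled vocabulary does not block the interpretation.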

Integration of voice and text in prototype 1

• The system messages are the same
• The grammars and lexicons are the same
  – Represented in different formalisms
  – Adapted to the particularities of each mode (e.g., spelling errors)

Integration of voice and text in prototype 1 (II)

• The linguistic resources are derived from the application knowledge
  – Domain ontologies can be used
• The dialogue flow is derived from the application knowledge

Integration of voice and text in prototype 1 (III)

General linguistic resources are reused
  – Expressions for parameters that appear in several applications
    • Dates
    • Addresses
    • Telephone numbers
    • Prices
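One way to picture such shared resources is a small table of recognizers that any application can import. The sketch below uses regular expressions purely for illustration; the patterns, the parameter names and the formats they accept are assumptions, not the HOPS grammars.

```python
import re
from typing import Dict, Optional

# Hypothetical shared patterns; real resources would cover far more variation.
GENERAL_PATTERNS: Dict[str, re.Pattern] = {
    "date": re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b"),
    "telephone": re.compile(r"\b\d{9}\b"),
    "price": re.compile(r"\b(\d+(?:\.\d{2})?) euros?\b"),
}


def extract(parameter: str, text: str) -> Optional[str]:
    """Return the first match for a general parameter, or None."""
    match = GENERAL_PATTERNS[parameter].search(text)
    return match.group(0) if match else None
```

For example, `extract("telephone", "my number is 932222222")` returns the nine-digit string, and the same call can serve every application that asks for a telephone number.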

The text module of prototypes 2 and 3

• A statistical syntactic parser
• Easily adaptable to similar languages
• Originally built for Italian
• Adapted to Spanish, Catalan and English
• Needs little adaptation for new domains
• A semantic processor
• Produces a set of attribute-value pairs

The semantic web module

• Domain ontology adapted to the application
• Existing ontologies that contain the domain terms can be used to obtain the vocabulary

The semantic web module

Ontologies
  – Created with Protégé
  – Represented in OWL
  – Accessible via KPOntology + Jena
  – Queried with RDQL

Information controllers

• Ontology Manager
  – Encapsulates access to the ontologies
• Back-end Manager
  – Accesses the back-ends of each city's services

Dialogue controller

• The dialogue controller manages the interaction in both text and voice
• It receives the semantic interpretation as input
  – An attribute whose value the user has requested (output parameters)
  – A set of attributes and their values (input parameters)

Dialogue controller of prototypes 1 and 2

• Based on a finite-state model
• The dialogues are simple and system-driven
  – The user is asked for the information needed to access the application

Dialogue controller of prototype 3

• Based on information state theory
• The interaction is more natural
  – The initiative can come from the user or from the system
  – The user is asked for the information needed to access the application
    • The user may change topic or give the information in a different order
    • Confirmation and error-recovery subdialogues
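A heavily reduced sketch of this idea follows: the state records what is already known, accepts attributes in whatever order the user volunteers them, and asks only for what is still missing. The class, its fields and the move names are hypothetical and far simpler than a full information-state model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InformationState:
    """Hypothetical, heavily reduced information state."""
    needed: List[str]                                  # attributes the task requires
    known: Dict[str, str] = field(default_factory=dict)
    last_asked: Optional[str] = None

    def integrate(self, interpretation: Dict[str, str]) -> None:
        """Accept any attributes the user supplied, in any order."""
        self.known.update(interpretation)

    def next_move(self) -> str:
        """Ask for the first missing attribute, or execute when none is missing."""
        for attribute in self.needed:
            if attribute not in self.known:
                self.last_asked = attribute
                return f"ask {attribute}"
        self.last_asked = None
        return "execute"
```

Because `integrate` does not care which question was pending, a user who answers with two attributes at once simply shortens the dialogue.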

Main goals in the design

• Facilitate the incorporation of new services, languages and channels
  – Separation of the channel-specific, language-specific and service-specific aspects from the general components
• Flexible communication
  – Incorporates
    • A text processing component
    • A generic dialogue manager

The Dialogue Manager

• Main functionalities
  – Understanding what the user intends
  – Determining what the next actions should be
  – Generating the next system sentences
• Supports different types of interaction
  – System-driven dialogues
    • The system asks the user for the information needed
  – User-driven dialogues
    • The user takes the initiative

Different types of services

• Transactional services
  – The system asks for the input parameters and answers with the output parameters
  – System-driven dialogues are efficient
• Informational services
  – The user can give different types of information to restrict the search
  – System-driven dialogues may feel unnatural

THE CULTURAL AGENDA SERVICE

System: Do you know the title of the event, the place of the event, the date of the event or the type of the event?

User: I am looking for classical music concerts

THE DIALOGUE MODEL

• Finite state models (VoiceXML)
  – Support system-driven communication
  – Each dialogue state is the result of an action
• Information State Theory
  – Supports flexible communication
    • It defines general dialogue phenomena: feedback, clarification
  – A rich representation of dialogue state
    • Beliefs, intentions, plans

PLANS

• The system's actions are defined in plans
• Plans are generated statically when a new service is incorporated
• For each service task there is a plan
• The actions of the plan can be
  – Simple
    • Ask, answer, execution
  – Compound
    • A set of simple actions

THE CULTURAL AGENDA SERVICE

System: Do you know the title of the event, the place of the event, the date of the event or the type of the event?


A plan for the cultural agenda service


action ask title unary precondition queryFocus title
action ask type unary precondition queryFocus type
action ask place unary precondition queryFocus place
action ask date unary precondition queryFocus date
action execution
compoundaction presentresults

System: Do you know the title of the event, the place of the event, the date of the event or the type of the event?

action ask queryFocus multiple
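The plan notation above can be read as a list of actions, some guarded by a precondition on `queryFocus`. The Python sketch below transcribes that list and filters it against a dialogue state. The `Action` class and the `applicable` function are hypothetical, the selection semantics are an assumption, and the `unary` / `multiple` markers are left out because the slides do not define them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Action:
    kind: str                                        # "ask", "execution" or "compound"
    target: Optional[str] = None
    precondition: Optional[Tuple[str, str]] = None   # (attribute, required value)


# Transcription of the slide's plan; the execution semantics below are assumed.
CULTURAL_AGENDA_PLAN: List[Action] = [
    Action("ask", "queryFocus"),
    Action("ask", "title", ("queryFocus", "title")),
    Action("ask", "type", ("queryFocus", "type")),
    Action("ask", "place", ("queryFocus", "place")),
    Action("ask", "date", ("queryFocus", "date")),
    Action("execution"),
    Action("compound", "presentresults"),
]


def applicable(plan: List[Action], state: Dict[str, str]) -> List[Action]:
    """Keep the actions whose precondition holds and that are still unanswered."""
    selected = []
    for action in plan:
        if action.precondition is not None:
            attribute, value = action.precondition
            if state.get(attribute) != value:
                continue
        if action.kind == "ask" and action.target in state:
            continue  # the user already supplied this value
        selected.append(action)
    return selected
```

Under these assumptions, once the user has said that the type of the event is what they know, only the question about the type, the execution and the presentation of results remain.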

THE LARGE OBJECT COLLECTION SERVICE

System: Are you a private individual or a company?
User: A private individual
System: Where do you live?
User: 327 Diagonal
System: What is your telephone number?
User: 932222222
System: What do you want to throw out?
User: A table
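The exchange above is system-driven: every turn is a fixed question followed by one answer. A dialogue of this kind could be modelled as a chain of finite states, as in the following sketch. The slot names and the `run_dialogue` function are hypothetical, and the prompts paraphrase the example.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical state chain: (slot name, system prompt), visited in fixed order.
STATES: List[Tuple[str, str]] = [
    ("customer_type", "Are you a private individual or a company?"),
    ("address", "Where do you live?"),
    ("telephone", "What is your telephone number?"),
    ("object", "What do you want to throw out?"),
]


def run_dialogue(answers: Iterable[str]) -> Dict[str, str]:
    """Walk the states in order, storing one user answer per state."""
    collected: Dict[str, str] = {}
    replies = iter(answers)
    for slot, prompt in STATES:
        print(f"System: {prompt}")
        reply = next(replies)
        print(f"User: {reply}")
        collected[slot] = reply
    return collected
```

The fixed order keeps the controller trivial, which suits a transactional service where all four values are required before the request can be executed.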

Large-scale dialogue systems

• Different channels
• Different languages
• Different applications
• A certain complexity
• Modular and flexible architecture: distributed
• Integration of heterogeneous modules

Large-scale dialogue systems (II)

Distributed, message-based architectures
• Galaxy Communicator Software Infrastructure (Seneff, Lau and Polifroni, 1999)
  – A central hub enables the interaction between modules
• The Open Agent Architecture (Martin, Cheyer and Moran, 1999)
  – A central unit controls heterogeneous agents
• The MULTIPLATFORM (Herzog et al., 2004)
  – Non-centralized control, asynchronous communication

Large-scale dialogue systems (III)

• TRIPS: The Rochester Interactive Planning System (Allen et al., 2001)
  Specialized in task planning
  – Evacuating people from an island during a hurricane
  – Coordinating vehicles in an emergency, ...
• COLLAGEN (Rich, Sidner and Lesh, 2001)
  – Applies collaborative discourse theory
• GEMINI platform (Hamerich et al., 2004)
  – Based on VoiceXML
  – Facilitates the generation of systems for applications

Rapid Dialogue Prototyping Laboratory

Swiss Federal Institute of Technology of Lausanne

– Semi-automatic tools
– Generates a dialogue model for a specific application in 3 phases:
  – Collection of a dialogue corpus using the Wizard of Oz technique
  – Construction of a first dialogue model using the Rapid Dialogue Prototyping tool
  – Implementation and validation of the prototype

REFERENCES

• Allen, J., Byron, D., Dzikovska, M., Ferguson, G., Galescu, L. and Stent, A. (2001) ‘Toward Conversational Human-Computer Interaction’, AI Magazine, vol. 22, no. 4, Winter.

• Araki, M. and Tachibana, K. (2006) ‘Multimodal Dialog Description Language for Rapid System Development’, Workshop Proceedings, the 7th SIGdial Workshop on Discourse and Dialogue, Sydney, Australia.

• Hamerich, W., Schubert, V., Schless, V., Córdoba, R., Pardo, J., D’Haro, L., Kladis, B., Kocsis, O. and Igel, S. (2004) ‘Semi-Automatic Generation of Dialogue Applications in the GEMINI Project’, Workshop Proceedings, SIGdial Workshop on Discourse and Dialogue, Cambridge, USA.

REFERENCES (II)

• Herzog, G., Ndiaye, A., Merten, S., Kirchmann, H., Becker, T., Poller, P. (2004) ‘Large-scale software integration for spoken language and multimodal dialog systems’, Journal of Natural Language Engineering, vol. 10, no. 3-4, September, pp. 283-305.

• Ito, K. (2005) ‘Introducing Multimodal Character Agents into Existing Web applications’, Conference Proceedings, the World Wide Web Conference, Chiba, Japan, pp. 966-967.

• Kumar, S., Cohen, P.R. and Huber, M.J. (2002) ‘Direct Execution of Team Specification in STAPLE’, Conference Proceedings, AAMAS Conference, Bologna, Italy, pp. 567-568.

• Martin, D. L., Cheyer, A. J. and Moran, D. B. (1999) ‘The Open Agent Architecture: A Framework for Building Distributed Software Systems’, Applied Artificial Intelligence, vol. 13, pp. 91-128.

REFERENCES (III)

• Narayan, M., Williams, C., Perugini, S. and Ramakrishnan, N. (2004) ‘Staging Transformations for Multimodal Web Interaction Management’, Conference Proceedings, World Wide Web Conference, New York, USA, pp. 212-223.

• Seneff, S., Lau, R. and Polifroni, J. (1999) ‘Organization, communication and control in the Galaxy-II conversational system’, Conference Proceedings, Eurospeech, Budapest, Hungary, pp. 1271-1274.

• Rich, Ch., Sidner, C. and Lesh, N. (2001) ‘COLLAGEN: Applying Collaborative Discourse Theory to Human-Computer Interaction’, AI Magazine, vol. 22, no. 4, pp. 15-25.

• Richards, J. and Hanson, V. (2004) ‘Web accessibility: A broader view’, Conference Proceedings, World Wide Web Conference, New York, USA, pp. 72-79.
