A Fusion Framework for Multimodal Interactive Applications
DESCRIPTION
This research aims to propose a multimodal fusion framework for high-level data fusion between two or more modalities. It takes as input low-level features extracted from different system devices, then analyses them and identifies intrinsic meanings in these data. Extracted meanings are mutually compared to identify complementarities, ambiguities and inconsistencies, to better understand the user's intention when interacting with the system. The whole fusion life cycle is described and evaluated in an OCE environment scenario, where two co-workers interact by voice and movements, which may reveal their intentions. The fusion in this case focuses on combining modalities to capture a context that enhances the user experience.

TRANSCRIPT
A Fusion Framework for Multimodal Interactive Applications
Presented by: Hildeberto Mendonça
Jean-Yves Lionel Lawson, Olga Vybornova,
Benoit Macq, Jean Vanderdonckt

ICMI-MLMI 2009 – Cambridge MA, USA, November 2-6, 2009
Special Session: Fusion Engines for Multimodal Interfaces
November 3, 2009
ICMI-MLMI 2009 – Cambridge MA, USA
Motivations
- How to support multimodal fusion so as to maximize reuse and minimize complexity?
- If there is complexity in multimodal fusion, it should be about the fusion itself.
- What already exists should be reused with minimal adaptation.
- A general life cycle can guarantee a standard treatment for each modality.
Research Goal
To define and develop a multipurpose framework for high-level data fusion in multimodal interactive applications.
Fusion Principles
- Type: parallel + combined = synergistic; each modality is endowed with meanings
- Level: feature (i.e. pattern extraction) + decision (i.e. recognized task)
- Input devices: multiple
- Notation: defined by the developer
- Ambiguity resolution: defined by the developer
- Time representation: both quantitative and qualitative
- Application type: the domain is defined using ontologies
Process
- Recognition: identification of patterns in input signals.
- Segmentation: delimitation of identified areas.
- Meaning extraction: deeper analysis to identify meanings and correlations between segments according to specific domains.
- Annotation: formal description of segments through domain concepts.
Process
- The flow is fixed, but it can start at any point, respecting the sequence.
- Not tied to any particular method: the method is "plugged" in.
- Focus on a good level of analysis, not on real-time processing.
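The four-stage life cycle with pluggable methods might be sketched as follows. This is an illustrative assumption, not the framework's actual API: the class, stage names, and toy methods are invented for the example.

```python
# Hypothetical sketch of the fixed recognition -> segmentation ->
# meaning-extraction -> annotation life cycle, where each stage's
# concrete method is "plugged" into the pipeline as a callable.

STAGES = ["recognition", "segmentation", "meaning_extraction", "annotation"]

class ModalityPipeline:
    def __init__(self, **methods):
        # one callable per stage name; any concrete method can be plugged in
        self.methods = methods

    def run(self, data, start="recognition"):
        # the flow is fixed, but processing may start at any stage,
        # as long as the remaining sequence is respected
        for stage in STAGES[STAGES.index(start):]:
            data = self.methods[stage](data)
        return data

# toy plugged-in methods for a speech modality
speech = ModalityPipeline(
    recognition=lambda sig: sig.lower(),              # pattern identification
    segmentation=lambda s: s.split(),                 # delimit segments (words)
    meaning_extraction=lambda ws: [w for w in ws if w in {"book", "library"}],
    annotation=lambda ms: {"domain_concepts": ms},    # formal description
)
print(speech.run("Maybe I can find a BOOK in the LIBRARY"))
# {'domain_concepts': ['book', 'library']}
```

Because the stage methods are plain callables, swapping a recognizer or annotator does not touch the pipeline itself, which is the reuse property the slides argue for.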
OpenInterface
Fusion Mechanism
- Define a process for each modality and run the processes in parallel.
- Data from each stage is buffered and processed together for the purpose of fusion.
- Agent-oriented: the problem is solved in a distributed fashion.
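The mechanism above (parallel per-modality pipelines publishing buffered stage outputs to a fusion step) could be sketched like this. Everything here is an illustrative assumption, not the actual OpenInterface code; the complementarity rule is a toy stand-in for real meaning comparison.

```python
# Illustrative sketch: each modality buffers its stage outputs with a fusion
# agent, which then compares the buffered meanings across modalities.

from collections import defaultdict

class FusionAgent:
    def __init__(self):
        # buffer: stage name -> list of (modality, data) pairs awaiting fusion
        self.buffer = defaultdict(list)

    def publish(self, modality, stage, data):
        self.buffer[stage].append((modality, data))

    def fuse(self, stage):
        # combine everything buffered at the same stage across modalities
        meanings = {modality: data for modality, data in self.buffer[stage]}
        # toy decision-level rule: modalities are complementary when their
        # extracted concepts overlap
        concept_sets = [set(v) for v in meanings.values()]
        shared = set.intersection(*concept_sets) if concept_sets else set()
        return {"modalities": meanings, "complementary": bool(shared)}

agent = FusionAgent()
agent.publish("speech", "meaning_extraction", ["book", "library"])
agent.publish("movement", "meaning_extraction", ["bookshelves", "library"])
print(agent.fuse("meaning_extraction")["complementary"])  # True
```

Buffering by stage rather than by modality is what lets fusion happen at either the feature or the decision level, as the principles slide states.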
Fusion Mechanism – OpenInterface
OI Modeling Tool
Fusion Mechanism – Instance
Scenario
"Maybe I can find a book about it in the library."
Ronald is moving towards the bookshelves.
Results
- Managed spatial relationships based on the fixed objects in the room.
- Made semantic fusion of events not coinciding in time.
- Achieved good results in speaker identification, with synchronization between image and speech identification.
- Created an open framework to manage fusion between two (in our case) or more modalities (in enhanced future work).
- Designed the system so that each component can run on a separate machine, thanks to a distribution mechanism interchanging data through a TCP/IP network.
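The distribution mechanism mentioned above, where components on separate machines exchange data over TCP/IP, could look roughly like this minimal sketch. The JSON payload shape and the loopback setup are assumptions for illustration, not the project's actual wire protocol.

```python
# Minimal illustration of two components exchanging annotated segments as
# JSON over TCP; 127.0.0.1 stands in for a remote machine.

import json
import socket
import threading

def component_server(sock):
    conn, _ = sock.accept()
    with conn:
        payload = json.loads(conn.recv(4096).decode())  # receive annotated segment
        payload["fused"] = True                         # stand-in for fusion work
        conn.sendall(json.dumps(payload).encode())

server = socket.socket()
server.bind(("127.0.0.1", 0))  # ephemeral port
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=component_server, args=(server,))
t.start()

client = socket.create_connection(("127.0.0.1", port))
client.sendall(json.dumps({"modality": "speech", "concepts": ["library"]}).encode())
reply = json.loads(client.recv(4096).decode())
client.close()
t.join()
server.close()
print(reply)  # {'modality': 'speech', 'concepts': ['library'], 'fused': True}
```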
Next Steps
- Implement the segmentation and annotation of 3D content.
- Migrate the framework to a real-time implementation.
- Evaluate other methods under the rules of the framework.
- Continuously extend the framework to support other fusion concepts and implementation methods.
Thank you for your attention!