MediaHub: An Intelligent MultiMedia
Distributed Platform Hub
Glenn Campbell, Tom Lunney & Paul Mc Kevitt
School of Computing and Intelligent Systems, Faculty of Engineering
University of Ulster, Magee Campus, Derry/Londonderry
Northern Ireland
{Campbell-g8, TF.Lunney, P.McKevitt}@ulster.ac.uk
Outline
• Research objectives
• Related research
• Architecture of MediaHub
• Dataflow
• Semantic representation/storage
• Communication
• Decision-making in MediaHub
• Future development
Research Objectives
• Interpret/generate semantic representations of multimodal input/output
• Perform fusion and synchronisation of multimodal data (decision-making)
• Implement and evaluate a multimodal platform hub (MediaHub)
Key research problems
• Semantic representation and storage?
• Communication?
• Decision-making?
Related Research
• CORBA (Vinoski 1993)
• COLLAGEN (Rich et al. 1997)
• Open Agent Architecture (Cheyer et al. 1998)
• Chameleon (Brøndsted et al. 1998)
• Ymir (Thórisson 1999)
• Interact (Jokinen et al. 2002)
• SmartKom (Wahlster 2003, 2006)
• Psyclone (Thórisson et al. 2005)
• Hugin (Jensen 2001)
Architecture of MediaHub
Dataflow in MediaHub
[Figure: dataflow diagram. Labels: Marked-up MultiModal Input/Output (XML); Dialogue Manager; MediaHub Whiteboard (EMMA); Decision-Making Module; Hugin Decision Engine]
Semantic Representation
• XML used for input/output data
• Well established standard mark-up language
• Allows MediaHub to be integrated into other existing multimodal systems
• XML input is validated against a Document Type Definition (DTD)
• Using EMMA (Extensible MultiModal Annotation mark-up language) for semantic representation
• EMMA is a derivative of XML
• EMMA is suited to representing confidences relating to multimodal data (confidence tag)
Example XML input file
<?xml version="1.0"?>
<!DOCTYPE multimodal SYSTEM "C:\Psyclone2\MediaHubInput.dtd">
<hypotheses>
  <hypothesis1>
    <language>
      <match>
        <yes>0.8</yes><no>0.2</no>
      </match>
      <confidence>
        <yes>0.9</yes><no>0.1</no>
      </confidence>
    </language>
    <gesture>
      …
    </gesture>
    <referentObject>Object 1</referentObject>
  </hypothesis1>
  <hypothesis2>
    …
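As a rough illustration of how a module might read such an input file, the sketch below parses a shortened hypothesis with the standard Java DOM parser and extracts the language match and confidence values. The class name HypothesisReader, its helper methods and the shortened sample string are assumptions made for this example; they are not part of MediaHub, and DTD validation is left out because the DTD itself is not shown.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class HypothesisReader {

    // Hypothetical, shortened input modelled on the slide; the DOCTYPE line is
    // left out because the DTD file is not available here.
    public static final String SAMPLE =
        "<hypotheses><hypothesis1>"
      + "<language>"
      + "<match><yes>0.8</yes><no>0.2</no></match>"
      + "<confidence><yes>0.9</yes><no>0.1</no></confidence>"
      + "</language>"
      + "<referentObject>Object 1</referentObject>"
      + "</hypothesis1></hypotheses>";

    // Parse an XML string into a DOM document (no DTD validation in this sketch).
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input", e);
        }
    }

    // First descendant element of 'parent' with the given tag name.
    static Element first(Element parent, String tag) {
        return (Element) parent.getElementsByTagName(tag).item(0);
    }

    // Read the <yes> probability under <modality>/<field>, e.g. language/match.
    public static double yes(Element hypothesis, String modality, String field) {
        Element node = first(first(first(hypothesis, modality), field), "yes");
        return Double.parseDouble(node.getTextContent().trim());
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        Element h1 = first(doc.getDocumentElement(), "hypothesis1");
        System.out.println("referent   = " + first(h1, "referentObject").getTextContent());
        System.out.println("match      = " + yes(h1, "language", "match"));
        System.out.println("confidence = " + yes(h1, "language", "confidence"));
    }
}
```

Values read this way could then be entered as evidence into the decision-making network; how MediaHub actually does this is not shown on the slides.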
Semantic Storage
Blackboard-based method of semantic storage
Marked-up input in EMMA format stored on central whiteboard (MediaHub Whiteboard)
All input/output messages in MediaHub are stored on whiteboard and can be accessed at any stage in the decision-making process
Whiteboard and Dialogue Manager form kernel of MediaHub
Communication
• MediaHub uses Psyclone for distributed processing
• Psyclone uses the OpenAIR specification for communication
• Modules of MediaHub communicate by passing messages through the MediaHub Whiteboard
• Implements a publish-subscribe architecture
• For example, the Decision-Making Module registers for messages of type *input*
• All messages relating to input posted on the whiteboard will automatically be sent to the Decision-Making Module
• Module registration is done in an XML specification file, called PsySpec, read automatically at start-up
PsySpec Example
<executable name="DMM" consoleoutput="yes">
  <sys ostype="Win32">
    java -cp .;JavaOpenAIR.jar DMM psyclone=%host%:%port% name=%name%
  </sys>
</executable>
<spec>
  <triggers from="any" allowselftriggering="no">
    <trigger type="*input*"/>
    <trigger type="MediaHub.shutdown"/>
  </triggers>
  <posts>
    <post to="MediaHub_Whiteboard" type="dmm.register" />
  </posts>
</spec>
</module>
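The publish-subscribe behaviour described above can be pictured with a small in-memory sketch. It is not Psyclone or the OpenAIR API: the class WhiteboardSketch, its methods and the message type "speech.input.hypothesis" are hypothetical, and treating '*' in a trigger as "any run of characters" is an assumption about how the wildcard is meant to behave.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Pattern;

public class WhiteboardSketch {

    // One subscription: a compiled type pattern plus the callback to invoke.
    private static final class Subscription {
        final Pattern pattern;
        final Consumer<String> handler;
        Subscription(Pattern pattern, Consumer<String> handler) {
            this.pattern = pattern;
            this.handler = handler;
        }
    }

    private final List<Subscription> subscriptions = new ArrayList<>();
    private final List<String> stored = new ArrayList<>();

    // Convert a trigger such as "*input*" into a regex in which '*' matches
    // any run of characters and everything else is matched literally.
    static Pattern toPattern(String trigger) {
        String[] parts = trigger.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) regex.append(".*");
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    // Register a handler for every message type matching the trigger.
    public void subscribe(String trigger, Consumer<String> handler) {
        subscriptions.add(new Subscription(toPattern(trigger), handler));
    }

    // Store the message type, then forward it to each matching subscriber.
    public void post(String type) {
        stored.add(type);
        for (Subscription s : subscriptions) {
            if (s.pattern.matcher(type).matches()) s.handler.accept(type);
        }
    }

    // All message types posted so far, in posting order.
    public List<String> stored() {
        return stored;
    }

    public static void main(String[] args) {
        WhiteboardSketch whiteboard = new WhiteboardSketch();
        List<String> receivedByDmm = new ArrayList<>();
        whiteboard.subscribe("*input*", receivedByDmm::add);
        whiteboard.subscribe("MediaHub.shutdown", receivedByDmm::add);
        whiteboard.post("speech.input.hypothesis"); // hypothetical message type
        whiteboard.post("dmm.register");
        whiteboard.post("MediaHub.shutdown");
        System.out.println(receivedByDmm);
    }
}
```

In this sketch the subscriber receives the input and shutdown messages but not "dmm.register", while the whiteboard keeps all three, which mirrors the idea that every message stays available on the whiteboard.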
Decision-making
MediaHub employs Bayesian decision-making over multimodal data
Bayesian networks developed using Hugin software tool (Jensen 2001)
Networks are accessed using Hugin API (Java)
A unique approach to decision-making in an intelligent multimedia distributed platform hub
Hugin
• Tool for implementing Bayesian Networks as CPNs (Causal Probabilistic Networks)
• Hugin GUI: graphical user interface to the Hugin decision engine
• Hugin API: library implemented in Java; allows programs to implement Bayesian Networks for decision-making
Bayesian Networks
AKA Bayes nets, Causal Probabilistic Networks (CPNs), Bayesian Belief Networks
• Consists of nodes and directed edges between nodes
• Node represents a variable
• Influence between nodes represented by edges
[Figure: example network with nodes ‘Diet’, ‘Exercise’ and ‘Weight Loss’]
• ‘Diet’ and ‘Exercise’ nodes have influence over ‘Weight Loss’ node
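To make the Diet/Exercise example concrete, the sketch below computes two quantities by plain enumeration, without Hugin: the marginal probability of weight loss, and the updated belief in ‘Diet’ once weight loss is observed. All probabilities are invented for illustration, and the class name WeightLossNet is hypothetical.

```java
public class WeightLossNet {

    // Assumed prior probabilities (illustrative values only).
    static final double P_DIET = 0.4;
    static final double P_EXERCISE = 0.3;

    // Assumed conditional probability table P(WeightLoss = yes | Diet, Exercise),
    // indexed [diet][exercise] with 1 = yes and 0 = no.
    static final double[][] P_LOSS = {
        {0.1, 0.5},   // no diet:  no exercise, exercise
        {0.6, 0.9}    // diet:     no exercise, exercise
    };

    // Joint probability P(Diet = d, Exercise = e, WeightLoss = yes).
    static double joint(int diet, int exercise) {
        double pd = diet == 1 ? P_DIET : 1 - P_DIET;
        double pe = exercise == 1 ? P_EXERCISE : 1 - P_EXERCISE;
        return pd * pe * P_LOSS[diet][exercise];
    }

    // Marginal P(WeightLoss = yes): sum the joint over both parent nodes.
    public static double pWeightLoss() {
        double total = 0.0;
        for (int d = 0; d <= 1; d++) {
            for (int e = 0; e <= 1; e++) {
                total += joint(d, e);
            }
        }
        return total;
    }

    // Posterior P(Diet = yes | WeightLoss = yes) by Bayes' rule.
    public static double pDietGivenWeightLoss() {
        return (joint(1, 0) + joint(1, 1)) / pWeightLoss();
    }

    public static void main(String[] args) {
        System.out.printf("P(WeightLoss)        = %.3f%n", pWeightLoss());
        System.out.printf("P(Diet | WeightLoss) = %.3f%n", pDietGivenWeightLoss());
    }
}
```

With these assumed numbers, observing weight loss raises the belief in ‘Diet’ from 0.4 to about 0.68, which is the kind of belief update a Bayesian network engine performs on larger networks.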
MediaHub Example Network
• G1-3 represent the belief that the user is referring to Objects 1-3, based on gesture input
• L1-3 represent the belief that the user is referring to Objects 1-3, based on language input
• CG1-3 and CL1-3 represent the confidence associated with G1-3 and L1-3
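One simple way to picture how gesture and language beliefs might be combined is sketched below. It is a stand-in, not the Bayesian network from the slide and not the Hugin API: each modality's belief over the three objects is blended with a uniform distribution according to a single confidence value per modality, and the two results are multiplied and renormalised. The class ReferentFusion, the use of one confidence per modality and all numbers are assumptions.

```java
public class ReferentFusion {

    // Blend a belief with a uniform distribution: confidence 1 keeps the
    // belief unchanged, confidence 0 discards it entirely.
    static double[] discount(double[] belief, double confidence) {
        double uniform = 1.0 / belief.length;
        double[] out = new double[belief.length];
        for (int i = 0; i < belief.length; i++) {
            out[i] = confidence * belief[i] + (1 - confidence) * uniform;
        }
        return out;
    }

    // Multiply the discounted beliefs element-wise and renormalise to sum to 1.
    public static double[] fuse(double[] gesture, double gestureConf,
                                double[] language, double languageConf) {
        double[] g = discount(gesture, gestureConf);
        double[] l = discount(language, languageConf);
        double[] fused = new double[g.length];
        double total = 0.0;
        for (int i = 0; i < g.length; i++) {
            fused[i] = g[i] * l[i];
            total += fused[i];
        }
        for (int i = 0; i < fused.length; i++) {
            fused[i] /= total;
        }
        return fused;
    }

    // Index of the largest entry, i.e. the most likely referent.
    public static int argMax(double[] p) {
        int best = 0;
        for (int i = 1; i < p.length; i++) {
            if (p[i] > p[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        double[] gesture = {0.6, 0.3, 0.1};   // assumed G1-3
        double[] language = {0.2, 0.7, 0.1};  // assumed L1-3
        double[] fused = fuse(gesture, 0.5, language, 0.9);
        System.out.println("Most likely referent: Object " + (argMax(fused) + 1));
    }
}
```

Here the gesture input favours Object 1 but with low confidence, while the language input favours Object 2 with high confidence, so the fused result points to Object 2.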
Bayesian Network Design Process
1. Characterise decision-making scenarios
2. Design Bayesian networks for decision-making scenarios
3. Use the Hugin GUI to build Bayesian networks and complete conditional probability tables
4. Run and test networks, making changes to networks and tables as required
5. Develop Java code that will open, edit and run the Bayesian network using the Hugin API
Decisions in MediaHub
Input:
• Determining semantic content of input
• Fusing semantics of input
• Resolving ambiguity at input
Output:
• Synchronising multimodal output
• Best modality for output
Input Example
“Copy all files from the ‘process control’ folder of this computer to a new folder called ‘check data’ on that computer”.
Output Example
“This is the route from Paul’s office to Tom’s office”.
[Figure: diagram with points labelled P and T]
Conclusion
An intelligent multimodal distributed platform hub called MediaHub is under development
MediaHub interprets/generates semantic representations of multimodal input and output
MediaHub performs fusion and synchronisation of multimodal data
MediaHub provides a new method of decision-making within a distributed platform hub
Future development
Define all necessary decisions for example scenarios
Develop Bayesian decision-making using Hugin API (Java)
Develop a GUI to illustrate the functionality of MediaHub
Test MediaHub on example scenarios
Compare MediaHub to other systems
Write thesis
Questions?