
Superimposed Information - Stanford DB talk 1

Technology for Superimposed Information

Lois Delcambre

with Shawn Bowers, David Maier, Mat Weaver

Database and Object Technology Lab
Computer Science and Engineering Department

Oregon Graduate Institute


Outline

• introduction to superimposed information

• a superimposed application: SLIMPad (DLI2 Project)

• model-based representation and transformation of information

• harvesting information to sustain our forests (NSF Digital Government project)


What is Superimposed Information?

data “placed over” existing information sources to:

highlight, annotate, elaborate, select, collect, organize, connect, reuse information elements

often to support new applications, beyond the original


Examples of Superimposed Information

Non-electronic examples:

– Commentaries on religious texts, law, literature
– Concordances, citation indexes

Electronic examples:

– Your bookmark file in your web browser
– RDF metadata


Why work on it now?

• Broadening range of digital information

– Easier to overlay than “hard copy” forms
– More and more sources of base information

• Accessibility/addressability to base information

– Reference (e.g., URL) can be resolved quickly

– Addressing at various levels of granularity

• Emerging Standards: RDF, Topic Maps, XLink


The superimposed and base layers with marks

[Figure: a Superimposed Layer drawn above a Base Layer containing Information Source 1, Information Source 2, …, Information Source n; marks connect the superimposed layer to the base sources.]


Paul Gorman, MD; Lois Delcambre, PhD; David Maier, PhD


Bundles in the wild …
Observational team: Paul Gorman, Joan Ash, Mary Lavelle, Jason Lyman

… Bundles in captivity
Computer science team: Lois Delcambre, Dave Maier, Shawn Bowers, Longxing Deng, Mathew Weaver


Let’s take a trip to the ICU


(Wild) Bundles

• manage information for diverse, complex tasks
• contain selected, collected, structured, annotated information
• are often used in settings with:
– high uncertainty
– low predictability
– potentially grave outcomes
– highly constrained time & attention


(Wild) Bundles

• There is benefit in creating (active processing of information)

• There is benefit in reusing (trigger memory)

• There is benefit in sharing (establish collective, situated awareness)


Given….

• bundles are everywhere!
• access to bundles provides access to important information
• information in bundles is often copied from other information sources

• we can keep copied/referenced information linked through the use of marks


(Captive) Bundles

• SLIMPad - a scratchpad application to create bundles, but with referenced information connected to the underlying source data

• helping us explore architectural issues for building superimposed applications

• motivating the definition of a metamodel to represent information, with mappings to transform it

• inspired by the observational work (but not focused on a specific medical task)


SLIMPad demo


Superimposed Layer Information Manager (SLIM) Architecture:

Contributions

• Mark Management - to create/resolve marks

• SLIM API - for the application developer

• TRIM store - for generic storage of superimposed information


The general architecture for managing superimposed information

[Figure: Superimposed Application; Superimposed Information Management, with parts labeled Application Data, Application-Specific API, Generic Management, and TRIM Store; an arrow labeled “creates and manages”; Mark Management.]


Mark Management

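[Figure: the user works in SLIMPad, which uses the Mark Manager and its Mark DB. The Mark Manager has one module per kind of base source, each paired with a viewer application:]

– XML Module: XML Viewer, XML Documents
– PDF Module: PDF Viewer, PDF files
– HTML Module: Internet Explorer, Web Pages
– Excel Module: MS Excel, Excel Spreadsheets
– PowerPoint Module: MS PowerPoint, PPT Files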
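The module-per-source arrangement can be sketched in a few lines of Python. This is only an illustration, not the SLIM implementation: the names `Mark`, `MarkManager`, `register_module`, `create_mark`, and `resolve` are assumptions, the Mark DB is stood in for by a dictionary, and each module is reduced to a function that describes how a mark would be opened. The file name and cell range are made-up sample data.

```python
from dataclasses import dataclass
from typing import Callable, Dict
import itertools


@dataclass(frozen=True)
class Mark:
    """An address into a base source (all field names are assumptions)."""
    mark_id: str
    source_type: str   # e.g. "xml", "pdf", "html", "excel"
    source: str        # file name or URL of the base document
    address: str       # location inside the source, e.g. a cell range


class MarkManager:
    """Creates marks and resolves them through per-source-type modules."""

    def __init__(self) -> None:
        self._mark_db: Dict[str, Mark] = {}                # stands in for the Mark DB
        self._modules: Dict[str, Callable[[Mark], str]] = {}
        self._ids = itertools.count(1)

    def register_module(self, source_type: str,
                        resolver: Callable[[Mark], str]) -> None:
        self._modules[source_type] = resolver

    def create_mark(self, source_type: str, source: str, address: str) -> Mark:
        if source_type not in self._modules:
            raise ValueError(f"no module registered for {source_type!r}")
        mark = Mark(f"mark{next(self._ids)}", source_type, source, address)
        self._mark_db[mark.mark_id] = mark
        return mark

    def resolve(self, mark_id: str) -> str:
        """Look the mark up and hand it to the module for its source type."""
        mark = self._mark_db[mark_id]
        return self._modules[mark.source_type](mark)


manager = MarkManager()
manager.register_module(
    "excel", lambda m: f"open {m.source} in spreadsheet viewer at {m.address}")
mark = manager.create_mark("excel", "labs.xls", "B2:B7")
print(manager.resolve(mark.mark_id))
```

The superimposed application only ever holds a mark identifier; everything that is specific to a kind of base source stays inside the registered module.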


SLIM API: as seen by application

Structured Bundle Model for SLIMPad (UML class diagram):

SLIMPad
  padName : String
AbstractBundle
Bundle
  bundleName : String
  bundleXPos : Number
  bundleYPos : Number
  bundleHeight : Number
  bundleWidth : Number
Scrap
  scrapName : String
  scrapXPos : Number
  scrapYPos : Number
Mark
  markId : String

Multiplicity labels on the associations: 1, *, 1, *, *, 0..1
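As a rough sketch, the classes of the bundle model could be written as Python dataclasses. The attribute names come from the diagram; the associations are assumptions, since the extracted text does not show which classes the multiplicities connect. Here a pad has an optional root bundle, a bundle holds scraps and nested bundles, and a scrap refers to marks. AbstractBundle is left out, and the sample names and coordinates are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mark:
    markId: str


@dataclass
class Scrap:
    scrapName: str
    scrapXPos: float
    scrapYPos: float
    marks: List[Mark] = field(default_factory=list)        # assumed association


@dataclass
class Bundle:
    bundleName: str
    bundleXPos: float
    bundleYPos: float
    bundleHeight: float
    bundleWidth: float
    scraps: List[Scrap] = field(default_factory=list)      # assumed association
    nested: List["Bundle"] = field(default_factory=list)   # assumed association


@dataclass
class SLIMPad:
    padName: str
    rootBundle: Optional[Bundle] = None                    # assumed 0..1


def all_mark_ids(bundle: Bundle) -> List[str]:
    """Collect the mark ids of every scrap in a bundle and its nested bundles."""
    ids = [m.markId for s in bundle.scraps for m in s.marks]
    for child in bundle.nested:
        ids.extend(all_mark_ids(child))
    return ids


# Made-up sample data.
pad = SLIMPad("ICU rounds", Bundle("Patient A", 0, 0, 200, 300))
pad.rootBundle.scraps.append(Scrap("potassium", 10, 20, [Mark("m1")]))
pad.rootBundle.nested.append(
    Bundle("Meds", 10, 60, 80, 120, scraps=[Scrap("dose", 5, 5, [Mark("m2")])]))
print(all_mark_ids(pad.rootBundle))
```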


What’s Next for this Project?

• Validation - cardiologists, ICU nurses, …

• Extend the informational model of SLIMPad

• Extend SLIMPad to suit a selected medical task

• Extension of observational work to other domains


www.cse.ogi.edu/footprints

• demos - including the QTVR of the ICU (with toys) and SLIMPad

• personnel
• project description
• papers

– “Bundles in the Wild: Tools for Managing Information to Maintain Situation Awareness”

– “Bundles in Captivity: An Application of Superimposed Information”

– papers discussing superimposed information


Model-Based Superimposed Information

[Figure: the Superimposed Layer holds a Model, Schema Data, and Instance Data with Marks; marks connect it to the Base Layer, which contains Information Source 1 and Information Source 2.]

But the model and schema are optional


Our Goals

• Represent information generically, for various models

• Convert information from one representation scheme to another


Transforming Information

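[Figure: XML (with an XML Viewer) and a DB (with SQL) each have a generic representation, Generic Rep. (XML model) and Generic Rep. (Relational model); “convert” arrows connect these to Generic Rep. (Topic Map model), shown in a TM Browser with topic types Painting and Painter, relationships “by painter” and “influenced by”, and anchors “mentioned”, “biography”, and “critiqued”.]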


Our Approach

• Metamodel – to represent multiple data models

• Generic, Uniform Representation Scheme – to store model, schema, and instances for model-based information

• Mapping Formalism – to transform between representation schemes


The Metamodel

• Provides a level of abstraction above models
• Describes the structural features of models

Metamodel: Basic Set of Abstractions
Model: Model Constructs and Relationships (e.g., XML; Topic Map)
Schema-Level Data: e.g., DTD; Topic Map Definitions
Instance-Level Data: e.g., XML Document; Topic Map Instances


XML Model, Schema, and Instance

• Elements, Element Types, Attributes, Attribute Types
• Elements contain Attributes
• Elements can be nested

XML DTD (Schema):

<!ELEMENT schedule (flight*)>
<!ELEMENT flight (from, to, price)>
<!ATTLIST flight name CDATA #REQUIRED>

XML Document (Instances):

<schedule>
  <flight name="Air Canada Flight 1575">
    <from> PDX </from>
    <to> YVR </to>
    <price> $213.84 </price>
  </flight>
  ...
</schedule>

XML Model: model constructs and relationships defined using the metamodel
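To make the three XML constructs concrete (elements, attributes on elements, nesting), here is a small sketch that walks the flight document with Python's standard `xml.etree.ElementTree`. The `outline` helper is a hypothetical name introduced for this example, and the document is cut down to the single flight shown on the slide.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<schedule>
  <flight name="Air Canada Flight 1575">
    <from> PDX </from>
    <to> YVR </to>
    <price> $213.84 </price>
  </flight>
</schedule>
""".strip()


def outline(elem, depth=0):
    """Return one line per element, indented by nesting depth, with its attributes."""
    attrs = " ".join(f'{k}="{v}"' for k, v in elem.attrib.items())
    label = f"{'  ' * depth}{elem.tag}" + (f" [{attrs}]" if attrs else "")
    lines = [label]
    for child in elem:            # iterating an element yields its nested elements
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOCUMENT)
print("\n".join(outline(root)))
```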


Topic Map Example

[Figure: a topic map with topic types Painting and Painter and relationship types “by painter” and “influenced by”. The topics shown are “Captive”, “Paul Klee”, “Francisco de Goya”, and “1914”; “Captive” is linked to “Paul Klee” by a “by painter” relationship. Anchors labeled “mentioned”, “biography”, and “critiqued” point from topics to base-layer addresses (http://...).]


Topic Map Model in UML

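Schema-level classes:

TopicType
  ttypename : String
TopicRelType
  relType : String
  roles: topicType1, topicType2 (to TopicType)
AnchorType
  anchorRole : String
  role: topicType (to TopicType)

Instance-level classes:

TopicInstance
  title : String
  topicInsID : Number
TopicRelInst
  roles: topicIns1, topicIns2 (to TopicInstance)
AnchorInst
  roles: topicIns (to TopicInstance), address (to Address)
<<Mark>> Address
  markID : String

<<conformance>> connectors: topic_instOf (TopicInstance to TopicType), rel_instOf (TopicRelInst to TopicRelType), anchor_instOf (AnchorInst to AnchorType)

Association ends carry multiplicities of 1 or *.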


Generic, Uniform Representation

• We use RDF and RDF Schema to represent model, schema, and instance uniformly

RDF Triples:

(creator, ‘http://…/~john’, person1)
(name, ‘person1’, ‘John Smith’)

RDF Graph: http://…/~john has creator person1; person1 has name ‘John Smith’.

RDF Schema Triples:

(type, ‘creator’, Property)
(domain, ‘creator’, WebPage)
(range, ‘creator’, Person)
(type, ‘Person’, Class)
(type, ‘WebPage’, Class)

RDF Schema Graph: creator has type Property, domain WebPage, and range Person; Person and WebPage have type Class.
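A minimal sketch of this uniform representation, assuming plain Python tuples in the slide's (predicate, subject, object) order rather than any particular RDF library. The helpers `objects` and `creator_names` are hypothetical names, and the elided URL is kept exactly as it appears on the slide.

```python
from typing import List, Tuple

# (predicate, subject, object), the order used on the slide
Triple = Tuple[str, str, str]

instance_triples: List[Triple] = [
    ("creator", "http://…/~john", "person1"),
    ("name", "person1", "John Smith"),
]

schema_triples: List[Triple] = [
    ("type", "creator", "Property"),
    ("domain", "creator", "WebPage"),
    ("range", "creator", "Person"),
    ("type", "Person", "Class"),
    ("type", "WebPage", "Class"),
]


def objects(triples: List[Triple], predicate: str, subject: str) -> List[str]:
    """All objects o such that (predicate, subject, o) is in the triple list."""
    return [o for (p, s, o) in triples if p == predicate and s == subject]


def creator_names(page: str) -> List[str]:
    """Follow creator, then name: two hops through the instance graph."""
    return [name
            for person in objects(instance_triples, "creator", page)
            for name in objects(instance_triples, "name", person)]


print(creator_names("http://…/~john"))
print(objects(schema_triples, "range", "creator"))
```

Because schema and instance statements share one shape, the same `objects` helper serves both levels.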


The Metamodel Definition

Basic Metamodel Elements: Construct; Structural Connector (connects 2 constructs)

Special Elements: Mark, Lexical, Conformance, Generalization

Construct: A basic structural unit

Mark: A connection-point to the base-layer

Lexical: A primitive-value type

Connector: A relationship between 2 constructs

Conformance: A schema-instance relationship

Generalization: An inheritance relationship


Representing Models

(instanceOf, “TopicType”, Construct)
(instanceOf, “TopicInstance”, Construct)

(instanceOf, “topic_instOf”, Conformance)
(domain, “topic_instOf”, TopicInstance)
(range, “topic_instOf”, TopicType)
(domainMult, “topic_instOf”, “*”)
(rangeMult, “topic_instOf”, “1”)

(instanceOf, “ttypename”, Connector)
(domain, “ttypename”, TopicType)
(range, “ttypename”, String)
(domainMult, “ttypename”, “*”)
(rangeMult, “ttypename”, “1”)

[Figure: UML fragment in which TopicInstance is linked to TopicType (ttypename : String) by the <<conformance>> connector topic_instOf, with multiplicity * on the TopicInstance end and 1 on the TopicType end.]
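One way to read these model triples back is to group them by subject, so that each connector's domain, range, and multiplicities sit together. The sketch below assumes tuples in (predicate, subject, object) order; `describe` and `signature` are hypothetical helper names, not part of the project's tooling.

```python
from collections import defaultdict

# Model-level triples from the slide, as (predicate, subject, object).
model = [
    ("instanceOf", "TopicType", "Construct"),
    ("instanceOf", "TopicInstance", "Construct"),
    ("instanceOf", "topic_instOf", "Conformance"),
    ("domain", "topic_instOf", "TopicInstance"),
    ("range", "topic_instOf", "TopicType"),
    ("domainMult", "topic_instOf", "*"),
    ("rangeMult", "topic_instOf", "1"),
    ("instanceOf", "ttypename", "Connector"),
    ("domain", "ttypename", "TopicType"),
    ("range", "ttypename", "String"),
    ("domainMult", "ttypename", "*"),
    ("rangeMult", "ttypename", "1"),
]


def describe(triples):
    """Group triples by subject: {subject: {predicate: object}}."""
    index = defaultdict(dict)
    for predicate, subject, obj in triples:
        index[subject][predicate] = obj
    return dict(index)


def signature(triples, connector):
    """Render a connector as 'name: domain [mult] -> range [mult] (kind)'."""
    d = describe(triples)[connector]
    return (f"{connector}: {d['domain']} [{d['domainMult']}] -> "
            f"{d['range']} [{d['rangeMult']}] ({d['instanceOf']})")


print(signature(model, "topic_instOf"))
print(signature(model, "ttypename"))
```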


Representing Schema

Topic Types (schema): painting, painter

(instanceOf, “painting_tt”, TopicType)
(ttypename, “painting_tt”, “painting”)
(instanceOf, “painter_tt”, TopicType)
(ttypename, “painter_tt”, “painter”)

Topic Rel Types (schema): by painter

(instanceOf, “byPainter_rt”, TopicRelType)
(relType, “byPainter_rt”, “by painter”)
(topicType1, “byPainter_rt”, painting_tt)
(topicType2, “byPainter_rt”, painter_tt)

Anchor Types (schema): biography

(instanceOf, “biography_at”, AnchorType)
(anchorRole, “biography_at”, “biography”)
(topicType, “biography_at”, painter_tt)

[Figure: topic types painting and painter joined by “by painter”, with a biography anchor on painter.]


Representing Instances

Topic (instances): Paul Klee, Captive

(instanceOf, “painter1”, TopicInstance)
(title, “painter1”, “Paul Klee”)
(topicInsID, “painter1”, “5”)
(topic_instOf, “painter1”, painter_tt)
(instanceOf, “painting1”, TopicInstance)
(title, “painting1”, “Captive”)
(topicInsID, “painting1”, “19”)
(topic_instOf, “painting1”, painting_tt)

Topic Relationship (instance): a by painter relationship

(instanceOf, “byPainter1”, TopicRelInst)
(rel_instOf, “byPainter1”, byPainter_rt)
(topicIns1, “byPainter1”, painting1)
(topicIns2, “byPainter1”, painter1)

Anchor (instance): a biography anchor

(instanceOf, “biography1”, AnchorInst)
(anchor_instOf, “biography1”, biography_at)
(address, “biography1”, a1)

Address (instance): mark to URL

(instanceOf, “a1”, Address)
(markID, “a1”, “URLMarkManager@954308545”)
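Because model, schema, and instances are all triples, a conformance check can be a plain scan. The sketch below is an assumption about how such a check might look, not the project's code: it enforces the range multiplicity of 1 on topic_instOf and checks that the target is a declared TopicType. The function name `conformance_errors` is hypothetical, and `painter2` is an invented instance added to show a failure.

```python
# Schema and instance triples from the slides, as (predicate, subject, object).
schema = [
    ("instanceOf", "painting_tt", "TopicType"),
    ("instanceOf", "painter_tt", "TopicType"),
]

instances = [
    ("instanceOf", "painter1", "TopicInstance"),
    ("title", "painter1", "Paul Klee"),
    ("topic_instOf", "painter1", "painter_tt"),
    ("instanceOf", "painting1", "TopicInstance"),
    ("title", "painting1", "Captive"),
    ("topic_instOf", "painting1", "painting_tt"),
]


def conformance_errors(schema, instances):
    """Each TopicInstance needs exactly one topic_instOf pointing at a TopicType."""
    topic_types = {s for (p, s, o) in schema
                   if p == "instanceOf" and o == "TopicType"}
    errors = []
    for (p, s, o) in instances:
        if p == "instanceOf" and o == "TopicInstance":
            targets = [t for (q, u, t) in instances
                       if q == "topic_instOf" and u == s]
            if len(targets) != 1:
                errors.append(
                    f"{s}: expected exactly 1 topic_instOf, found {len(targets)}")
            elif targets[0] not in topic_types:
                errors.append(f"{s}: {targets[0]} is not a TopicType")
    return errors


print(conformance_errors(schema, instances))

# An invented instance with no topic_instOf triple fails the check.
broken = instances + [("instanceOf", "painter2", "TopicInstance")]
print(conformance_errors(schema, broken))
```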


Basic Types of Mappings

• Inter-Model
• Inter-Schema
• Model-to-Schema

[Figure: for each mapping type, a source stack (Model1, Schema1, Instances1) is related to a target stack of model, schema, and instances; arrows between levels are labeled “Mapped” or “Converted”.]


Mapping Rules

Simple production rules over triples

TopicInstance mapped to XMLElem:

S(‘source’, (‘instanceOf’, X, ‘TopicInstance’))
S(‘target’, (‘instanceOf’, X, ‘XMLElem’))


Mapping Rules (cont.)

TopicInstance mapped to XMLElem, TopicType mapped to XMLElemType, and topic_instOf mapped to elem_instOf:

S(‘source’, (‘topic_instOf’, X, Y))
S(‘target’, (‘instanceOf’, X, ‘XMLElem’))
S(‘target’, (‘instanceOf’, Y, ‘XMLElemType’))
S(‘target’, (‘elem_instOf’, X, Y))
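The two rules can be run with a very small pattern matcher. This is a sketch under stated assumptions, not the project's mapping engine: each rule is read as one source triple pattern that produces one or more target triples, X and Y are treated as the only variables, and the names `match` and `apply_rules` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]        # (predicate, subject, object)
VARIABLES = {"X", "Y"}               # assumption: bare X and Y are rule variables


def match(pattern: Triple, triple: Triple) -> Optional[Dict[str, str]]:
    """Return variable bindings if the triple fits the pattern, else None."""
    binding: Dict[str, str] = {}
    for p, t in zip(pattern, triple):
        if p in VARIABLES:
            if binding.setdefault(p, t) != t:   # same variable, different value
                return None
        elif p != t:                            # constants must match exactly
            return None
    return binding


def apply_rules(rules, source: List[Triple]) -> Set[Triple]:
    """Fire every rule on every source triple and collect the target triples."""
    target: Set[Triple] = set()
    for source_pattern, target_patterns in rules:
        for triple in source:
            binding = match(source_pattern, triple)
            if binding is None:
                continue
            for tp in target_patterns:
                target.add(tuple(binding.get(term, term) for term in tp))
    return target


rules = [
    (("instanceOf", "X", "TopicInstance"),
     [("instanceOf", "X", "XMLElem")]),
    (("topic_instOf", "X", "Y"),
     [("instanceOf", "X", "XMLElem"),
      ("instanceOf", "Y", "XMLElemType"),
      ("elem_instOf", "X", "Y")]),
]

source = [
    ("instanceOf", "painter1", "TopicInstance"),
    ("topic_instOf", "painter1", "painter_tt"),
]

target = apply_rules(rules, source)
for t in sorted(target):
    print(t)
```

Collecting the output in a set means a target triple produced by both rules, such as (instanceOf, painter1, XMLElem), appears only once.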




Applications

• SLIM Pad
– Scratchpad application with Bundle-Scrap model (uses superimposed information)

• XML Extractor
– “Extracts” XML information and transforms it into a Topic Map for searching/browsing

[Figure: XML Files, XML Extractor, Generic Rep. (XML model), Generic Rep. (TM model), DBMS, and Topic Map Browser, with arrows labeled “out”, “mapped”, and “stored in”.]


IDMEF to CISL

• IDMEF - Intrusion Detection


Harvesting Information to Sustain our Forests:
Creating an Adaptive Management Portal

NSF DIGITAL GOVERNMENT PROGRAM

Tim Tolle & Lois Delcambre
Co-Project Directors


Project focuses on the:

Adaptive Management Areas

USDA Forest Service
USDI Bureau of Land Management
USDI Fish and Wildlife Service


Adaptive Management Portal: a value-added, Internet-based service

• Provide multiple access paths to forest information.

• Preserve local autonomy and local focus of each site.

• Support diverse users and types of information.

• Use proposed, existing, and de facto standards for content, classification, and technology.

• Be low-cost, scalable, extensible.


Project Funding

• Duration: 3 years

• Budget: $1.5 million

• Principal financial sponsors
– National Science Foundation
– Bureau of Land Management (Oregon State Office)
– Forest Service (R-6 and PNW Station)
– National Park Service (Western Region)


Team Members

Tim Tolle: Regional Coordinator for AMA, US Forest Service
Eric Landis: Forest Information System Specialist, Consultant
Craig Palmer: Natural Resources Monitoring Expert, UNLV
Fred Phillips: Professor, Head, Mgt. of Science and Tech., OGI
Patty Toccalino: Asst. Prof., Environmental Science and Eng., OGI
Lois Delcambre: Professor, Computer Science and Eng., OGI
David Maier: Professor, Computer Science and Eng., OGI
Shawn Bowers: PhD Student, Computer Science and Eng., OGI
Mat Weaver: PhD Student, Computer Science and Eng., OGI

Legend: Forest/environmental expertise; Computer science expertise


Advisory Board

Mark Whiting: Staff Scientist, Pacific Northwest National Laboratory
Regina Rochefort: Science Advisor, USDI, National Park Service
Cynthia L. Miner: Communications Director, USDA Forest Service, PNW Research Station
Monty Knudsen: Chief, Office of Technical Support, Forest Resources, USDI Fish and Wildlife Service
Fred Johnson: Executive Director, IMFN Secretariat
Paul Gorman: MD, Asst. Professor, Division of Medical Informatics and Outcomes Research, OHSU
Martin Goebel: Sustainable Northwest
Robert Devlin: USDA Forest Service, Pacific NW Region
Jeff Burley: President, IUFRO, Oxford Forestry Institute, Dept of Plant Sciences
Michel Biezunski: Co-Inventor of the Topic Map Model

Legend: Forest/environmental expertise; Computer science expertise


Task 1 – Status

• Workshops @ Snoqualmie Pass Adaptive Management Area, Cle Elum, WA (June and July)

• Interviews with Forest Service Corvallis Forest Sciences Lab and USGS FRESC, Corvallis (August)

• Interviews with Central Cascades Adaptive Management Area, Eugene (August)

• Interviews with the Applegate Partnership and its associated agencies (August)

• Rainier National Park (planned for October)


Things we’ve learned from Task 1 (NSF Digital Government)

• work is project-based

• primary product is information: assessments, studies, surveys, environmental impact statements

• multiple agencies are involved

• each agency serves as information gatherer; information broker; information consumer

• even though information is a primary product, information technology is secondary (stewardship of the land is the primary mission)


Research Issues

• Models for the superimposed layer
• How does the superimposed model influence the capabilities it supports?
• How does the form of superimposed information affect the effort to construct and maintain it?
– Are some forms more robust to updates in the base layer?
– What forms map onto current information management tools?


Research Issues (2)

• Challenges when superimposed and base layer have different models
– E.g., structured over unstructured, or vice versa

• Bi-level tools
– Browsing between layers
– Queries over both layers

• How do we delimit the universe of discourse in the base layer?

• Is it easier to fuse superimposed information than base information?


Research Issues (3)

• Variations on the conceptual architecture
– Commingled layers
– “Super-superimposed information”

• How do capabilities of base layer affect structure and operations over superimposed information?
– Addressing modes
– Address comparison
– Querying

• Addressing for non-web sources
– Relational, object-oriented DBs


Research Issues (4)

• How to extend DBMSs to better deal with information they don’t store.

• How to help populate superimposed information spaces.

• What are good formats for representation and exchange of superimposed information?


Why Databases Don’t (Currently) Solve It

• Seems closely related to view and data integration
• However
– Superimposed information can’t always be derived from the base data
– DB approaches assume schema and common model
– DBs like to work with data they control
– Traditional approaches are heavyweight
  • semantic analysis
  • schema integration
  • query mapping
  • on a source-by-source basis