
Page 1: From Documents to Knowledge Models

From Documents to Knowledge Models

Max Völkel

Forschungszentrum Informatik an der Universität Karlsruhe (TH)

Page 2: From Documents to Knowledge Models

© 2007 Max Völkel, FZI, 29.03.07, ProKW @ WM2007, Potsdam, Germany

http://xam.de/2007/doc2km2

Personal Knowledge Management

Definition: knowledge cues [Haller]

any kind of symbol, pattern or artefact which evokes some knowledge in a person’s mind, when viewed or used.

Knowledge cues can be stored and retrieved on a computer – while knowledge itself may or may not.

Ok, in fact you store bits (signals)

Page 3: From Documents to Knowledge Models

What is a Document?

A team of 50 French researchers discussed …

Page 4: From Documents to Knowledge Models

Definition: Document

A team of 50 French researchers could agree on:

Document as form: a container which assembles and structures the content to make it easier for the reader to understand.

Document as sign: emphasizes the argumentative structure of the content. A document can be referenced and acts as a sign for its meaning.

Document as medium: "reading contract" = the author's intention or assumption about what will happen with the document.

Page 5: From Documents to Knowledge Models

Document (my definition) I/II

A document consists of information atoms. An information atom is the smallest unit of content which can be interpreted without a document's context (but of course requiring background knowledge). For text, these atoms are single words.

Document

Author, audience, goal

Packaging – establishes a context

Reference-ability – reference to a published document can act as a placeholder for the content expressed within.

Process metadata – such as authors, audience and goal, should be sent along with the document

Page 6: From Documents to Knowledge Models

Document (my definition) II/II

A document is a knowledge artefact consisting of several layers:

Linearity – defined order for navigating through all information items

Visual Structure – guides the reader informally: type-setting (e.g. bold, italics, different font styles and sizes), placement of figures, pages; carries additional information

Logical Structure – can reference smaller parts within a document, e.g. paragraphs, headlines, footnotes, citations, and title

Argumentative Structure – used to convey the content to the reader. Argumentative structures appear on all scales. A typical structure is the "Introduction – Related work – Contribution – Conclusion" pattern of scientific articles. On smaller scales, patterns like "claim-proof" and "question-answer" are used.

Content Semantics – content means something. Building upon logical and argumentative structure, the author encodes statements about a domain within the content.

Page 7: From Documents to Knowledge Models

Ted Nelson

I propose a different document agenda:

I believe we need new electronic documents which are transparent, public, principled, and freed from the traditions of hierarchy and paper.

Page 8: From Documents to Knowledge Models

What do people want?

Why?

Page 9: From Documents to Knowledge Models

What is a Wiki? What's new compared to CMS?

Easy Contribution: shorter time-to-publication; wiki pages can be created and edited by any user quickly and easily

Easy Writing: simple text formatting with wiki syntax, without the need to learn HTML

Easy Linking: automatic linking converts written names of pages, images and websites to links

Recent Changes: see what has happened (awareness)

Diff function: shows the latest changes; easily check whether changes are OK

Fulltext search: for page titles and text

Backlink function: shows which pages link to the current page; find the context of this page

Deep linking: directly link deep into a wiki using readable names

Wikis were the first deployed, collaborative hypertext authoring environments

People want more links
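To make the "Easy Linking" and "Backlink function" points concrete, here is a minimal Python sketch. It assumes that page names are CamelCase "WikiWords", which is only one of several conventions wikis use; the function names and the sample pages are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: page names are CamelCase "WikiWords" (e.g. HomePage).
# Real wiki engines use varying link syntaxes.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def outgoing_links(text, known_pages):
    """Return the known page names mentioned in the text, in order of first appearance."""
    seen = []
    for name in WIKI_WORD.findall(text):
        if name in known_pages and name not in seen:
            seen.append(name)
    return seen


def backlinks(pages):
    """Map each page name to the sorted list of pages that link to it."""
    result = defaultdict(list)
    for source, text in pages.items():
        for target in outgoing_links(text, pages):
            if target != source:
                result[target].append(source)
    return {target: sorted(sources) for target, sources in result.items()}


# Hypothetical sample wiki
pages = {
    "HomePage": "Start at KnowledgeModel or read about WikiSyntax.",
    "KnowledgeModel": "A superset of documents; see WikiSyntax.",
    "WikiSyntax": "Back to HomePage.",
}

print(backlinks(pages)["WikiSyntax"])
```

The backlink index is simply the inverted link table, which is why a wiki can offer "find the context of this page" at almost no extra authoring cost.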

Page 10: From Documents to Knowledge Models

What is a Model? Typed entities and typed relations

My definition based on OMG metamodel MOF

[Figure: real world from the viewpoint of the individual (EntityX, EntityY, ArtifactX, ArtifactY); Modelling (TypeA1, TypeB1, TypeC1); (Meta-)Modelling (TypeA2, TypeB2, TypeC2)]

Page 11: From Documents to Knowledge Models

What is a Knowledge Model?

Aspect | Document | Ontology | Knowledge Model
Information atoms | Text (paragraphs, images, multimedia resources) | Concepts | Items (text, images, other binary resources)
- Text | Short (headlines) and longer (paragraphs) | Short labels | Anything from short labels to structured documents
Order | Strict linear order | – | Yes, may be partial and have cycles
Hierarchy | Yes (chapters, sections, paragraphs, sentences) | Yes | Yes, may be partial and have cycles
Annotations | Yes (footnotes) | Yes | Yes
- Tagging (annotation with keywords) | – | – | Yes
- Typing (incl. inferencing) | – | Yes | Yes
Hyperlinks | Yes (internal references and external citations) | – | Yes, don't have to occur inside text
Visual layout | Yes | – | –

Page 12: From Documents to Knowledge Models

From Documents to Knowledge Models

From analogue to digital documents

smaller content granularity

more interconnected content

more explicit structures.

Knowledge models

very small information atoms, such as single words

Richly connected items

explicit semantics for the links.

Definition

A knowledge model is a superset of documents and formal ontologies.

Annotated documents, stored together with their annotations, can be seen as a knowledge model.
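As an illustration of the last statement, the following Python sketch turns a linear document plus keyword annotations into a small set of items and typed links. It is only a sketch under stated assumptions: the class `KnowledgeModel`, the function `from_annotated_document`, the link names "before" and "annotates", and the sample data are all hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeModel:
    items: dict = field(default_factory=dict)  # item id -> content
    links: set = field(default_factory=set)    # (source id, link type, target id)


def from_annotated_document(paragraphs, annotations):
    """Build a model from ordered paragraphs plus {paragraph index: [keywords]}."""
    model = KnowledgeModel()
    for i, text in enumerate(paragraphs):
        model.items[f"p{i}"] = text
        if i > 0:
            # The implicit linear order of the document becomes an explicit link.
            model.links.add((f"p{i - 1}", "before", f"p{i}"))
    for index, keywords in annotations.items():
        for keyword in keywords:
            tag_id = f"tag:{keyword}"
            # Each keyword is itself an item, so it can be linked and annotated too.
            model.items.setdefault(tag_id, keyword)
            model.links.add((tag_id, "annotates", f"p{index}"))
    return model


# Hypothetical sample document with keyword annotations
model = from_annotated_document(
    ["Introduction", "Claim", "Proof"],
    {1: ["thesis"], 2: ["thesis", "evidence"]},
)
print(len(model.items), len(model.links))
```

The point of the sketch is that nothing is lost in the conversion: the document order survives as links, while the annotations become first-class, addressable items.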

Page 13: From Documents to Knowledge Models

What is a CDS? Conceptual Data Structures

[Figure: an Item and its relations: context/detail, before/after, source/target, annotation/annotationMember]

M. Völkel and H. Haller: Conceptual Data Structures (CDS) - Towards an Ontology for Semi-Formal Articulation of Personal Knowledge. In Proc. of the 14th International Conference on Conceptual Structures 2006. Aalborg University, Denmark, July 2006.

Page 14: From Documents to Knowledge Models

What is a CDS-based Knowledge Model?

A set of addressable items (text, images, maybe even multimedia elements)

Relations between items, classified in four types:

Source/target: the generic, directed hyperlink

Before/after: ordering relations, linear navigation

Context/detail: hierarchical relations, document and concept hierarchies

Annotation/annotationMember: annotations, which give the ability to type items and relations; items are used as types (meta-modeling)

Knowledge models must be able to capture work-in-progress

CDS is not strict: you can have cycles, untyped items, paradoxical ordering, …
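The description above can be sketched as a small data structure. This is a minimal Python illustration, not the CDS implementation: the class and method names (`Relation`, `CdsModel`, `relate`, `related`, `types_of`) and the sample items are assumptions made for this example. The first item of a relation takes the first role (source, before, context, annotation), the second item the second role.

```python
from enum import Enum


class Relation(Enum):
    # The four relation types; the value names the two roles (first/second item).
    LINK = "source/target"
    ORDER = "before/after"
    HIERARCHY = "context/detail"
    ANNOTATION = "annotation/annotationMember"


class CdsModel:
    def __init__(self):
        self.items = {}         # item id -> content (text, image reference, ...)
        self.relations = set()  # (first item id, Relation, second item id)

    def add_item(self, item_id, content):
        self.items[item_id] = content

    def relate(self, first, relation, second):
        # Only addressability is checked. Cycles, untyped items and
        # contradictory orderings are deliberately allowed (work-in-progress).
        if first not in self.items or second not in self.items:
            raise KeyError("both items must exist in the model")
        self.relations.add((first, relation, second))

    def related(self, item_id, relation):
        """Items in the second role of `relation` for the given first item."""
        return sorted(b for a, r, b in self.relations
                      if a == item_id and r is relation)

    def types_of(self, item_id):
        """Items that annotate (and thereby type) the given item."""
        return sorted(a for a, r, b in self.relations
                      if r is Relation.ANNOTATION and b == item_id)


# Hypothetical sample model
model = CdsModel()
model.add_item("paper", "From Documents to Knowledge Models")
model.add_item("intro", "Introduction")
model.add_item("related", "Related work")
model.add_item("Section", "Section")  # an ordinary item used as a type

model.relate("paper", Relation.HIERARCHY, "intro")
model.relate("paper", Relation.HIERARCHY, "related")
model.relate("intro", Relation.ORDER, "related")
model.relate("related", Relation.ORDER, "intro")  # paradoxical ordering is accepted
model.relate("Section", Relation.ANNOTATION, "intro")

print(model.related("paper", Relation.HIERARCHY), model.types_of("intro"))
```

Because types are ordinary items, the same structure covers modelling and meta-modelling, and the missing validation is a design choice that keeps unfinished models storable.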

Page 15: From Documents to Knowledge Models

CDS: A Hierarchy of Relations

Undirected Relation: related/related

Directed Linking: source/target

Labelled Links: …/…-inverse

Order: before/after

Hierarchy: detail/context

Annotation: annotation/annotationMember

Instantiation: type/instance

Tagging: tag/tagMember

Subclassing: is-a/superclass-of

Equivalency: equivalent

Task priority

Document order

Legend: each box reads "Relation Type: relation/inverse"; relations range from informal to formal

Page 16: From Documents to Knowledge Models

Motivation

Page 17: From Documents to Knowledge Models

Examples for Knowledge Models

Fiction Writing

Simulation

Req. Engineering

Engineering

Thinking

Page 18: From Documents to Knowledge Models

How does Writing/Reading work?

Writing / Sending

Write down ideas

Group them

Structure them

Add argumentation structures

Add references to literature

Link pieces in a first draft

Add introduction and conclusion

Repeat until coherent flow

Publish document

Reading / Receiving

Visualise the structure graphically

Connect new structures with existing own structures

Mind maps

Text processing

Reference Manager

???

???

Mind maps

„Von der Idee zum Text“ [Esselborn 2004]

Page 19: From Documents to Knowledge Models

The tool chains break

Create a new slide show out of three old presentations plus one from your colleague. Why not have the content in smaller, more logical chunks?

Re-use the motivation part of an old paper for a new one. If you find a misspelling, why should you have to fix it twice?

Search a stack of paper notes with good ideas. Why are those not in your computer?

Search email archives to find out what the high-level architecture for the new authentication system is. Why not browse your PKM and see the relations?

Page 20: From Documents to Knowledge Models

Technological Developments

accelerated distribution by many orders of magnitude

lower costs

[Figure: communication speed and cost over time, from analog to digital; milestones: written language, printing press, internet]

Page 21: From Documents to Knowledge Models

Cost of Communication

Data transmission is cheap now

Total cost of communication to send content to n people:

  | choosing relevant parts of the personal model |
+ | encoding of model parts in document parts |
+ | ordering document parts strictly linear/hierarchical |
+ n · ( | data transmission |
      + | linear reading of the document |
      + | decoding of model parts from document parts |
      + | creating a networked model out of model parts |
      + | integrating new model into existing model | )
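The cost formula above can be evaluated with a few lines of Python. The step names and all numbers below are invented for illustration; only the structure (three sender steps paid once, five steps paid once per recipient) comes from the slide.

```python
# Steps paid once by the sender
SENDER_STEPS = ("choose", "encode", "order")
# Steps paid once per recipient
RECIPIENT_STEPS = ("transmit", "read", "decode", "network", "integrate")


def total_cost(costs, n):
    """Total cost of sending content to n people, following the slide's formula."""
    sender = sum(costs[step] for step in SENDER_STEPS)
    per_recipient = sum(costs[step] for step in RECIPIENT_STEPS)
    return sender + n * per_recipient


# Made-up unit costs; data transmission is set to 0 because it "is cheap now".
costs = {
    "choose": 2, "encode": 5, "order": 8,
    "transmit": 0, "read": 3, "decode": 3, "network": 4, "integrate": 2,
}

print(total_cost(costs, 1), total_cost(costs, 10))
```

With these made-up numbers the sender's share (15) exceeds the per-recipient share (12) when n = 1, while for n = 10 the recipients' side clearly dominates, which is the trade-off the following slides discuss.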

Page 22: From Documents to Knowledge Models

Cost of Communication

Where can we save, if n is small?

Total cost of communication to send content to n people:

  | choosing relevant parts of the personal model |
+ | encoding of model parts in document parts |
+ | ordering document parts strictly linear/hierarchical |
+ n · ( | data transmission |
      + | linear reading of the document |
      + | decoding of model parts from document parts |
      + | creating a networked model out of model parts |
      + | integrating new model into existing model | )

Page 23: From Documents to Knowledge Models

Cost of Communication

Total cost of communication to send content to n people:

  | choosing relevant parts of the personal model |
+ | encoding of model parts in document parts |
+ | ordering document parts strictly linear/hierarchical |
+ n · ( | data transmission |
      + | linear reading of the document |
      + | decoding of model parts from document parts |
      + | creating a networked model out of model parts |
      + | integrating new model into existing model | )

Page 24: From Documents to Knowledge Models

Current process – culture is document-centric

[Figure: cost for sender and recipient(s)]

Page 25: From Documents to Knowledge Models

Ideal process – What if knowledge models, rather than documents, were exchanged between people?

[Figure: cost for sender and recipient(s)]

Page 26: From Documents to Knowledge Models

Realistic (improved) process – use both

[Figure: cost for sender and recipient(s)]

Page 27: From Documents to Knowledge Models

Information Management Problems – Solution: Knowledge Models

Problem: Under-utilisation of the interlinked nature of information [Oren]
Solution: the fine-granular nature of knowledge models allows for precise and effective linking, and browsing

Problem: People have problems in using strict hierarchies [Oren]
Solution: classification methods like tagging and non-strict taxonomies

Problem: Keep the context [Oren]
Solution: the networked nature of a knowledge model is more suited to represent contextual links than a set of documents

Problem: Granularity
Solution: represent more than the content of just one document

Page 28: From Documents to Knowledge Models

When to use Knowledge Models?

Use domain-specific tools & languages: standardised representation formalisms; established data exchange processes

Use personal knowledge models: unstructured, semi-structured, semi-formal and formal parts; ad-hoc formalisation; cheaper to create, easier to integrate

Use documents: costly to create; cheap to read, sometimes the best solution; hard to integrate

[Figure axes: fixed domain vs. open domain or multiple domains; audience: myself, my team, my community, broad audience]

Page 29: From Documents to Knowledge Models

Related Work in Semantic Authoring

Initial ideas, although that term was not used, can already be found in V. Bush and D. Engelbart

ABCDE Format from Anita de Waard

Semantically annotated LaTeX (SALT) by Tudor Groza

Systems allowing end-users to construct ontologies out of their linked information objects.

L. Ludwig sees redundancy within and among documents as a hurdle to efficient information usage. The traditional notion of a document is replaced by virtual documents, which render parts of the knowledge base as an interactive tree.

Bernstein describes TinderBox, a "personal content management assistant", which offers sophisticated HTML generation via templates.

The Gnowsis system by Sauermann allows linking desktop objects and integrates with a wiki

iMapping – semantic concept maps by Haller

Same direction in the fields of semantic desktop and semantic wiki

Semantic Web Content Repository (swecr)

Page 30: From Documents to Knowledge Models

Conclusion

Documents: document-centered culture is a costly legacy artefact and bottleneck for our society

Personal knowledge models: superset of documents and ontologies; integrate with the semantic desktop; make knowledge workers happier and more productive

Authoring is the bottleneck: we should bring the power of modeling to the end-user; don't break the tool chain; focus on work-in-progress

Thank you very much for your attention

Contact: Max Völkel