
Web Science & Technologies, University of Koblenz-Landau, Germany

Provenance in the Semantic Web

Steffen Staab

Joint work with Simon Schenk, Renata Dividino, Christoph Ringelstein

Steffen [email protected]

Summer School Semantic Web2

WeST

Institute WeST – Web Science & Technologies

Semantic Web, Web Retrieval, Interactive Web, Multimedia Web, Software Web

eGovernment, eMedia, eScience, eOrganizations, ePerson

Institute for Computer Science

Institute for Information Systems

Leibniz Institute for Social Sciences (GESIS)


Do you care where your data comes from?


How to lose 1,000,000,000 US$ in half a day

Via @Bauckhage


+++ "Los Angeles (dpa) – According to a report by the local broadcaster vpk-tv, a suicide attack is said to have taken place in the small Californian town of Bluewater. There were reportedly two explosions in a restaurant..." +++

German Press Agency DPA, 10 Sep 2009


Guerilla Marketing


Hoax: better check who said what when, and whether you actually want to trust some information

Losing your reputation quickly…


Defining Provenance

Provenance … means the origin… of something, or the history of the ownership or location of an object. The term was … used for works of art, but is now … including science and computing. … In most fields, the primary purpose of provenance is to confirm or gather evidence as to the time, place, and, when appropriate, the person responsible for the creation, production, or discovery of the object. This will typically be accomplished by tracing the whole history of the object up to the present

http://en.wikipedia.org/wiki/Provenance, May 31, 2011


The situation…

Call to Ontoprise from an insurance company: "Can you integrate our 5000 databases?"

EU IP experience (Large Engineering Company): "Oh, we just found another PC that has several tens of thousands of relevant documents"

Linked open data cloud


Some of the problems…

I have this piece of data. Can I actually believe it?
Default answer: Find some expert and ask him.

I have this inconsistency in my data. Who has introduced it and why?
Default answer: Try to find it in the sources.

I have this piece of data. How can I use it? Can I show it to anyone?
Default answer: You are not allowed to do anything with it. Just throw it away.


Two Types of Provenance Knowledge

Provenance labels for facts

Which authority?

Which confidence?

When?

Who? Which privileges?

Bluewater is a City


Two Types of Provenance Knowledge

Provenance labels for facts (e.g. "Bluewater is a City")
• Which authority?
• Which confidence?
• When?
• Who? Which privileges?

Open Provenance Model
• RDF graph representing who did what, when, why, … to a data item
• "ex post" workflow instance
• audit/re-enact

[Figure: workflow with five numbered steps, labelled admission, examination, asking permit, examination, prepare share]


SPARQL QUERYING USING PROVENANCE

R. Dividino, S. Sizov, S. Staab, B. Schüler. Querying for Provenance, Trust, Uncertainty and other Meta Knowledge in RDF. In: Journal of Web Semantics. Elsevier, 7(3), 2009, pp. 204-219.


Representing Provenance Using URIs

http://bluewater.us a dbpedia:city.
http://bluewater.us assertedBy http://neverest.de.

BUT:
http://bluewater.us a dbpedia:fakecity.
http://bluewater.us assertedBy http://dpa.de.

Who said what?


Representing Provenance Using URIs - 2

http://neverest.de/bluewater a dbpedia:city.
http://neverest.de/bluewater assertedBy http://neverest.de.

http://dpa.de/bluewater owl:sameAs http://neverest.de/bluewater.

http://dpa.de/bluewater a dbpedia:fakecity.
http://dpa.de/bluewater assertedBy http://dpa.de.

What is the meaning of owl:sameAs now for provenance?


Representing Provenance Using Named Graphs

http://dpa.de/ontology {
  dbpedia:locatedIn rdfs:domain dbpedia:company.
  dbpedia:locatedIn rdfs:range dbpedia:city.
  …
}

http://neverest.de/kb {
  http://bluewater.us a dbpedia:city.
  http://vkptv.com a dbpedia:company.
}

http://dpa.de/provenance {
  http://dpa.de/ontology dpa:lrm "2000-01-01".
  http://dpa.de/ontology dpa:trust "highest".
  http://neverest.de/kb dpa:lrm "2009-09-09".
  http://neverest.de/kb dpa:trust "lowest".
}
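To make the named-graph idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the dictionaries and the helper `meta_for` are hypothetical, and the lookup assumes that every fact simply inherits the meta knowledge attached to the graph that asserts it.

```python
# Hypothetical in-memory model: each named graph maps to a set of (s, p, o) triples.
graphs = {
    "http://dpa.de/ontology": {
        ("dbpedia:locatedIn", "rdfs:domain", "dbpedia:company"),
        ("dbpedia:locatedIn", "rdfs:range", "dbpedia:city"),
    },
    "http://neverest.de/kb": {
        ("http://bluewater.us", "a", "dbpedia:city"),
        ("http://vkptv.com", "a", "dbpedia:company"),
    },
}

# The provenance graph talks about the other graphs by name.
provenance = {
    ("http://dpa.de/ontology", "dpa:lrm", "2000-01-01"),
    ("http://dpa.de/ontology", "dpa:trust", "highest"),
    ("http://neverest.de/kb", "dpa:lrm", "2009-09-09"),
    ("http://neverest.de/kb", "dpa:trust", "lowest"),
}


def meta_for(triple):
    """Return {graph name: {property: value}} for every graph asserting the triple.

    Assumption: a fact inherits the meta knowledge of its containing graph.
    """
    result = {}
    for name, triples in graphs.items():
        if triple in triples:
            result[name] = {p: o for (s, p, o) in provenance if s == name}
    return result


info = meta_for(("http://bluewater.us", "a", "dbpedia:city"))
print(info)
```

The point of the design is that the fact itself stays untouched: "who said it" and "how much do we trust it" live in a separate graph keyed by the graph name.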


Ambiguity in Representing Provenance

Distributive Reading: Each of the two facts is assigned low trust

Cumulative Reading: Taken together, the two facts are assigned low trust

http://neverest.de/kb {
  http://bluewater.us a dbpedia:city.
  http://vkptv.com a dbpedia:company.
}

What does { … http://neverest.de/kb dpa:trust "lowest". … } mean?

Both readings are plausible under appropriate circumstances, but
• Cumulative reading is harder to specify
• Cumulative reading requires closing of sets of facts (contrast to RDF open world semantics)


Meta Knowledge: When?

Meta knowledge dimension: set of values $\Omega_D$ plus two operators $\vee_D$ and $\wedge_D$ such that $(\Omega_D, \vee_D)$ and $(\Omega_D, \wedge_D)$ are partial orders with a maximum

Least Recently Modified Date: $\Omega_L = \text{xsd:dateTime}$, $\vee_L$, $\wedge_L$ s.t.
$lrm(a \wedge_L b) = \max(lrm(a), lrm(b))$
$lrm(a \vee_L b) = \min(lrm(a), lrm(b))$

Total order; $\vee_L$ and $\wedge_L$ are dual operators


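Meta Knowledge: Who?

Meta knowledge dimension: set of values $\Omega_D$ plus two operators $\vee_D$ and $\wedge_D$ such that $(\Omega_D, \vee_D)$ and $(\Omega_D, \wedge_D)$ are partial orders with a maximum

Provenance: $\Omega_P = 2^{SOURCES}$, $\vee_P$, $\wedge_P$ s.t.
$prov(a \wedge_P b) = prov(a) \cup prov(b)$
$prov(a \vee_P b) = prov(a) \cup prov(b)$

Partial order; $\vee_P$ and $\wedge_P$ are the same operator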
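The two dimensions can be sketched in a few lines of Python. This is only an illustration: the class `Dimension` and its field names are made up here, and the sketch assumes that "and" maps to max and "or" to min for the modification date, and that both operators are set union for the sources, in line with the min(max(…)) formula used later in the talk.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Dimension(Generic[T]):
    """Hypothetical container for one meta knowledge dimension."""
    name: str
    conj: Callable[[T, T], T]  # combines values of facts that are used together
    disj: Callable[[T, T], T]  # combines values of alternative derivations


# "When?": modification dates, a total order with dual operators (assumed mapping).
LRM = Dimension("lrm", conj=max, disj=min)

# "Who?": sets of sources, where both operators are set union (assumed mapping).
PROV = Dimension("prov", conj=frozenset.union, disj=frozenset.union)

older, newer = date(2009, 9, 8), date(2009, 9, 9)
print(LRM.conj(older, newer))  # date of a conclusion that needs both facts
print(LRM.disj(older, newer))  # date of a conclusion that needs either fact
print(PROV.conj(frozenset({"dpa"}), frozenset({"neverest"})))
```

Keeping the operators as plain functions means a further dimension (for example trust levels) only needs another `Dimension(...)` instance.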


SPARQL: Algebraic Graph Query Languages

SELECT ?city ?broadcaster
WHERE {
  ?city a ex:city .
  { ?broadcaster ex:activeIn ?city }
  UNION
  { ?broadcaster ex:locatedIn ?city }
}

[WWW08,JoWS09]


SPARQL: Algebraic Graph Query Languages

[Figure: the query above as an algebra tree over its three triple patterns (numbered 1, 2, 3), with a join (|><|) and a union, annotated with min and max and with the modification dates 2009-09-08 and 2009-09-09]

[WWW08,JoWS09]
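A small Python sketch of how modification dates could travel through such an algebra tree. The facts and their dates are invented for illustration, and the sketch assumes that the union combines alternative matches with min, that the join combines its inputs with max, and that duplicate solutions from the two union branches are merged.

```python
from datetime import date

# Hypothetical facts with their modification dates (lrm); not taken from the slides.
facts = {
    ("ex:bluewater", "a", "ex:city"): date(2009, 9, 9),
    ("ex:vkptv", "ex:activeIn", "ex:bluewater"): date(2009, 9, 8),
    ("ex:vkptv", "ex:locatedIn", "ex:bluewater"): date(2009, 9, 9),
}


def match(predicate, obj):
    """Yield (subject, lrm) for every fact with the given predicate and object."""
    for (s, p, o), lrm in facts.items():
        if p == predicate and o == obj:
            yield s, lrm


def query():
    """Evaluate the city/broadcaster query and attach an lrm value to each solution."""
    results = {}
    for city, lrm_city in match("a", "ex:city"):
        # UNION: alternative ways to bind ?broadcaster, combined with min.
        union = {}
        for predicate in ("ex:activeIn", "ex:locatedIn"):
            for broadcaster, lrm in match(predicate, city):
                union[broadcaster] = min(union.get(broadcaster, lrm), lrm)
        # JOIN with "?city a ex:city": both inputs are needed, combined with max.
        for broadcaster, lrm_union in union.items():
            results[(city, broadcaster)] = max(lrm_city, lrm_union)
    return results


solutions = query()
print(solutions)
```

With these sample dates the union yields 2009-09-08 and the join lifts the result to 2009-09-09.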


OWL REASONING USING PROVENANCE

S. Schenk, R. Dividino, S. Staab. Ontology Debugging Using Provenance. In: Journal of Web Semantics, Elsevier, accepted for publication.


Do we trust that Bluewater is a real city?

German Press Agency, Highest trust, 2001-01-03

Neverest, Low trust, 2009-09-09


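Explanation (Pinpointing)

Given ontology $O$, axiom $\alpha$, $O' \subseteq O$

$O'$ is an explanation (pinpoint) for $\alpha$ wrt. $O$ iff $O' \models \alpha$ and $O^* \not\models \alpha$ for all $O^* \subset O'$

[Figure: example with four axioms, numbered 1 to 4]

Explanation formula?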
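The definition can be tried out with a brute-force Python sketch. Everything here is illustrative: a toy forward-chaining check over Horn-style axioms stands in for an OWL reasoner, the axiom names A1 to A4 and their content are invented, and the subset enumeration is exponential, so it only shows what a pinpoint is, not how to compute one efficiently.

```python
from itertools import combinations


def entails(axioms, goal):
    """Toy entailment: forward chaining over (premises, conclusion) pairs."""
    known = set()
    changed = True
    while changed:
        changed = False
        for premises, conclusion in axioms:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return goal in known


def pinpoints(ontology, goal):
    """All minimal subsets O' of the ontology with O' |= goal."""
    found = []
    names = sorted(ontology)
    for size in range(1, len(names) + 1):  # smallest subsets first
        for subset in combinations(names, size):
            chosen = set(subset)
            if any(p <= chosen for p in found):
                continue  # contains a smaller explanation, so it is not minimal
            if entails([ontology[n] for n in chosen], goal):
                found.append(chosen)
    return found


# Hypothetical ontology: two independent ways to derive place(bluewater).
ontology = {
    "A1": ((), "city(bluewater)"),
    "A2": (("city(bluewater)",), "place(bluewater)"),
    "A3": ((), "town(bluewater)"),
    "A4": (("town(bluewater)",), "place(bluewater)"),
}

explanations = pinpoints(ontology, "place(bluewater)")
print(explanations)
```

Read as a formula, the two explanations found here correspond to (A1 and A2) or (A3 and A4).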


Finding Pinpoints

[Figure: ontology O and a subset O']


Computation of meta knowledge for OWL

Query: Meta Knowledge for $\alpha$

Compute Pinpointing Formula for $\alpha$ wrt. $O$:
$(A_1 \wedge \dots \wedge A_m) \vee \dots \vee (Z_1 \wedge \dots \wedge Z_n)$

Insert Meta Knowledge degrees and operators:
$\min(\max(lrm(A_1), \dots, lrm(A_m)), \dots, \max(lrm(Z_1), \dots, lrm(Z_n)))$

Evaluate [KI 2009, SWPM2009]
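The last two steps amount to a single expression once the pinpointing formula is available. A minimal Python sketch, with hypothetical axiom names and dates, and with the formula given as a list of conjunctions:

```python
from datetime import date


def lrm_of_formula(explanations, lrm):
    """Evaluate min over the disjuncts of max over each conjunction's axioms."""
    return min(max(lrm[axiom] for axiom in conjunction) for conjunction in explanations)


# Hypothetical pinpointing formula (A1 and A2) or (A3 and A4) with invented dates.
explanations = [{"A1", "A2"}, {"A3", "A4"}]
lrm = {
    "A1": date(2001, 1, 3),
    "A2": date(2000, 1, 1),
    "A3": date(2009, 9, 9),
    "A4": date(2000, 1, 1),
}

answer = lrm_of_formula(explanations, lrm)
print(answer)
```

For another dimension only the two aggregation functions change, for example set union in both positions for sources.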


$(A_1 \wedge \dots \wedge A_m) \vee \dots \vee (Z_1 \wedge \dots \wedge Z_n)$

$\min(\max(lrm(A_1), \dots, lrm(A_m)), \dots, \max(lrm(Z_1), \dots, lrm(Z_n)))$

"Least recently modified?"


Optimization: Syntactic relevance


Optimized Computation of Provenance

[Figure: graph of axioms annotated with the values 8, 5, 9, 2, 7 and 3, with the labels TimeOrder and SyntacticRelevance]

Oracle for you: relevant pinpoint

Color codes reachability


Optimized Computation of Provenance

[Figure: annotated axiom graph]

Relevant pinpoint only contained


Evaluation: Computing Provenance in Milliseconds

Real-world provenance!


PROVENANCE AWARE POLICY LANGUAGE


[Figure: Middle Rhine Hospital (mrh), step 1 "admission", with create operations on Health Record, Policies and Sticky Log]

(P1): ukob is allowed to process health records for research purposes. However, ukob is not allowed to transfer the health records of patients to other organizations.

(P2): The mrh demands that the record is only accessed by ukob after the sharing of the health records is approved by the patient, and the approval must have been confirmed by a doctor.


[Figure: workflow at Middle Rhine Hospital over Health Record, Policies and Sticky Log, with the steps admission, examination, asking permit, examination, prepare share and share for research (numbered 1 to 6) and the operations create, update, de-id., encrypt, fulfill, check and transfer; the record is transferred to "You"]

Sticky Log:

step (record, {mrh}, {}, create, patient_treatment, 1, {0})
step (record, {mrh}, {}, update, examination, 2, {1})
reduced (record, hidden, hidden, update, hidden, 4, {2})
step (record, {mrh}, {}, de-identified, privacy, 5, {4})
attribute (record, de-identified, true, 5)
step (record, {mrh}, {ukob}, transfer, research, 6, {5})


[Figure: the same workflow and sticky log, with the transfer in step 6 being checked]

permit (6)?

(P3): permit (ID) IF (step (record, _, _, transfer, _, ID, _) AND attribute (record, de-identified, true, ID)).
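How P3 could be checked against the sticky log is sketched below in Python. This is a reading of the slide, not the authors' policy engine: the tuple layout mirrors the log entries, the "reduced" entry is treated like a step with hidden fields, and the `inherit` switch is an assumption added here. With `inherit=True` an attribute set at an earlier step (here step 5) is taken to still hold at later steps reached through the predecessor links; with `inherit=False` the attribute must carry exactly the ID of the transfer step.

```python
# (data, actors, recipients, action, purpose, id, predecessors), as in the sticky log.
log_steps = [
    ("record", {"mrh"}, set(), "create", "patient_treatment", 1, {0}),
    ("record", {"mrh"}, set(), "update", "examination", 2, {1}),
    ("record", "hidden", "hidden", "update", "hidden", 4, {2}),  # the "reduced" entry
    ("record", {"mrh"}, set(), "de-identified", "privacy", 5, {4}),
    ("record", {"mrh"}, {"ukob"}, "transfer", "research", 6, {5}),
]
# (data, attribute, value, id)
attributes = [("record", "de-identified", True, 5)]


def ancestors(step_id):
    """IDs of the step itself and of all steps reachable via predecessor links."""
    preds = {s[5]: s[6] for s in log_steps}
    seen, todo = set(), [step_id]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(preds.get(current, ()))
    return seen


def permit(step_id, inherit=True):
    """P3: the step is a transfer of the record and the record is de-identified."""
    is_transfer = any(
        s[0] == "record" and s[3] == "transfer" and s[5] == step_id for s in log_steps
    )
    valid_ids = ancestors(step_id) if inherit else {step_id}
    de_identified = any(
        a[0] == "record" and a[1] == "de-identified" and a[2] is True and a[3] in valid_ids
        for a in attributes
    )
    return is_transfer and de_identified


print(permit(6), permit(6, inherit=False))
```

Under the inheriting reading the transfer in step 6 is permitted; under the strict reading it is not, because the attribute was logged at step 5.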


CONCLUSION


Data Value lies in

Past
• Knowing what happened to your data
• Knowing why it happened to your data

Present
• Drawing the right conclusions from your data

Future
• Deciding upon the destiny of your data

Your Strategy is based on Provenance! You better take care!


Core References

W3C working group: http://www.w3.org/2011/prov/wiki/Main_Page

IEEE Internet Computing, Vol 15, Issue 1, Jan/Feb 2011, Special Issue on "Provenance in Web Applications", http://www.computer.org/portal/web/csdl/abs/html/mags/ic/2011/01/mic2011010017.htm

Journal of Web Semantics, Volume 9, Issue 2, 2011, Special Issue on "Provenance in the Semantic Web", http://www.sciencedirect.com/science/journal/15708268, http://websemanticsjournal.org


Core References of Our Own Work

Provenance in RDF
R. Dividino, S. Sizov, S. Staab, B. Schüler. Querying for Provenance, Trust, Uncertainty and other Meta Knowledge in RDF. In: Journal of Web Semantics. Special issue on "The Web of Data". Elsevier, 7(3), 2009, pp. 204-219.

Provenance in OWL S. Schenk, R. Dividino, S. Staab, N. Kurz. Ontology Debugging Using

Provenance. In: Journal of Web Semantics. Special issue on “Ontology Dynamics“, Elsevier, 9(3), 2011.

Provenance for Policy Languages
C. Ringelstein, S. Staab. Provenance-aware Policy Definition and Execution. In: IEEE Internet Computing, special issue on Provenance in Web Applications, Jan/Feb 2011, pp. 49-58.

Capturing Provenance in Distributed Workflows
C. Ringelstein, S. Staab. DiALog: A Distributed Model for Capturing Provenance and Auditing Information. International Journal of Web Services Research (JWSR), Idea Group Publishing, 7(2): 1-20, 2010.


Thank You!

http://west.uni-koblenz.de

See you again at…