Rough Set Semantics for Identity Management on the Web

Wouter Beek (wouterbeek.com), Stefan Schlobach, Frank van Harmelen

Upload: wouter-beek

Post on 14-Jun-2015


DESCRIPTION

Presented at the AAAI Fall Symposium for Big Data on 2013-11-15.

TRANSCRIPT

Page 1: Rough Set Semantics for Identity Management on the Web

Rough Set Semantics for Identity Management on the Web

Wouter Beek (wouterbeek.com)
Stefan Schlobach
Frank van Harmelen

Page 2: Rough Set Semantics for Identity Management on the Web

Problems of identity

• Statements only hold in certain contexts (no substitution salva veritate).
• Identity is mistaken for representation.
• Identity is mistaken for (close) relatedness.

But more importantly:
• Semantics: identity assertion (claim about meaning)
• Pragmatics: data linking (import additional properties)
• Due to: Open World Assumption

Oedipus wants to marry Jocasta. Oedipus wants to marry his mother.
The website of the White House is not the White House.

Page 3: Rough Set Semantics for Identity Management on the Web

owl:differentFrom(Semantics,Pragmatics)

SEMANTICS iff

PRACTICE

“Link your data to other people’s data to provide context.”

[5-star LOD]

“RDF links often have the owl:sameAs predicate.”

[VoID]

Page 4: Rough Set Semantics for Identity Management on the Web

Can Leibniz help?

• Indiscernibility of identicals (Leibniz’ principle)

• Identity of indiscernibles

• Trivially true, since "being identical to x" is itself one of the properties being quantified over
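For reference, the two principles are usually written as below. The slide's own notation did not survive in this transcript, so the symbols $a$, $b$ and $\varphi$ (ranging over properties) are assumptions taken from the standard textbook formulation.

```latex
\begin{align*}
\text{Indiscernibility of identicals:}\quad
  & a = b \;\rightarrow\; \forall \varphi\,\bigl(\varphi(a) \leftrightarrow \varphi(b)\bigr) \\
\text{Identity of indiscernibles:}\quad
  & \forall \varphi\,\bigl(\varphi(a) \leftrightarrow \varphi(b)\bigr) \;\rightarrow\; a = b
\end{align*}
```

If $\varphi$ may be the property $\lambda x.\,(x = a)$, then $a$ has it, so $b$ must have it as well, which gives $b = a$. This is why the second principle carries no information unless the set of admissible properties is restricted.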

Page 5: Rough Set Semantics for Identity Management on the Web

Solutions (as identified in the literature) [1/2]

1) Weaken owl:sameAs
E.g. skos:closeMatch

2) Extend owl:sameAs
Annotate with fuzziness or uncertainty.

3) Make contexts explicit
E.g. use named graphs
E.g. use namespaces
“That is the star that can be seen in the morning, but not in the evening” @geolocation

Page 6: Rough Set Semantics for Identity Management on the Web

Solutions (as identified in the literature) [2/2]

4) Use domain-specific identity relations
“x and y have the same medical use” @medicine
“x and y are the same molecule” @chemistry

5) Change modeling practice
Notification upon read.
Require reciprocal confirmation upon change.
“On the Web of Data, anybody can say anything about anything.” [Van Harmelen]

Page 7: Rough Set Semantics for Identity Management on the Web

Indiscernibility

Identity is the smallest equivalence relation.

Indiscernibility: resources are the same w.r.t. a limited set of predicates.
Indiscernibility is an equivalence relation (reasoning!), although not necessarily the smallest one.

Every indiscernibility relation is also an identity relation, but over a different domain:
• Example: take the set of people and the property income. This context induces the identity relation between income groups.

Page 8: Rough Set Semantics for Identity Management on the Web

Indiscernibility 1

Two resources are indiscernible w.r.t. a set of predicates $P$ (predicate terms in $G$) if they share the same predicate-object pairs for every predicate in $P$.

Example: “Wouter and Stefan have the same employer, so they are indiscernible w.r.t. the predicate hasEmployer.”

Example: in IMDB, (1) IIMBTBOX:spoken_in, (2) IIMBTBOX:form_of_government.
Note that we are not only interested in the properties that resources share with one another (e.g., where they are spoken, or which form of government they have), but also in resource pairs that share the same set of shared properties.
E.g., one subset of the identity relation does not discern resources that are spoken in the same place, whereas another subset of the identity relation does not discern resources that have the same form of government.
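The definition on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the graph is a plain set of (subject, predicate, object) tuples, the employer value `VU` and the `employedAs` values are made-up sample data, and `objects` and `indiscernible` are hypothetical helper names.

```python
# Hypothetical toy graph as (subject, predicate, object) triples.
TRIPLES = {
    ("Wouter", "hasEmployer", "VU"),
    ("Stefan", "hasEmployer", "VU"),
    ("Wouter", "employedAs", "PhD"),
    ("Stefan", "employedAs", "AssistantProfessor"),
}


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def indiscernible(triples, x, y, predicates):
    """True if x and y have exactly the same objects for every given predicate."""
    return all(objects(triples, x, p) == objects(triples, y, p) for p in predicates)


SAME_EMPLOYER = indiscernible(TRIPLES, "Wouter", "Stefan", {"hasEmployer"})
SAME_ON_BOTH = indiscernible(TRIPLES, "Wouter", "Stefan", {"hasEmployer", "employedAs"})
```

Adding a predicate can only make the relation finer: the pair that is indiscernible w.r.t. `hasEmployer` becomes discernible once `employedAs` is taken into account. Note that in this sketch two resources that both lack a predicate count as agreeing on it, which is a design choice rather than something the slide specifies.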
Page 9: Rough Set Semantics for Identity Management on the Web

Indiscernibility 2

• We take a given identity relation and partition it into subsets (i.e. identity sub-relations) which are described in terms of the vocabulary.
• Subsets of the given identity relation are $P$-indiscernible, for sets of predicates $P$.

Example:
• (Wouter and Albert) and (Stefan and Paul) belong to the same identity sub-relation, since they are indiscernible w.r.t. the same collections of properties.
• Wouter and Albert are “employedAs PhD”; Stefan and Paul are “employedAs Assistant Professor”.
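The partitioning step can be illustrated as follows. The sketch groups resource pairs by the set of predicates on which both members have values and agree; the `livesIn` triples are invented sample data, and `shared_predicates` and `partition_pairs` are hypothetical names, so this is one possible reading of the slide rather than the method used in the talk.

```python
from itertools import combinations

# Hypothetical toy graph; the livesIn triples are made up for illustration.
TRIPLES = {
    ("Wouter", "employedAs", "PhD"),
    ("Albert", "employedAs", "PhD"),
    ("Stefan", "employedAs", "AssistantProfessor"),
    ("Paul", "employedAs", "AssistantProfessor"),
    ("Wouter", "livesIn", "Amsterdam"),
    ("Stefan", "livesIn", "Amsterdam"),
}


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def shared_predicates(triples, x, y):
    """Predicates for which x and y both have values, and the same values."""
    predicates = {p for _, p, _ in triples}
    return frozenset(
        p for p in predicates
        if objects(triples, x, p) and objects(triples, x, p) == objects(triples, y, p)
    )


def partition_pairs(triples, pairs):
    """Group pairs into blocks keyed by their set of shared predicates."""
    blocks = {}
    for x, y in pairs:
        blocks.setdefault(shared_predicates(triples, x, y), set()).add((x, y))
    return blocks


RESOURCES = sorted({s for s, _, _ in TRIPLES})
ALL_PAIRS = list(combinations(RESOURCES, 2))
BLOCKS = partition_pairs(TRIPLES, ALL_PAIRS)
```

Here the pairs (Albert, Wouter) and (Paul, Stefan) end up in the same block, keyed by `{"employedAs"}`, even though the shared value differs between the two pairs. That matches the slide's example: what defines a block is which properties are shared, not which values.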

Page 10: Rough Set Semantics for Identity Management on the Web

Indiscernibility 2

For comparison:

Page 11: Rough Set Semantics for Identity Management on the Web

Example of an indiscernibility partition

Page 12: Rough Set Semantics for Identity Management on the Web

Rough set approximation

Upper approximation:

Lower approximation:

But what is the ‘resemblance’ relation?

[Pawlak 1991]
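The formulas for the two approximations are missing from this transcript. The sketch below uses the usual rough set definitions applied to a partition of resource pairs: the lower approximation collects the blocks that contain only identity pairs, the upper approximation collects the blocks that contain at least one. The sample blocks and the function names are assumptions made for illustration.

```python
def lower_approximation(blocks, identity):
    """Union of the blocks that consist only of identity pairs."""
    return set().union(*(b for b in blocks if b <= identity))


def upper_approximation(blocks, identity):
    """Union of the blocks that contain at least one identity pair."""
    return set().union(*(b for b in blocks if b & identity))


# Hypothetical partition of all pairs into three indiscernibility blocks.
BLOCKS = [
    {("a", "b"), ("c", "d")},
    {("a", "c"), ("b", "d")},
    {("a", "d")},
]
# Hypothetical identity relation (e.g. asserted owl:sameAs pairs).
IDENTITY = {("a", "b"), ("c", "d"), ("a", "c")}

LOWER = lower_approximation(BLOCKS, IDENTITY)
UPPER = upper_approximation(BLOCKS, IDENTITY)
```

The pair ("b", "d") is not asserted to be identical, yet it lands in the upper approximation because it cannot be told apart from ("a", "c") using the available predicates. Pairs in the upper but not the lower approximation are the uncertain ones.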
Page 13: Rough Set Semantics for Identity Management on the Web

Example of indiscernibility approximations

Page 14: Rough Set Semantics for Identity Management on the Web

Quality

• Based on the rough set approximation.
• Since a consistently applied identity relation has relatively many partition sets that contain either no identity pairs (small upper approximation) or only identity pairs (large lower approximation), a more consistent identity relation has a higher quality metric.

The crispness of a set should be proportional to the quality of the identity relation on which it is based.
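The exact quality formula is not preserved in this transcript. A common measure in rough set theory is Pawlak's accuracy of approximation, the size of the lower approximation divided by the size of the upper approximation. The sketch below assumes that measure, so it may differ from the metric used in the talk; the function name and the sample sets are hypothetical.

```python
def approximation_accuracy(lower, upper):
    """Ratio |lower| / |upper|, between 0.0 and 1.0; 1.0 means the set is crisp."""
    if not upper:
        # Assumption: an empty relation is treated as perfectly crisp.
        return 1.0
    return len(lower) / len(upper)


# Hypothetical approximations of an identity relation over resource pairs.
LOWER = {("a", "b"), ("c", "d")}
UPPER = {("a", "b"), ("c", "d"), ("a", "c"), ("b", "d")}

ROUGH_QUALITY = approximation_accuracy(LOWER, UPPER)
CRISP_QUALITY = approximation_accuracy(LOWER, LOWER)
```

Under this measure, an identity relation whose blocks are either entirely inside or entirely outside the relation scores 1.0, and the score drops as more blocks mix identity and non-identity pairs.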
Page 15: Rough Set Semantics for Identity Management on the Web

Generalizations

• This works for any binary relation (not only owl:sameAs).
• We only discussed the identity of non-property resources, but properties can also be identical.
• We skipped the treatment of blank nodes and typed literals (which have special identity criteria).
• The indiscernibility ‘language’ can be made much stronger, allowing more fine-grained identity sub-relations:
  • Length-1 paths, e.g. “Wouter lives in the Netherlands.”
  • Length-2 paths, e.g. “Wouter lives in a country which borders Germany.”
  • Length-$n$ paths.
  • Intervals in the value space of typed literals, e.g. “was published between 1901 and 1905”
  • Natural language translation, e.g. “lives in Germany” and “lives in Deutschland”

Page 16: Rough Set Semantics for Identity Management on the Web

Depth-$n$ Predicate Path Map (PPM)

A sequence of predicates denoting a (functional) mapping from subject terms into sets of object terms:
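A predicate path map can be read as a function that follows a sequence of predicates through the graph. The sketch below is one way to express that reading; the formal definition from the slide is not in this transcript, so `follow_path` and the sample triples are assumptions for illustration only.

```python
def follow_path(triples, subject, path):
    """Objects reachable from `subject` by following the predicates in `path`, in order."""
    frontier = {subject}
    for predicate in path:
        # Advance one step: collect objects of matching triples whose subject is in the frontier.
        frontier = {o for s, p, o in triples if s in frontier and p == predicate}
    return frontier


# Hypothetical toy graph.
TRIPLES = {
    ("Wouter", "livesIn", "Netherlands"),
    ("Netherlands", "borders", "Germany"),
    ("Netherlands", "borders", "Belgium"),
}

DEPTH_1 = follow_path(TRIPLES, "Wouter", ("livesIn",))
DEPTH_2 = follow_path(TRIPLES, "Wouter", ("livesIn", "borders"))
```

A depth-1 path reproduces the plain predicate case, while the depth-2 path captures “Wouter lives in a country which borders Germany.” Two resources would then be indiscernible w.r.t. a path when `follow_path` returns the same set for both.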

Page 17: Rough Set Semantics for Identity Management on the Web

Indiscernibility 1 (generalized)

Two resources are indiscernible w.r.t. a set of PPMs if they share the properties denoted by those PPMs.

Example: “Wouter and Stefan have the same employer, so they are indiscernible w.r.t. has-employer.”

Example: in IMDB, (1) IIMBTBOX:spoken_in, (2) IIMBTBOX:form_of_government.
Note that we are not only interested in the properties that resources share with one another (e.g., where they are spoken, or which form of government they have), but also in resource pairs that share the same set of shared properties.
E.g., one subset of the identity relation does not discern resources that are spoken in the same place, whereas another subset of the identity relation does not discern resources that have the same form of government.
Page 18: Rough Set Semantics for Identity Management on the Web

Indiscernibility 2 (generalized)

We take a given set of pairs (e.g. an identity relation) and partition it into subsets which are described in terms of the schema.
Subsets of the given (identity) relation are indiscernible w.r.t. sets of PPMs.

Page 19: Rough Set Semantics for Identity Management on the Web

Indiscernibility 2 (generalized)

For comparison:

Page 20: Rough Set Semantics for Identity Management on the Web

Conclusion

Problem:
• There is a conflict between the semantics and pragmatics of identity.
• This will not be fixed in the short term by using extensions to existing logics (e.g. contexts, fuzziness, probability).

Solution:
• Identify different identity relations automatically, and in terms of the domain predicates (no extra constructs are needed!).
• Define the meaning of a specific identity relation in terms of its indiscernibility criteria.

If a LOD user does not like the indiscernibility set in use, s/he can:
• redefine the existing indiscernibility relation, by adding additional predicates;
• define a completely new indiscernibility relation, by adding new predicates.