Rough Sets
General introduction and one theorem

V.W. Marek
Department of Computer Science, University of Kentucky
October 2013

Slides: phdopen.mimuw.edu.pl/zima13/marek-slides/rough13.pdf

Sections: Introducing rough sets, Matroids, Theorem of Liu and Zhu

What is it about?

◮ “Rough Sets” is a popular formalism for talking about approximations
◮ Studied especially in Poland, Canada, the US, China, and India
◮ Invented by the late Zdzisław Pawlak
◮ Ties together several areas of science: Statistics, Logic, Universal Algebra, Topology, Combinatorics, even Functional Analysis
◮ Motivated by situations where the language is inadequate to describe collections of objects
◮ (I wrote a few papers on RS and was a coauthor of the first paper on RS)

What is it about, cont’d

◮ I also read a paper by Yanfang Liu and William Zhu of Zhangzhou Normal University, “Parameterized matroid of rough set”, and will present one theorem from it (and its proof)
◮ The reason is that, originally, I doubted that the result was true, and the proof did not make sense to me
◮ This specific result ties Rough Sets to an important area of Combinatorial Optimization and explains why some algorithms for Rough Sets work
◮ This is (unfortunately) the only actual proof that I will present in this series of lectures

Plan

◮ One-table bags of records, and the associated equivalence relation
◮ Rough sets
◮ Matroids
◮ The Liu and Zhu theorem

Database background

◮ A table is a collection of records, possibly with repetition
◮ In other words, a bag, not a set, of records
◮ Let us assume now that we assign to each of these records a unique identifier
◮ Then there is an equivalence relation ∼ on the set of identifiers, namely:

i1 ∼ i2 if i1 and i2 are identifiers of the same record

Example

Here is a table patients:

Id  lname  fname   temp
1   marek  victor  104.2
2   morek  vector  101.2
3   marek  victor  104.2
4   marek  victor   99.6
5   morek  vector  101.2

(But remember that the ids are NOT part of the data)

Here the relation ∼ has three equivalence classes: {1,3}, {2,5}, and {4}
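
The three classes can be recomputed mechanically. The following is a minimal sketch, assuming the table is held as a Python dictionary keyed by identifier; the names `patients` and `equivalence_classes` are illustrative only.

```python
from collections import defaultdict

# The patients table; the ids are bookkeeping only, not part of the data.
patients = {
    1: ("marek", "victor", 104.2),
    2: ("morek", "vector", 101.2),
    3: ("marek", "victor", 104.2),
    4: ("marek", "victor", 99.6),
    5: ("morek", "vector", 101.2),
}

def equivalence_classes(table):
    """Group the identifiers whose records are identical."""
    groups = defaultdict(set)
    for ident, record in table.items():
        groups[record].add(ident)
    # Order the classes by their smallest identifier to get a stable result.
    return sorted(groups.values(), key=min)

classes = equivalence_classes(patients)
print([sorted(c) for c in classes])  # [[1, 3], [2, 5], [4]]
```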

Not every subset is describable

◮ If we implement this table in SQL (what is SQL?), the set consisting of the records with identifiers 1, 2, and 3 cannot be described
◮ The point is that, from the point of view of SQL, the records to which we assigned ids 2 and 5 cannot be distinguished
◮ Only sets that are unions of equivalence classes of ∼ can be described
◮ The set {1,3,4} can be described by:
   SELECT * FROM patients WHERE lname = 'marek'
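
A small sketch of the "union of equivalence classes" test, assuming the three classes of the patients example. The helper name `is_definable` is hypothetical; it only restates the criterion above.

```python
# Equivalence classes of ~ for the patients example.
CLASSES = [{1, 3}, {2, 5}, {4}]

def is_definable(subset, classes=CLASSES):
    """A set is a union of classes iff each class lies inside it or is disjoint from it."""
    return all(c <= subset or not (c & subset) for c in classes)

print(is_definable({1, 2, 3}))  # False: 2 is in, but the indistinguishable 5 is out
print(is_definable({1, 3, 4}))  # True: the union of {1, 3} and {4}
```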

Two kinds of linguistic inadequacy

◮ Say we have a language for description (think of medicine, Pawlak’s original motivation)
◮ There may be sets of objects that we cannot describe (in that language)
◮ It is also possible that there is a description, but it is just too big
◮ This happens when we have plenty of attributes and need to perform attribute reduction to get a human-readable description
◮ When you do this, records may become indistinguishable
◮ (Ever heard of the Johnson-Lindenstrauss Theorem?)

So, we have to approximate

[Figure: a grid with columns a1, ..., a5 (axis A) and rows b1, ..., b5 (axis B); labels S and X]

◮ There is a largest definable set included in a given set X, often called the interior of X, written \underline{X}
◮ There is a smallest definable set containing a given set X, often called the closure of X, written \overline{X}
◮ We see them in our figure
◮ There is a large number of obvious identities for interior and closure
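
Since definable sets are exactly the unions of equivalence classes, interior and closure can be sketched in a few lines. This assumes the classes of the patients example; the function names `lower` and `upper` are illustrative.

```python
CLASSES = [{1, 3}, {2, 5}, {4}]

def lower(x, classes=CLASSES):
    """Interior: the union of the classes entirely contained in x."""
    return set().union(*(c for c in classes if c <= x))

def upper(x, classes=CLASSES):
    """Closure: the union of the classes that meet x."""
    return set().union(*(c for c in classes if c & x))

X = {1, 2, 3}
print(sorted(lower(X)))  # [1, 3]
print(sorted(upper(X)))  # [1, 2, 3, 5]
```

Two of the obvious identities are easy to check on this example: \underline{X} ⊆ X ⊆ \overline{X}, and the closure of X is the complement of the interior of the complement of X.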

Topological angle

◮ (What about the Alexandrov topology in our context?)
◮ It is just that we are not topologists, and we think about our objects as database objects
◮ This, of course, has consequences: we implement Rough Sets

A few important facts

◮ The equivalence class of x, written [x], is {y : y ∼ x}
◮ The interior of the set X is the union of the equivalence classes included in X
◮ The closure of the set X is the union of the equivalence classes that have a nonempty intersection with X
◮ There are various characterizations of interior and closure, in various terms
◮ One such characterization, by Mirek Truszczynski and myself, is that the pair 〈\underline{X}, \overline{X}〉 is the best approximation of X in the Kleene ordering of pairs of definable sets
◮ (The Kleene ordering is the order of approximations in which the lower class goes “up” and the upper class goes “down”)
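
The Kleene-ordering characterization can be checked by brute force on the patients example. This is a sketch under one assumption about the wording: "best" is read as "greatest among the pairs (L, Up) of definable sets with L ⊆ X ⊆ Up". All names are illustrative.

```python
from itertools import combinations

CLASSES = [frozenset({1, 3}), frozenset({2, 5}), frozenset({4})]

def definable_sets(classes=CLASSES):
    """All unions of equivalence classes."""
    result = []
    for r in range(len(classes) + 1):
        for chosen in combinations(classes, r):
            result.append(frozenset().union(*chosen))
    return result

def kleene_leq(p, q):
    """p <= q: the lower set grows and the upper set shrinks."""
    return p[0] <= q[0] and q[1] <= p[1]

def best_approximation(x):
    """Greatest pair (L, Up) of definable sets with L <= x <= Up, by exhaustive search."""
    sets = definable_sets()
    pairs = [(lo, up) for lo in sets for up in sets if lo <= x <= up]
    best = [p for p in pairs if all(kleene_leq(q, p) for q in pairs)]
    assert len(best) == 1
    return best[0]

lo, up = best_approximation(frozenset({1, 2, 3}))
print(sorted(lo), sorted(up))  # [1, 3] [1, 2, 3, 5]
```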

Rough sets, formally

◮ Given an equivalence relation ∼ on a set U, the rough set determined by a set X ⊆ U is the pair 〈\underline{X}, \overline{X}〉
◮ A rough subset of U is then a pair determined by some subset X of U
◮ Besides the characterization mentioned above, there are other characterizations of rough sets: in terms of topology, in terms of Boolean algebras with operators, etc.
◮ The person who invented Rough Sets (no longer with us), Professor Zdzisław Pawlak, wrote an often-quoted book on the subject
◮ There is a journal, Transactions on Rough Sets, and even a Rough Sets Society
◮ There are plenty of conferences on Rough Sets, in all sorts of places
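
Different subsets X can determine the same rough set. The sketch below counts the rough subsets of U for the patients example; the helper names are illustrative only.

```python
from itertools import combinations

CLASSES = [{1, 3}, {2, 5}, {4}]
U = sorted(set().union(*CLASSES))

def lower(x):
    """Union of the classes contained in x."""
    return frozenset().union(*(c for c in CLASSES if c <= x))

def upper(x):
    """Union of the classes that meet x."""
    return frozenset().union(*(c for c in CLASSES if c & x))

subsets = [set(s) for r in range(len(U) + 1) for s in combinations(U, r)]
rough_sets = {(lower(x), upper(x)) for x in subsets}
print(len(subsets), len(rough_sets))  # 32 18
```

The count 18 = 3 · 3 · 2 has a simple explanation: a two-element class can be missed, cut, or covered by X, while the singleton class {4} can only be missed or covered.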

Matroid

◮ A matroid is a combinatorial structure that attempts to capture the notions behind concepts such as an independent set of vectors in a vector space
◮ But also cycle-free subgraphs of an undirected graph
◮ Formally, a matroid is a pair 〈A, M〉 where M consists of (some) subsets of A and satisfies the following conditions:
   ◮ ∅ ∈ M
   ◮ If A ⊆ B and B ∈ M then A ∈ M
   ◮ If A, B ∈ M and |A| < |B|, then for some x ∈ B \ A, A ∪ {x} ∈ M
◮ (This definition of a matroid is abstracted from linearly independent subsets of a vector space)
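
For tiny ground sets the three conditions can be tested exhaustively. This is a brute-force sketch, not an efficient test; `is_matroid` and `powerset` are hypothetical helper names, and the two sample families are made up for illustration.

```python
from itertools import combinations

def powerset(ground):
    items = sorted(ground)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

def is_matroid(ground, family):
    """Check the three conditions by brute force (fine for tiny ground sets)."""
    family = {frozenset(s) for s in family}
    if not all(s <= ground for s in family):
        return False
    if frozenset() not in family:
        return False
    # Closed under subsets.
    for b in family:
        if any(a not in family for a in powerset(b)):
            return False
    # Exchange property.
    for a in family:
        for b in family:
            if len(a) < len(b) and not any(a | {x} in family for x in b - a):
                return False
    return True

ground = {1, 2, 3}
uniform = [s for s in powerset(ground) if len(s) <= 2]  # all sets with at most 2 elements
broken = [set(), {1}, {2}, {3}, {2, 3}]                  # {1} cannot be extended from {2, 3}
print(is_matroid(ground, uniform))  # True
print(is_matroid(ground, broken))   # False
```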

Matroids, cont’d

◮ This last property is called the Steinitz exchange property, and whoever has taken a class in linear algebra must have heard about it
◮ The matroid is one of the fundamental combinatorial structures
◮ There are many other characterizations of matroids, in various terms
◮ One important connection between matroids and Computer Science is the so-called Rado-Edmonds Theorem, which characterizes greedy algorithms in terms of (weighted) matroids
◮ (Look up an absolute classic: Witold Lipski, jr., Kombinatoryka dla programistow, ISBN 82-204-2968-4)

Why are matroids important?

◮ They occur in many places, but the important point is the characterization of greedy algorithms via matroids
◮ Say we have a set A and a weight function wt : A → R+
◮ The weight of a set S ⊆ A is Σ_{x∈S} wt(x)

Rado-Edmonds Theorem

◮ We sort the set A according to weights, in descending order
◮ The Rado-Edmonds Theorem tells us that if a family F ⊆ P(A) is a matroid, and we select greedily (i.e., we initialize X to the empty set, and in each step we select a fresh element x of maximum weight such that X ∪ {x} is in F, and then set X := X ∪ {x}), then we will compute a base of maximum weight (what is a base?)
◮ When no fresh x ∈ A such that X ∪ {x} belongs to F can be found, we return X
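
The greedy selection can be sketched as follows. Two assumptions: the family F is given as a membership test (the parameter `independent`, a hypothetical name), and F is closed under subsets, so a single pass over the sorted elements does the same job as repeatedly searching for a fresh element. The sample family, sets with at most two elements, is only an illustration.

```python
def greedy(ground, weight, independent):
    """Greedy selection; `independent` decides membership in the family F."""
    chosen = set()
    # Scan the elements in descending order of weight.
    for x in sorted(ground, key=weight, reverse=True):
        if independent(chosen | {x}):
            chosen.add(x)
    return chosen

weights = {"a": 5, "b": 3, "c": 4, "d": 1}
best = greedy(weights, weights.get, lambda s: len(s) <= 2)
print(sorted(best), sum(weights[x] for x in best))  # ['a', 'c'] 9
```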

Rado-Edmonds Theorem, cont’d

◮ Conversely, if F is not the family of independent sets of a matroid, then there is a weight function for which we will not get a maximum-weight element of F
◮ (If you had a serious data-structures course, then you certainly learned these facts, if you were paying attention)
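
A tiny made-up instance of the converse: the family below is closed under subsets but violates the exchange property, and with the chosen weights the greedy selection misses the optimum. The family and the weights are illustrative assumptions, not taken from the lecture.

```python
def greedy(ground, weight, independent):
    chosen = set()
    for x in sorted(ground, key=weight, reverse=True):
        if independent(chosen | {x}):
            chosen.add(x)
    return chosen

# Closed under subsets, but {a} cannot be extended by any element of {b, c}.
FAMILY = [set(), {"a"}, {"b"}, {"c"}, {"b", "c"}]
weights = {"a": 3, "b": 2, "c": 2}

def total(s):
    return sum(weights[x] for x in s)

picked = greedy(weights, weights.get, lambda s: s in FAMILY)
optimum = max(FAMILY, key=total)
print(sorted(picked), total(picked))    # ['a'] 3
print(sorted(optimum), total(optimum))  # ['b', 'c'] 4
```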

Matroids associated with rough sets

◮ Let U be a set of objects, and ∼ an equivalence relation on U. Let Y ⊆ U
◮ Then Y determines a collection M_Y of subsets of U, namely

M_Y = {A ⊆ U : \underline{A} ⊆ Y}

◮ In our simple example, with 1 ∼ 3, 2 ∼ 5, and 4 in relation with itself only, the set X = {1,2,3} determines the following class M_X: the empty set ∅; the one-element sets {1}, {2}, {3}, {5} (but not {4}). What about {1,2}, {1,3}, {2,3}? And are there more?
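
The questions about M_X can be answered by enumeration. The sketch below assumes the reading in which M_Y consists of the sets whose interior is included in Y (the bars over and under the letters did not survive in the transcript, so this reading is an assumption); `family_m` is a hypothetical name.

```python
from itertools import combinations

CLASSES = [{1, 3}, {2, 5}, {4}]
U = sorted(set().union(*CLASSES))

def lower(a):
    """Interior: union of the classes contained in a."""
    return set().union(*(c for c in CLASSES if c <= a))

def family_m(y):
    """M_Y under the assumed reading: all A whose interior is included in y."""
    subsets = (set(s) for r in range(len(U) + 1) for s in combinations(U, r))
    return [a for a in subsets if lower(a) <= y]

m_x = family_m({1, 2, 3})
for a in m_x:
    print(sorted(a))
print(len(m_x))  # 12
```

Under this reading a set belongs to M_X exactly when it avoids 4 and does not contain both 2 and 5, so {1,2}, {1,3}, {2,3} and {1,2,3} are all members.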

Matroids and rough sets, cont’d

◮ Here is the result of Liu and Zhu:
◮ Let ∼ be an equivalence relation on the set U. For every set Y ⊆ U, the structure M_Y is a matroid
◮ (We will prove that)
◮ Since the structure M_Y obviously is closed under subsets (“if a set grows(?) smaller, the interior grows smaller”), the first two conditions are obvious
◮ So now, let us assume that we have two sets A, B in M_Y, with |A| < |B|. We need to find an object x ∈ B \ A such that A ∪ {x} also belongs to M_Y
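
Before the proof, the statement can be sanity-checked by brute force on the patients example, for every one of the 32 choices of Y. This is only a check on one small instance, again assuming that M_Y consists of the sets whose interior is included in Y; all names are illustrative.

```python
from itertools import combinations

CLASSES = [{1, 3}, {2, 5}, {4}]
U = sorted(set().union(*CLASSES))

def powerset(items):
    items = sorted(items)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

def lower(a):
    return frozenset().union(*(c for c in CLASSES if c <= a))

def family_m(y):
    """Assumed reading of M_Y: all A whose interior is included in y."""
    return {a for a in powerset(U) if lower(a) <= y}

def is_matroid(family):
    if frozenset() not in family:
        return False
    if any(sub not in family for b in family for sub in powerset(b)):
        return False
    # Exchange property for every pair with |a| < |b|.
    return all(
        any(a | {x} in family for x in b - a)
        for a in family for b in family if len(a) < len(b)
    )

print(all(is_matroid(family_m(y)) for y in powerset(U)))  # True
```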

Case 1

◮ Some x ∈ B \ A has the property that [x] = {x}
◮ That is, x is in relation ∼ with itself but not with any other object
◮ We claim that this specific x has the property that A ∪ {x} ∈ M_Y
◮ Indeed, because [x] is a singleton and x ∉ A,

\underline{A ∪ {x}} = \underline{A} ∪ {x}

◮ Now, \underline{A} ⊆ Y (because A ∈ M_Y), and also x ∈ Y, because B ∈ M_Y and {x} ⊆ \underline{B} ⊆ Y
◮ Thus in this case the matter is easy

Case 2

◮ No x ∈ B \ A has the property that [x] = {x}
◮ Our idea now is to assume that for no x ∈ B \ A does A ∪ {x} belong to M_Y, and to work toward a contradiction
◮ The fact that A ∪ {x} ∉ M_Y means that \underline{A ∪ {x}} is strictly larger than \underline{A}
◮ But there are only two possibilities: either \underline{A ∪ {x}} is \underline{A}, or it is \underline{A} ∪ [x]
◮ The first possibility does not hold, so the other one must hold

What does it mean?

◮ This means that all y ≠ x with y ∼ x are in A!
◮ And this happens for all x ∈ B \ A
◮ The next question we ask is whether it is possible that x ∼ y for some x, y ∈ B \ A with x ≠ y
◮ If that were the case, then [x] = [y], and [x] \ {x} ⊆ A and also [y] \ {y} ⊆ A
◮ But then [x] ⊆ A, contradicting the fact that x ∉ A

What is going on?

◮ For each x ∈ B \ A, all the elements y ≠ x such that x ∼ y are in A!
◮ Moreover, because we are in Case 2, every x ∈ B \ A is in relation ∼ with some other element, and hence with some element of A (actually of A \ B)
◮ Indeed, [x] ⊆ B is impossible: it would give [x] ⊆ \underline{B} ⊆ Y and hence \underline{A ∪ {x}} = \underline{A} ∪ [x] ⊆ Y, that is, A ∪ {x} ∈ M_Y
◮ Let us select, for each x ∈ B \ A, one element y such that x ∼ y and y ∉ B
◮ Then, because [x] \ {x} ⊆ A, this function maps B \ A into A \ B

A bit of combinatorics

[Figure: The injection of B \ A into A \ B]

◮ No object in B \ A is ∼-related to any different object in B \ A, so distinct elements of B \ A are sent to elements of distinct equivalence classes
◮ In fact, the function is an injection of B \ A into A \ B!
◮ But then |B \ A| ≤ |A \ B|, and so |B| = |B \ A| + |A ∩ B| ≤ |A \ B| + |A ∩ B| = |A|
◮ This contradicts the fact that |A| < |B| and completes the argument

A few other things

◮ Here is a characterization of M_Y:

M_Y = {A : \underline{A} ⊆ \underline{Y}}

◮ And one more:

M_Y = {A : (\underline{A} \ Y) = ∅}

◮ There are plenty of other similarly easy characterizations of M_Y
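
The agreement of such characterizations is easy to test on the patients example. The sketch compares three candidate descriptions for every Y: interior of A included in Y, interior of A included in the interior of Y, and interior of A minus Y empty. The exact placement of the bars in these formulas is an assumption, since it was lost in the transcript.

```python
from itertools import combinations

CLASSES = [{1, 3}, {2, 5}, {4}]
U = sorted(set().union(*CLASSES))
SUBSETS = [frozenset(s) for r in range(len(U) + 1) for s in combinations(U, r)]

def lower(a):
    return frozenset().union(*(c for c in CLASSES if c <= a))

def agree(y):
    """Do the three descriptions give the same family for this y?"""
    by_definition = {a for a in SUBSETS if lower(a) <= y}
    first = {a for a in SUBSETS if lower(a) <= lower(y)}
    second = {a for a in SUBSETS if not (lower(a) - y)}
    return by_definition == first == second

print(all(agree(y) for y in SUBSETS))  # True
```

The first equality holds in general because the interior of A is definable and the interior of Y is the largest definable set included in Y.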

More stuff

◮ Discussing this with Professor M. Truszczynski of my Department, I learned a few things
◮ It is quite possible that these things are in the many papers of W. Zhu and his coauthors
◮ We will present Mirek’s suggestions now

Selectors

◮ Let F be a family of pairwise disjoint nonempty sets
◮ A set Z ⊆ ⋃F is a selector for F if for all T ∈ F, |Z ∩ T| = 1
◮ In general, something called the Axiom of Choice (what is it?) is needed to guarantee that selectors exist, but if F is a finite family of finite sets, then no special axioms are needed
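
For a finite family of finite sets, the selectors can simply be listed: pick one element from each member in every possible way. A minimal sketch, with `selectors` as an illustrative name and the three classes of the patients example as input:

```python
from itertools import product

def selectors(family):
    """All selectors of a finite family of pairwise disjoint nonempty finite sets."""
    blocks = [sorted(t) for t in family]
    # One element from each block, in every possible combination.
    return [set(choice) for choice in product(*blocks)]

family = [{1, 3}, {2, 5}, {4}]
for z in selectors(family):
    print(sorted(z))
```

Note that the empty family has exactly one selector, the empty set.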

Family FY

◮ Now, let Y be a subset of U
◮ The family F_Y is defined as follows:

F_Y = {[x] : [x] ∩ \underline{Y} = ∅}

◮ Thus F_Y consists of the equivalence classes of ∼ that are disjoint from \underline{Y}, that is, the classes not included in Y
◮ This family F_Y, if nonempty, consists of nonempty sets only, and so has nonempty selectors

Family FY , cont’d

◮ All the selectors S for F_Y are of the same size, namely |F_Y|
◮ Here is what Prof. Truszczynski observed: the maximal sets in M_Y are precisely the sets of the form U \ S, where S is a selector for F_Y
◮ Therefore the family M_Y can be characterized as follows:

M_Y = {X : ∃T (T is a selector for F_Y and X ∩ T = ∅)}

◮ (This can be used for an alternative proof of the fact that M_Y is a matroid)
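
Both observations can be tested exhaustively on the patients example. The sketch rests on two assumptions about notation lost in the transcript: M_Y consists of the sets whose interior is included in Y, and F_Y consists of the classes disjoint from the interior of Y. All helper names are illustrative.

```python
from itertools import combinations, product

CLASSES = [frozenset({1, 3}), frozenset({2, 5}), frozenset({4})]
U = frozenset().union(*CLASSES)

def powerset(items):
    items = sorted(items)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

def lower(a):
    return frozenset().union(*(c for c in CLASSES if c <= a))

def family_m(y):
    return {a for a in powerset(U) if lower(a) <= y}

def selectors_of_f(y):
    f_y = [c for c in CLASSES if not (c & lower(y))]  # classes disjoint from the interior of y
    return [frozenset(choice) for choice in product(*f_y)]

def check(y):
    m_y, sels = family_m(y), selectors_of_f(y)
    maximal = {a for a in m_y if not any(a < b for b in m_y)}
    by_selectors = {x for x in powerset(U) if any(not (x & t) for t in sels)}
    return maximal == {U - s for s in sels} and m_y == by_selectors

print(all(check(y) for y in powerset(U)))  # True
```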

Conclusions

◮ Even though rough sets are such a fundamental data structure, people still find new and interesting facts
◮ In the case of the result we presented, there was a new technology (matroids) that we used
◮ Maybe it will be useful in further investigations