Rough Sets
Introducing rough sets
Matroids
Theorem of Liu and Zhu
Rough Sets: General introduction and one theorem
V.W. Marek
Department of Computer Science, University of Kentucky
October 2013
What is it about?
◮ “Rough Sets” is a popular formalism for talking about approximations
◮ Studied especially in Poland, Canada, the US, China, and India
◮ Invented by the late Zdzisław Pawlak
◮ Ties together several areas of science: Statistics, Logic, Universal Algebra, Topology, Combinatorics, even Functional Analysis
◮ Motivated by situations where the language is inadequate to describe collections of objects
◮ (I wrote a few papers on RS and was a coauthor of the first paper on RS)
What is it about, cont’d
◮ I also read a paper by Yanfang Liu and William Zhu of Zhangzhou Normal University, “Parameterized matroid of rough set”, and will present one theorem (and its proof)
◮ The reason is that, originally, I doubted that the result was true, and the proof did not make sense
◮ This specific result ties Rough Sets to an important area of Combinatorial Optimization and explains why some algorithms for Rough Sets work
◮ This is (unfortunately) the only actual proof that I will present in this series of lectures
Plan
◮ One-table bags of records, and the associated equivalence relation
◮ Rough sets
◮ Matroids
◮ Liu and Zhu theorem
Database background
◮ A table is a collection of records, possibly with repetition
◮ In other words, a bag, not a set, of records
◮ Let us now assume that we assign to each of these records a unique identifier
◮ Then there is an equivalence relation ∼ on the set of identifiers, namely:
  i1 ∼ i2 if i1, i2 are identifiers of the same record
Example
Here is a table patients:

Id  lname  fname   temp
1   marek  victor  104.2
2   morek  vector  101.2
3   marek  victor  104.2
4   marek  victor  99.6
5   morek  vector  101.2

(But remember that the ids are NOT part of the data)
Here the relation ∼ has three equivalence classes: {1,3}, {2,5}, and {4}
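The classes above can be recomputed mechanically. The following is a minimal sketch, assuming the table is held as a Python dictionary from identifiers to records; the names `patients` and `equivalence_classes` are illustrative only.

```python
from collections import defaultdict

# The table from the slide; the ids are bookkeeping, not part of the data.
patients = {
    1: ("marek", "victor", 104.2),
    2: ("morek", "vector", 101.2),
    3: ("marek", "victor", 104.2),
    4: ("marek", "victor", 99.6),
    5: ("morek", "vector", 101.2),
}

def equivalence_classes(table):
    """Group identifiers whose records are identical."""
    groups = defaultdict(set)
    for ident, record in table.items():
        groups[record].add(ident)
    # Order the classes by their smallest identifier for readability.
    return sorted(groups.values(), key=min)

classes = equivalence_classes(patients)
print([sorted(c) for c in classes])  # [[1, 3], [2, 5], [4]]
```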
Not every subset is describable
◮ If we implement this table in SQL (what is SQL?), the set consisting of the records with identifiers 1, 2, and 3 cannot be described
◮ The point is that, from the point of view of SQL, the records with ids 2 and 5 cannot be distinguished
◮ Only sets that are unions of equivalence classes of ∼ can be described
◮ The set {1,3,4} can be described by:
  SELECT * FROM patients WHERE lname = 'marek'
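Describability can be tested directly from the equivalence classes. This is a small sketch, assuming the classes of the patients example are given as a list of sets and that X is a subset of the universe; `is_definable` is a hypothetical helper name.

```python
classes = [{1, 3}, {2, 5}, {4}]  # classes of ~ in the patients example

def is_definable(classes, X):
    """True if X is a union of equivalence classes."""
    X = set(X)
    # Every class must lie inside X or be disjoint from X.
    return all(c <= X or not (c & X) for c in classes)

print(is_definable(classes, {1, 3, 4}))  # True
print(is_definable(classes, {1, 2, 3}))  # False
```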
Two kinds of linguistic inadequacy
◮ Say we have a language for description (think medicine, the original motivation of Pawlak)
◮ There may be sets of objects we cannot describe (given that language)
◮ It is also possible that there is a description, but it is just too big
◮ This happens when we have plenty of attributes and need to perform attribute reduction to get a human-readable description
◮ When you do this, records may become indistinguishable
◮ (Ever heard about the Johnson-Lindenstrauss Theorem?)
So, we have to approximate
[Figure: a grid with axes A (a1 to a5) and B (b1 to b5), showing regions labelled S and X]
◮ There is a largest definable set included in a given set X, often called the interior of X and written $\underline{X}$
◮ There is a smallest definable set containing a given set X, often called the closure of X and written $\overline{X}$
◮ We see them in our figure
◮ There is a large number of obvious identities for interior and closure
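Interior and closure are easy to compute once the equivalence classes are known. The sketch below assumes the classes of the patients example; the function names `lower` and `upper` are my own choice, not notation from the slides.

```python
classes = [{1, 3}, {2, 5}, {4}]  # classes of ~ in the patients example

def lower(classes, X):
    """Interior: union of the classes included in X."""
    X = set(X)
    return {e for c in classes if c <= X for e in c}

def upper(classes, X):
    """Closure: union of the classes that meet X."""
    X = set(X)
    return {e for c in classes if c & X for e in c}

X = {1, 2, 3}
print(sorted(lower(classes, X)))  # [1, 3]
print(sorted(upper(classes, X)))  # [1, 2, 3, 5]
```

One of the obvious identities is visible here: the interior is included in X, and X is included in the closure.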
Topological angle
◮ (What about Alexandrov Topology in our context?)
◮ It is just that we are not topologists, and think about our objects as database objects
◮ This, of course, has consequences; we implement Rough Sets
A few important facts
◮ The equivalence class of x, [x], is {y : y ∼ x}
◮ The interior of the set X is the union of the equivalence classes included in X
◮ The closure of the set X is the union of the equivalence classes that have a nonempty intersection with X
◮ There are various characterizations of interior and closure, in various terms
◮ One such characterization, by Mirek Truszczynski and myself, is that the pair $\langle \underline{X}, \overline{X} \rangle$ is the best approximation of X in the Kleene ordering of pairs of definable sets
◮ (Kleene ordering is the order of approximations where the lower class goes “up” and the upper class goes “down”)
Rough sets, formally
◮ Given an equivalence relation ∼ on a set U, the rough set determined by a set X ⊆ U is the pair $\langle \underline{X}, \overline{X} \rangle$
◮ Then a rough subset of U is a pair determined by some subset X of U
◮ Besides the characterization mentioned above, there are other characterizations of rough sets: in terms of topology, in terms of Boolean Algebras with operators, etc.
◮ The person who invented Rough Sets (no longer with us), Professor Zdzisław Pawlak, wrote an often-quoted book on the subject
◮ There is a journal, Transactions on Rough Sets, and even a Rough Sets Society
◮ There are plenty of conferences on Rough Sets, in all sorts of places
Matroid
◮ A matroid is a combinatorial structure that attempts to capture the notions behind concepts such as an independent set of vectors in a vector space
◮ But also cycle-free subgraphs of an undirected graph
◮ Formally, a matroid is a pair ⟨A, M⟩ where M consists of (some) subsets of A and satisfies the following conditions:
  ◮ ∅ ∈ M
  ◮ If A ⊆ B and B ∈ M then A ∈ M
  ◮ If A, B ∈ M, |A| < |B| then for some x ∈ B \ A, A ∪ {x} ∈ M
◮ (This definition of a matroid abstracts from linearly independent subsets of a vector space)
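For small finite families the three conditions can be checked by brute force. The following sketch assumes the family is given explicitly as a collection of sets with sortable elements; `powerset` and `is_matroid` are illustrative names, and the approach is exponential, so it only suits toy examples.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of a finite set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_matroid(family):
    """Check the three conditions of the definition on an explicit family."""
    fam = {frozenset(s) for s in family}
    if frozenset() not in fam:            # the empty set is a member
        return False
    for b in fam:                         # closed under subsets
        if any(a not in fam for a in powerset(b)):
            return False
    for a in fam:                         # exchange property
        for b in fam:
            if len(a) < len(b) and not any(a | {x} in fam for x in b - a):
                return False
    return True

# All subsets of {1, 2, 3} with at most two elements: a matroid.
print(is_matroid([s for s in powerset({1, 2, 3}) if len(s) <= 2]))  # True
# Exchange fails for {1} and {2, 3}: not a matroid.
print(is_matroid([set(), {1}, {2}, {3}, {2, 3}]))  # False
```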
Matroids, cont’d
◮ This last property is called the Steinitz exchange property, and whoever has taken a linear algebra class must have heard about it
◮ The matroid is one of the fundamental combinatorial structures
◮ There are many other characterizations of matroids in various terms
◮ One important connection between matroids and Computer Science is the so-called Rado-Edmonds Theorem, which characterizes greedy algorithms in terms of (weighted) matroids
◮ (Look up an absolute classic: Witold Lipski, jr., Kombinatoryka dla programistów, ISBN 82-204-2968-4)
Why are matroids important?
◮ They occur in many places, but the important point is the characterization of greedy algorithms via matroids
◮ Say we have a set A and a weight function wt : A → R+
◮ The weight of a set S ⊆ A is $\sum_{x \in S} wt(x)$
Rado-Edmonds Theorem
◮ We sort the set A according to weights in descending order
◮ The Rado-Edmonds Theorem tells us that if a family F ⊆ P(A) is a matroid and we select greedily (i.e. we initialize X to the empty set and in each step select a fresh maximum-weight element x such that X ∪ {x} is in F, and then set X := X ∪ {x}), then we will compute a base of maximum weight (what is a base?)
◮ When no fresh x ∈ A such that X ∪ {x} belongs to F can be found, we return X
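The greedy selection described above can be sketched as follows. This is an illustration under assumptions: the family F is supplied as a membership test `independent`, the weights as a dictionary `wt`, and the example family (all sets with at most two elements) is a hypothetical choice. A single pass over the sorted elements is enough when F is closed under subsets, because an element rejected once can never become admissible later.

```python
def greedy(ground, wt, independent):
    """Scan elements by descending weight; keep x whenever X | {x} stays in F."""
    X = set()
    for x in sorted(ground, key=lambda e: wt[e], reverse=True):
        if independent(X | {x}):
            X.add(x)
    return X

wt = {"a": 5, "b": 1, "c": 4, "d": 3}
# Hypothetical example family: all subsets with at most two elements.
result = greedy(wt.keys(), wt, lambda s: len(s) <= 2)
print(sorted(result), sum(wt[x] for x in result))  # ['a', 'c'] 9
```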
Rado-Edmonds Theorem, cont’d
◮ Conversely, if F is not the family of independent sets of a matroid, then there is a weight function for which we will not get a maximum-weight element of F
◮ (If you had a serious data-structures course, then you certainly learned these facts, if you were paying attention)
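A tiny example shows how greedy selection can fail outside matroids. The family and weights below are made up for illustration: the family is closed under subsets, but the exchange property fails for {a} and {b, c}.

```python
# A hypothetical family closed under subsets but without the exchange property.
family = [set(), {"a"}, {"b"}, {"c"}, {"b", "c"}]
wt = {"a": 3, "b": 2, "c": 2}

def weight(s):
    return sum(wt[x] for x in s)

def greedy(ground, wt, independent):
    """Scan elements by descending weight; keep x whenever X | {x} stays in F."""
    X = set()
    for x in sorted(ground, key=lambda e: wt[e], reverse=True):
        if independent(X | {x}):
            X.add(x)
    return X

greedy_set = greedy(wt.keys(), wt, lambda s: s in family)
best_set = max(family, key=weight)
print(sorted(greedy_set), weight(greedy_set))  # ['a'] 3
print(sorted(best_set), weight(best_set))      # ['b', 'c'] 4
```

Greedy commits to the heaviest element a and then cannot add anything, while the heaviest member of the family is {b, c}.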
Matroids associated with rough sets
◮ Let U be a set of objects, and ∼ an equivalence relation on U. Let Y ⊆ U
◮ Then Y determines a collection $M_Y$ of subsets of U, namely
  $M_Y = \{A \subseteq U : \underline{A} \subseteq Y\}$
◮ In our simple example, with 1 ∼ 3, 2 ∼ 5, and 4 in relation with itself only, the set X = {1,2,3} determines the following class $M_X$: the empty set ∅; the one-element sets {1}, {2}, {3}, {5} (but not {4}). What about {1,2}, {1,3}, {2,3}? And are there more?
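The family for the example can be enumerated by brute force. The sketch below reads the defining condition as "the interior (lower approximation) of A is included in Y", which is the reading consistent with the example and with the proof that follows; `lower` and `family_M` are illustrative names.

```python
from itertools import combinations

U = [1, 2, 3, 4, 5]
classes = [{1, 3}, {2, 5}, {4}]  # classes of ~ in the patients example

def lower(classes, A):
    """Interior: union of the classes included in A."""
    A = set(A)
    return {e for c in classes if c <= A for e in c}

def family_M(U, classes, Y):
    """All A within U whose interior is included in Y (assumed reading of M_Y)."""
    Y = set(Y)
    subsets = (set(c) for r in range(len(U) + 1) for c in combinations(U, r))
    return [A for A in subsets if lower(classes, A) <= Y]

M_X = family_M(U, classes, {1, 2, 3})
print(len(M_X))  # 12
for A in M_X:
    print(sorted(A))
```

Under this reading, a set belongs to the family exactly when it omits 4 and does not contain both 2 and 5.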
Matroids and rough sets, cont’d
◮ Here is the result of Liu and Zhu:
◮ Let ∼ be an equivalence relation on the set U. For every set Y ⊆ U, the structure $M_Y$ is a matroid
◮ (We will prove that)
◮ Since $\emptyset \in M_Y$ and $M_Y$ is obviously closed under subsets (“if a set grows(?) smaller, the interior grows smaller”), the first two conditions are obvious
◮ So now let us assume that we have two sets A, B in $M_Y$ with |A| < |B|. We need to find an object x ∈ B \ A such that A ∪ {x} also belongs to $M_Y$
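Before going through the proof, the statement can be sanity-checked by brute force on the running example. This is only a finite check, not a proof; it reads the family as all A whose interior is included in Y, and all helper names are my own.

```python
from itertools import combinations

def subsets(items):
    """All subsets of a finite set, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def lower(classes, A):
    """Interior: union of the classes included in A."""
    return {e for c in classes if c <= A for e in c}

def family_M(U, classes, Y):
    """All A within U whose interior is included in Y (assumed reading)."""
    return {A for A in subsets(U) if lower(classes, A) <= Y}

def is_matroid(fam):
    """Empty set, closure under subsets, exchange property."""
    if frozenset() not in fam:
        return False
    if any(a not in fam for b in fam for a in subsets(b)):
        return False
    return all(
        any(a | {x} in fam for x in b - a)
        for a in fam for b in fam if len(a) < len(b)
    )

U = {1, 2, 3, 4, 5}
classes = [{1, 3}, {2, 5}, {4}]
all_ok = all(is_matroid(family_M(U, classes, Y)) for Y in subsets(U))
print(all_ok)  # True
```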
Case 1
◮ Some x ∈ B \ A has the property that [x] = {x}
◮ That is, x is in relation ∼ with itself but not with any other object
◮ We claim that this specific x has the property that $A \cup \{x\} \in M_Y$
◮ Indeed, because [x] is a singleton and x ∉ A,
  $\underline{A \cup \{x\}} = \underline{A} \cup \{x\}$
◮ Now, $\underline{A} \subseteq Y$ (because $A \in M_Y$), and also x ∈ Y because $B \in M_Y$ and $\{x\} \subseteq \underline{B} \subseteq Y$
◮ Thus in this case the matter is easy
Case 2
◮ No x ∈ B \ A has the property that [x] = {x}
◮ Our idea now is to assume that for no x ∈ B \ A does A ∪ {x} belong to $M_Y$, and work toward a contradiction
◮ The fact that $A \cup \{x\} \notin M_Y$ means that $\underline{A \cup \{x\}}$ is strictly larger than $\underline{A}$
◮ But there are only two possibilities: either $\underline{A \cup \{x\}}$ is $\underline{A}$ or it is $\underline{A} \cup [x]$
◮ The first possibility does not hold, so the other one must hold
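The dichotomy used in Case 2 (adding one element either leaves the interior unchanged or adds exactly the class of that element) can be checked exhaustively on the running example. A minimal sketch, with illustrative helper names:

```python
from itertools import combinations

U = [1, 2, 3, 4, 5]
classes = [{1, 3}, {2, 5}, {4}]  # classes of ~ in the patients example

def lower(classes, A):
    """Interior: union of the classes included in A."""
    A = set(A)
    return {e for c in classes if c <= A for e in c}

def cls(classes, x):
    """The equivalence class [x]."""
    return next(c for c in classes if x in c)

def dichotomy_holds(U, classes):
    """For all A and x outside A: interior(A + x) is interior(A) or interior(A) + [x]."""
    for r in range(len(U) + 1):
        for combo in combinations(U, r):
            A = set(combo)
            for x in set(U) - A:
                grown = lower(classes, A | {x})
                options = (lower(classes, A), lower(classes, A) | cls(classes, x))
                if grown not in options:
                    return False
    return True

print(dichotomy_holds(U, classes))  # True
```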
What does it mean?
◮ This means that all y ≠ x with y ∼ x are in A!
◮ And this happens for all x ∈ B \ A
◮ The next question we ask is whether it is possible that for some x, y ∈ B \ A, x ≠ y, x ∼ y
◮ If that were the case, then [x] = [y], and [x] \ {x} ⊆ A and also [y] \ {y} ⊆ A
◮ But then [x] ⊆ A, contradicting the fact that x ∉ A
What is going on?
◮ For each x ∈ B \ A, all the elements y ≠ x such that x ∼ y are in A!
◮ Moreover, because we are in Case 2, every x ∈ B \ A is in relation ∼ with some element of A different from x (actually of A \ B)
◮ Let us select, for each x ∈ B \ A, one element y such that x ∼ y, x ≠ y
◮ Then, because [x] \ {x} ⊆ A, this function maps B \ A into A. In fact each y can be chosen in A \ B: if every y ∼ x, y ≠ x, were in B, then [x] ⊆ B, hence $[x] \subseteq \underline{B} \subseteq Y$, and then $\underline{A \cup \{x\}} = \underline{A} \cup [x] \subseteq Y$, contrary to our assumption
A bit of combinatorics
[Figure: the injection of B \ A into A \ B]

◮ Then, because [x] \ {x} ⊆ A, this function maps B \ A into A, in fact into A \ B when each y is chosen outside B
◮ It is an injection of B \ A into A \ B, because no object in B \ A is ∼-related to any different object in B \ A
◮ But then |B \ A| ≤ |A \ B|!
◮ This contradicts the fact that |A| < |B| and completes the argument
A few other things
◮ Here is a characterization of $M_Y$:
  $M_Y = \{A : \underline{A} \subseteq \underline{Y}\}$
◮ And one more:
  $M_Y = \{A : \underline{(A \setminus \underline{Y})} = \emptyset\}$
◮ There are plenty of other similarly easy characterizations of $M_Y$
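The underlines in the two characterizations did not survive extraction, so the following check rests on an assumed reading: the first as "interior of A is included in the interior of Y", the second as "the interior of A minus the interior of Y is empty". Under that reading, both agree with the definition (interior of A included in Y) for every A and Y in the running example.

```python
from itertools import combinations

U = {1, 2, 3, 4, 5}
classes = [{1, 3}, {2, 5}, {4}]  # classes of ~ in the patients example

def subsets(items):
    """All subsets of a finite set, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def lower(classes, A):
    """Interior: union of the classes included in A."""
    return {e for c in classes if c <= A for e in c}

def readings_agree(U, classes):
    for Y in subsets(U):
        low_Y = lower(classes, Y)
        for A in subsets(U):
            d1 = lower(classes, A) <= Y        # definition of M_Y
            d2 = lower(classes, A) <= low_Y    # first characterization (assumed reading)
            d3 = not lower(classes, A - low_Y) # second characterization (assumed reading)
            if not (d1 == d2 == d3):
                return False
    return True

print(readings_agree(U, classes))  # True
```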
More stuff
◮ In discussions with Professor M. Truszczynski of my Department, I learned a few things
◮ It is quite possible that these things are in the many papers of W. Zhu and his coauthors
◮ We will present Mirek’s suggestions now
Selectors
◮ Let F be a family of pairwise disjoint nonempty sets
◮ A set Z ⊆ ⋃F is a selector for F if for all T ∈ F, |Z ∩ T| = 1
◮ Something called the Axiom of Choice (what is it?) guarantees that selectors exist, but if F is a finite family of finite sets, then no special axioms are needed
Family FY
◮ Now, let Y be a subset of U
◮ The family $F_Y$ is defined as follows:
  $F_Y = \{[x] : [x] \cap \underline{Y} = \emptyset\}$
◮ Thus $F_Y$ consists of the equivalence classes of ∼ which are disjoint from $\underline{Y}$, that is, the classes not included in Y
◮ This family $F_Y$, if nonempty, consists of nonempty sets only and so has nonempty selectors
Family $F_Y$, cont’d
◮ All the selectors S for $F_Y$ are of the same size, namely $|F_Y|$
◮ Here is what Prof. Truszczynski observed: the maximal sets in $M_Y$ are precisely the sets of the form U \ S where S is a selector for $F_Y$
◮ Therefore the family $M_Y$ can be characterized as follows:
  $M_Y = \{X : \exists T\,(T \text{ is a selector for } F_Y \text{ and } X \cap T = \emptyset)\}$
◮ (This can be used for an alternative proof of the fact that $M_Y$ is a matroid)
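The selector description can be tried out on the running example with Y = {1,2,3}. The sketch assumes that the family of classes consists of the equivalence classes not included in Y (equivalently, disjoint from the interior of Y), and that a selector picks exactly one element from each such class and nothing else; all names are illustrative.

```python
from itertools import combinations, product

U = {1, 2, 3, 4, 5}
classes = [{1, 3}, {2, 5}, {4}]  # classes of ~ in the patients example
Y = {1, 2, 3}

def subsets(items):
    """All subsets of a finite set, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def lower(classes, A):
    """Interior: union of the classes included in A."""
    return {e for c in classes if c <= A for e in c}

# Assumed reading of F_Y: the classes that are not included in Y.
F_Y = [c for c in classes if not c <= Y]
# One element from each class of F_Y gives a selector.
selectors = [set(choice) for choice in product(*F_Y)]
from_selectors = {frozenset(U - S) for S in selectors}

M_Y = {A for A in subsets(U) if lower(classes, A) <= Y}
maximal = {A for A in M_Y if not any(A < B for B in M_Y)}
avoiding = {A for A in subsets(U) if any(not (A & S) for S in selectors)}

print(sorted(sorted(A) for A in maximal))  # [[1, 2, 3], [1, 3, 5]]
print(maximal == from_selectors)           # True
print(avoiding == M_Y)                     # True
```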
Conclusions
◮ Even though rough sets are such a fundamental data structure, people still find new and interesting facts
◮ In the case of the result we presented, there was a new technology (matroids) that we used
◮ Maybe it will be useful in further investigations