
Rough Sets, Knowledge Encoding and Uncertainty Analysis:

Relevance in Data Mining & Image Segmentation

Sankar K. Pal
Indian Statistical Institute
Calcutta
http://www.isical.ac.in/~sankar

Contents

Rough Sets and Information Granules
• Uncertainty Handling
• Granular Computing
• Rough Set Rules and Information Granules
• Role in Soft Computing and Rough-fuzzy Integration

Rough-Fuzzy Case Generation
• Case Based Reasoning
• Case Selection and Generation (from PR point of view)
• Fuzzy Granulation
• Mapping Dependency Rules to Cases
• Case Retrieval and Merits

EM + MST + Granular Case Generation for Multispectral Image Segmentation
• Why (EM + MST + Granular Computing)?
• Results and Quantitative Index

Rough Image Entropy and Segmentation
• Why Rough Sets in Image Processing?
• Image Granules and Lower & Upper Approximation
• Entropy Definition
• Algorithms
• Effect of Granular Size

Rough-Fuzzy Clustering
• Concept of crisp core and fuzzy boundary regions
• Algorithms
• Results and Quantitative Indices

Conclusions
• Applications to WWW and Bioinformatics
• Soft Computing Research Center

Rough Sets and Information Granules: Basic Concepts

Rough Sets

(Figure: a set X in the universe U, with its lower approximation $\underline{B}X$, its upper approximation $\overline{B}X$, and the granules $[x]_B$ around a point x. Figure labels: U, B ⊆ Ω.)

$[x]_B$ = set of all points belonging to the same granule as the point x in feature space $\Omega_B$.

$[x]_B$ is the set of all points which are indiscernible from point x in terms of feature subset B.

Z. Pawlak 1982, Int. J. Comp. Inf. Sci.

Approximations of the set $X \subseteq U$

B-lower: $\underline{B}X = \{x \in U : [x]_B \subseteq X\}$ (granules definitely belonging to X)

B-upper: $\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\}$ (granules definitely and possibly belonging to X)

If $\underline{B}X = \overline{B}X$, X is B-exact or B-definable w.r.t. feature subset B.

Otherwise it is Roughly definable.

Rough Set is a Crisp Set with Rough Descriptions

Rough Sets: Two Important Characteristics

• Uncertainty Handling (using lower & upper approximations)
• Granular Computing (using information granules)

• At this juncture, Rough Sets (RS) is one of the principal constituents of Soft Computing, along with Fuzzy Logic (FL), Neurocomputing (NC), and Genetic Algorithms (GA).

• Within Soft Computing, FL, NC, GA and RS are Complementary rather than Competitive

Example: Neuro-fuzzy, Rough evolutionary network, Rough-fuzzy …

Role of FL: algorithms for dealing with imprecision and uncertainty (arising from overlapping concepts)

Role of RS: handling uncertainty arising from the granularity in the domain of discourse

Rough-fuzzy Computing: Stronger paradigm for uncertainty handling

Applications

Granular Computing : Rough-fuzzy Case Generation (using dependency rules)

Uncertainty Handling : Rough Entropy and Rough-fuzzy Clustering (using concepts of granules and upper & lower approximations)

Information Granules and Rough-Fuzzy Case Generation

Information Granules: A group of similar objects clubbed together by an indiscernibility relation.

Granular Computing: Computation is performed using information granules and not the data points (objects)

• Information compression
• Computational gain

IEEE Trans. Knowledge Data Engg., 16(3), 292, 2004

(Figure: the F1-F2 feature plane, with each feature axis divided into low, medium and high, giving a grid of granules.)

Rule: class ← $M_1 \wedge M_2$

• The rule provides a crude description of the class using a granule

Information Granules and Rough Set Theoretic Rules

Case Based Reasoning (CBR)

What are Cases ?

• Some typical situations, already experienced by the system.

Conceptualized piece of knowledge representing an experience that teaches a lesson for achieving the goals of the system.

Informative patterns or examples characterizing some problems or experience

Prototypes or representative patterns/rules/ features of classes or concepts (PR perspective)

CBR involves
• adapting old solutions to meet new demands
• using old cases to explain new situations or to justify new solutions
• reasoning from precedents to interpret new situations.

System learns and becomes more efficient as a byproduct of its reasoning activity

• Example : Medical diagnosis and Law interpretation where the knowledge available is incomplete and/or evidence is sparse.

CBR Cycle

(Figure: the CBR cycle. A Problem gives a New Case; together with Previous Cases and General/Domain Knowledge it yields a Retrieved Case, then a Solved Case with a Suggested Solution, then a Tested/Repaired Case with a Confirmed Solution, and finally a Learned Case.)

- Case Representation & Indexing (to facilitate the retrieval process)

- Case Matching (compute similarity)

- Case Adaptation (build a causal model between the problem and solution spaces of related cases; find the solution of the query case)

- Case Base Maintenance (eliminate redundant cases and maintain consistency among cases in the case base)

(Aamodt & Plaza, 1994)

Case Selection → Cases belong to the set of examples encountered. (no change in no. of dimensions)

Case Generation → Constructed 'Cases' need not be any of the examples. (possibility of dimension reduction)

(PR point of view)

Cases – Informative patterns (prototypes) characterizing the problems.

• In rough set theoretic framework: Cases ≡ Information Granules

• In rough-fuzzy framework: Cases ≡ Fuzzy Information Granules

• Cases are cluster granules, not sample points

• Involves only reduced number of relevant features with variable size

• Less storage requirements

• Fast retrieval

• Suitable for mining data with large dimension and size

Characteristics and Merits

How to Achieve?

• Fuzzy sets help in linguistic representation of patterns, providing a fuzzy granulation of the feature space

• Rough sets help in generating dependency rules to model 'informative/representative regions' in the granulated feature space.

• Fuzzy membership functions corresponding to the 'representative regions' are stored as Cases.

Fuzzy (F)-Granulation:

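(Figure: π-function membership curves μlow, μmedium and μhigh for feature j, with centers cL, cM, cH and radii λL, λM marked on the horizontal axis; vertical axis: membership value, with 0.5 and 1 marked.)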

Mapping Dependency Rules to Cases

1. Each conjunction, e.g., $L_1 \wedge M_2$, represents a region (block)

2. For each conjunction, store as a case:
• Parameters of the fuzzy membership functions corresponding to linguistic variables that occur in the conjunction (thus, multiple cases may be generated from a rule)
• Class information

Note: All features may not occur in a rule. ⇒ Cases may be represented by different reduced numbers of features.

Structure of a Case: Parameters of the membership functions (center, radii), Class information

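Example (IEEE Trans. Knowledge Data Engg., 16(3), 292, 2004)

(Figure: sample points of two classes in the F1-F2 plane, with the F1 axis marked at 0.1, 0.5, 0.7 and the F2 axis marked at 0.2, 0.4, 0.9; the two regions covered by the rules are labelled CASE 1 and CASE 2.)

$C_1 \leftarrow L_1 \wedge H_2$

$C_2 \leftarrow H_1 \wedge L_2$

Parameters of fuzzy linguistic sets low, medium, high:

Feature 1: $c_L = 0.1$, $\lambda_L = 0.5$, $c_M = 0.5$, $\lambda_M = 0.7$, $c_H = 0.7$, $\lambda_H = 0.4$

Feature 2: $c_L = 0.2$, $\lambda_L = 0.5$, $c_M = 0.4$, $\lambda_M = 0.7$, $c_H = 0.9$, $\lambda_H = 0.5$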

Case Retrieval

Similarity sim(x, c) between a pattern x and a case c is defined as:

$$sim(x, c) = \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left( \mu^{j}_{fuzzset}(x) \right)^{2}}$$

n: number of features present in case c

$\mu^{j}_{fuzzset}(x)$: the degree of belongingness of pattern x to fuzzy linguistic set fuzzset for feature j.

For classifying an unknown pattern, the case closest to the pattern in terms of sim(x, c) is retrieved and its class is assigned to the pattern.
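The retrieval step can be sketched in Python using the two rules and the membership parameters of the example above. This is a minimal illustration, not the authors' implementation: the closed form of the π-function is the commonly used one and is an assumption here (the slides only name it), sim is taken as the root mean square of the memberships over the features present in a case, and the names `pi_membership`, `CASES`, `similarity` and `classify` are hypothetical.

```python
import math

def pi_membership(value, center, radius):
    """Pi-shaped membership (assumed standard form): 1 at the center, 0.5 at radius/2, 0 beyond radius."""
    d = abs(value - center)
    if d <= radius / 2:
        return 1 - 2 * (d / radius) ** 2
    if d <= radius:
        return 2 * (1 - d / radius) ** 2
    return 0.0

# Each case stores (center, radius) only for the features that occur in its rule,
# plus the class label. Feature indices: 0 -> F1, 1 -> F2.
CASES = [
    {"label": 1, "features": {0: (0.1, 0.5), 1: (0.9, 0.5)}},  # C1 <- L1 and H2
    {"label": 2, "features": {0: (0.7, 0.4), 1: (0.2, 0.5)}},  # C2 <- H1 and L2
]

def similarity(x, case):
    """Root mean square of the memberships of x over the n features present in the case."""
    feats = case["features"]
    total = sum(pi_membership(x[j], c, lam) ** 2 for j, (c, lam) in feats.items())
    return math.sqrt(total / len(feats))

def classify(x, cases):
    """Retrieve the most similar case and return its class label."""
    return max(cases, key=lambda case: similarity(x, case))["label"]
```

For instance, `classify((0.15, 0.85), CASES)` retrieves the case built from $L_1 \wedge H_2$. Because each case carries only the features named in its rule, cases with different numbers of features can sit in the same case base.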

Iris Flowers: 4 features, 3 classes, 150 samples

Number of cases = 3 (for all methods)

(Figure: bar charts comparing Rough-fuzzy, IB3, IB4 and Random on avg. feature/case, classification accuracy (1-NN), tgen (sec) and tret (sec).)


Forest Cover Types: 10 features, 7 classes, 586,012 samples

Number of cases = 545 (for all methods), GIS (cartographic & RS measurements)

(Figure: bar charts comparing Rough-fuzzy, IB3, IB4 and Random on avg. feature/case, classification accuracy (1-NN), tgen (sec) and tret (sec).)


Example:

Rough Set Knowledge Encoding, EM & MST for Multi-spectral Image Segmentation

IEEE Trans. Geoscience and Remote Sensing, 40(11), 2495-2501, 2002

• Image segmentation ≡ Partitioning the image space into meaningful homogeneous regions or clusters.

• Clusters are represented by Cases.

Image segmentation ≡ Determination of cases, representing different clusters.

Case Generation and Image Segmentation

Clusters represented by cases may be crude ⇒ need refinement

EM Algorithm
• Handles probabilistic uncertainty arising from overlapping classes
• Number of clusters (k) needs to be known
• Solution depends strongly on initial conditions
• Models only convex clusters

Rough Set Theoretic Knowledge Encoding
• Automatically determines the number of clusters k
• Provides 'good' initialization (avoidance of local minima, fast convergence)
• Granular computing

Minimal Spanning Tree (MST) Clustering
• Can model non-convex clusters, but is time consuming

RS Knowledge Encoding + EM + MST ⇒ Efficient Image Segmentation
• Computational gain (via Granular Computing)
• Local minima problem reduced, number of clusters need not be known (by RS Knowledge Encoding)
• Probabilistic uncertainty handling (by EM)
• Detection of arbitrarily shaped clusters (by MST)

(Figure: block diagram of the method.)

Input multi-spectral image bands (Band 1, Band 2, Band 3, …, Band n) → gray-level thresholding of individual bands → granulated n-dimensional image space → rule generation (rough set rules: crude clusters) → mapping rules to distribution parameters (initial mixture model parameters) → EM (refined mixture model parameters) → MST → final clusters → segmented multi-spectral image

IEEE Trans. Geoscience and Remote Sensing, 40(11), 2495-2501, 2002

Pattern Recognition Algorithms for Data Mining, CRC Press, Boca Raton, 2004

Multi-Spectral IRS Image of Calcutta (spatial resolution = 36.25 m × 36.25 m, wavelengths = 0.77-0.86 μm)

(Figure panels: Band 1, Band 2, Band 3, Band 4.)

Index β

n : total number of pixels in the image
$\bar{x}$ : mean gray value of the image
$n_i$ : number of pixels in the ith (i = 1, …, c) region obtained by a segmentation method
$x_{ij}$ : gray value of the jth pixel (j = 1, …, $n_i$) in region i
$\bar{x}_i$ : the mean of the $n_i$ gray values of the ith region

Then

$$\beta = \frac{\frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^{2}}{\frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^{2}} = \frac{\sum_{i=1}^{c}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^{2}}{\sum_{i=1}^{c}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^{2}}$$

Int. J Remote Sensing, 21(11), 2269-2300, 2000
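The index β is the ratio of the total gray-level variation of the image to the variation within the segmented regions, so more homogeneous regions give a larger β. A minimal pure-Python sketch follows, assuming the image is supplied as a flat list of gray values with a parallel list of region labels; the function name `beta_index` and this input layout are illustrative assumptions, not taken from the cited paper.

```python
def beta_index(gray_values, region_labels):
    """Ratio of total variation to within-region variation of gray values."""
    n = len(gray_values)
    overall_mean = sum(gray_values) / n
    total_variation = sum((g - overall_mean) ** 2 for g in gray_values)

    # Group gray values by the region each pixel was assigned to.
    regions = {}
    for g, label in zip(gray_values, region_labels):
        regions.setdefault(label, []).append(g)

    within_variation = 0.0
    for values in regions.values():
        region_mean = sum(values) / len(values)
        within_variation += sum((g - region_mean) ** 2 for g in values)

    # Undefined when every region is perfectly uniform (within_variation == 0).
    return total_variation / within_variation
```

With a single region covering the whole image the two sums coincide and β equals 1; splitting the image into more homogeneous regions raises it.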

Quantitative Index β: Measuring Segmentation Quality (IRS-1A image of Calcutta, no. of bands = 4). Final no. of clusters (land cover types) = 5

(Figure: bar chart of β for EM, KM, REM, RKM, KMEM, EMMST, FKM and REMMST; vertical axis from 0 to 8.)

(Figure: bar chart of computation time (seconds) for the same methods; vertical axis from 0 to 2500.)

EM/KM: Random initialization + EM/K-means; REM/RKM: Rough set theoretic initialization + EM/K-means; KMEM: K-means initialization + EM; EMMST: Random initialization + EM + MST; FKM: Fuzzy K-means; REMMST: Rough set initialization + EM + MST

Segmented image of Calcutta using K-means algorithm with random initialization (KM): β = 5.25, no. of clusters = 5

Segmented image of Calcutta using EM algorithm with random initialization (EM): β = 5.91, no. of clusters = 5

Segmented image of Calcutta using EM algorithm with rough set theoretic initialization and MST clustering (REMMST): β = 7.37, no. of clusters = 5

Zoomed images of a bridge on river Ganges: (a) Rough set initialized EM + MST, (b) K-means algorithm

Zoomed images of the air-strips of Calcutta airport: (a) Rough set initialized EM + MST, (b) K-means algorithm

• Rough sets generate information granules. Fuzzy sets provide efficient granulation of feature space (F-granulation).

• Rough-fuzzy case generation method (with reduced and variable feature subset) is suitable for CBR systems involving datasets large both in dimension and size.

• Rough case generation + EM + MST provide efficient multispectral image segmentation

Application to Granular Information Retrieval in heterogeneous media (e.g., text, hypertext, image) like WWW

Summary

• Unsupervised Case Generation: Rough-SOM (Applied Intelligence, 21(3), 289-299, 2004)

• A Rough Set-based Case-based Reasoner for Text Categorization (Int. J. Approx. Reasoning, 41, 229-255, 2006)

• Combining Feature Reduction and Case Selection in Building CBR Classifiers (IEEE Trans. Knowledge and Data Engg., 18(3), 415-429, 2006)

• Application to WWW and Bioinformatics

Further related references

Rough Entropy and Object Extraction

Pattern Recog. Letters, 26(16), 2509-2517, 2005

Rough Sets

(Fig. 1: a set X in the universe U, with its lower approximation $\underline{B}X$, its upper approximation $\overline{B}X$, and the granules $[x]_B$ around a point x. Figure labels: U, B ⊆ Ω.)

$[x]_B$ is the set of all points which are indiscernible from point x in terms of feature subset B, i.e., all points belonging to the same granule as x in feature space $\Omega_B$.

Approximations of the set $X \subseteq U$

B-lower: $\underline{B}X = \{x \in U : [x]_B \subseteq X\}$ (granules definitely belonging to X)

B-upper: $\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\}$ (granules definitely and possibly belonging to X)

If $\underline{B}X = \overline{B}X$, X is B-exact or B-definable w.r.t. feature subset B. Otherwise it is Roughly definable.

Rough Set is a Crisp Set with Rough Descriptions

A rough set can be characterized numerically by

$$\alpha_B(X) = \frac{|\underline{B}X|}{|\overline{B}X|}$$

called the Accuracy of Approximation.

|X| : cardinality of X, with X ≠ ∅. $\alpha_B(X)$ lies between 0 and 1.

If $\alpha_B(X) = 1$, X is crisp (i.e., X is precise) with respect to B

If $\alpha_B(X) < 1$, X is rough (i.e., X is vague) with respect to B

Roughness of Set

Roughness of set X with respect to B can be characterized using the accuracy of approximation $\alpha_B(X)$ as

$$R_B(X) = 1 - \alpha_B(X) = 1 - \frac{|\underline{B}X|}{|\overline{B}X|}$$

If the roughness of the set X is 0, then X is crisp with respect to B
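The approximations, the accuracy and the roughness can be computed directly from a partition of the universe into granules. The sketch below is a minimal illustration under the assumption that the granules are given explicitly as disjoint sets; the function names `approximations`, `accuracy` and `roughness` are hypothetical.

```python
def approximations(granules, target):
    """Return (lower, upper) approximations of `target` for a partition `granules`."""
    target = set(target)
    lower, upper = set(), set()
    for granule in granules:
        granule = set(granule)
        if granule <= target:   # granule lies entirely inside X
            lower |= granule
        if granule & target:    # granule overlaps X
            upper |= granule
    return lower, upper

def accuracy(granules, target):
    """Accuracy of approximation: |lower| / |upper|, defined for non-empty X."""
    lower, upper = approximations(granules, target)
    if not upper:
        raise ValueError("target set must be non-empty")
    return len(lower) / len(upper)

def roughness(granules, target):
    """Roughness: 1 minus the accuracy of approximation."""
    return 1 - accuracy(granules, target)
```

With granules {1, 2}, {3, 4}, {5, 6} and X = {1, 2, 3}, the lower approximation is {1, 2} and the upper approximation is {1, 2, 3, 4}, so both the accuracy and the roughness are 0.5.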

Why Rough Sets in Image Processing?

In gray scale images, boundaries between object regions are often ill-defined. This uncertainty can be handled by describing the different objects as rough sets with upper (outer) and lower (inner) approximations.

The set approximation capability of rough sets can be exploited to formulate an entropy measure, called rough entropy, quantifying the uncertainty in locating the boundary in an object-background image.

Image as a Rough Set

Let the universe U be an image consisting of a collection of pixels. Then if we partition U into a collection of non-overlapping windows (of size m×n, say), each window can be considered as a granule G.

The induced equivalence classes $I_{m \times n}$ have m×n pixels in each non-overlapping window. Given this granulation, object regions in the image can be approximated using rough sets.

Object-Background Separation of an M×N, L level image (two class problem)

Let prop(B) and prop(O) represent two properties (say, gray level intervals 0, 1, …, T and T+1, T+2, …, L-1) that characterize background and object regions.

Given these properties and the granulated image space, object and background regions can be viewed as two sets with their rough representations as follows:

Inner approximation of the object ($\underline{O}_T$):

$$\underline{O}_T = \bigcup_i \{\, G_i \mid P_j > T,\ \forall j = 1, \ldots, mn, \text{ and } P_j \text{ is a pixel belonging to } G_i \,\}$$

Outer approximation of the object ($\overline{O}_T$):

$$\overline{O}_T = \bigcup_i \{\, G_i \mid \exists j,\ j = 1, \ldots, mn, \text{ s.t. } P_j > T, \text{ where } P_j \text{ is a pixel in } G_i \,\}$$

Inner approximation of the background ($\underline{B}_T$):

$$\underline{B}_T = \bigcup_i \{\, G_i \mid P_j \le T,\ \forall j = 1, \ldots, mn, \text{ and } P_j \text{ is a pixel belonging to } G_i \,\}$$

Outer approximation of the background ($\overline{B}_T$):

$$\overline{B}_T = \bigcup_i \{\, G_i \mid \exists j,\ j = 1, \ldots, mn, \text{ s.t. } P_j \le T, \text{ where } P_j \text{ is a pixel in } G_i \,\}$$

Therefore, the rough set representation of the image (i.e., object $O_T$ and background $B_T$) for a given $I_{m \times n}$ depends on the value of T.

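Roughness of object $O_T$ and background $B_T$

$$R_{O_T} = 1 - \frac{|\underline{O}_T|}{|\overline{O}_T|} = \frac{|\overline{O}_T| - |\underline{O}_T|}{|\overline{O}_T|}, \qquad R_{B_T} = 1 - \frac{|\underline{B}_T|}{|\overline{B}_T|} = \frac{|\overline{B}_T| - |\underline{B}_T|}{|\overline{B}_T|} \qquad (1)$$

|·| represents the cardinality of the set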

Rough Entropy Measure

Rough Entropy (RE) of an image can be defined as

$$RE_T = -\frac{e}{2}\left[ R_{O_T}\log_e\!\left(R_{O_T}\right) + R_{B_T}\log_e\!\left(R_{B_T}\right) \right] \qquad (2)$$

(i) $RE_T$ lies between 0 and 1.

(ii) $RE_T$ has a maximum value of unity when $R_{O_T} = R_{B_T} = 1/e$, and a minimum value of zero when $R_{O_T}, R_{B_T} \in \{0, 1\}$.

Pattern Recog. Letters, 26(16), 2509-2517, 2005
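Property (ii) can be checked by a short calculation on the single-variable term of the entropy. The worked steps below assume the usual convention $0 \log_e 0 = 0$, which the slides do not state explicitly; the helper symbol $f$ is introduced only for this check.

```latex
\begin{align*}
f(R) &= -R\log_e R, \qquad f'(R) = -(\log_e R + 1) = 0 \;\Rightarrow\; R = \tfrac{1}{e}, \qquad f\!\left(\tfrac{1}{e}\right) = \tfrac{1}{e} \\
RE_T\Big|_{R_{O_T} = R_{B_T} = 1/e} &= \frac{e}{2}\left[\frac{1}{e} + \frac{1}{e}\right] = 1 \\
RE_T\Big|_{R_{O_T},\,R_{B_T} \in \{0,1\}} &= \frac{e}{2}\left[\,0 + 0\,\right] = 0
\qquad \text{since } f(0) = f(1) = 0
\end{align*}
```

Since $f$ is non-negative on [0, 1] and peaks at $1/e$, the factor $e/2$ is exactly the normalization that keeps $RE_T$ within [0, 1].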

Fig. 2: Rough entropy for various values of roughness of the object and background

Fig. 3: Plot of rough entropy for the values (0,0) to (1,1) on the diagonal of Fig. 2 (i.e., when $R_{O_T} = R_{B_T}$)

In either case, $RE_T$ will decrease from its maximum value of unity and will reach a value of zero at (0, 0), (0, 1), (1, 0) and (1, 1) in the ($R_{O_T}$, $R_{B_T}$) plane (Fig. 2).

Method of object enhancement/extraction is based on the principle of minimizing the roughness of both object and background regions, i.e., maximizing $RE_T$.

Compute, for every T, the $RE_T$ of the image, representing the background and object regions (0, …, T) and (T+1, …, L-1) respectively, and select the one for which $RE_T$ is maximum.

Object Extraction Minimizing Roughness

Select

$$T^{*} = \arg\max_{T} RE_T$$

as the optimum threshold to provide object-background segmentation.

Maximizing the rough entropy to get the required threshold ⇒ Minimizing both the object roughness and the background roughness.
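The threshold search can be sketched in Python as follows. This is an illustrative sketch, not the published algorithm: the image is assumed to be a list of rows of integer gray levels, cardinalities are counted in granules rather than pixels (equivalent when all windows have the same size), an empty upper approximation is treated as roughness 0, and all function names are hypothetical. The minimum and maximum gray value of each granule are computed once, after which every threshold is evaluated from those pairs alone.

```python
import math

def granule_extremes(image, gm, gn):
    """Single scan: (min, max) gray value of each non-overlapping gm x gn window."""
    rows, cols = len(image), len(image[0])
    extremes = []
    for r in range(0, rows, gm):
        for c in range(0, cols, gn):
            block = [image[i][j]
                     for i in range(r, min(r + gm, rows))
                     for j in range(c, min(c + gn, cols))]
            extremes.append((min(block), max(block)))
    return extremes

def _roughness(lower, upper):
    # Assumption: an empty upper approximation is treated as roughness 0.
    return 0.0 if upper == 0 else 1.0 - lower / upper

def _term(r):
    # r * log_e(r), with 0 * log 0 taken as 0.
    return 0.0 if r == 0.0 else r * math.log(r)

def rough_entropy(extremes, t):
    """Rough entropy for threshold t; object = gray levels > t, background = gray levels <= t."""
    obj_lower = sum(1 for lo, hi in extremes if lo > t)   # all pixels above t
    obj_upper = sum(1 for lo, hi in extremes if hi > t)   # some pixel above t
    bg_lower = sum(1 for lo, hi in extremes if hi <= t)   # all pixels at or below t
    bg_upper = sum(1 for lo, hi in extremes if lo <= t)   # some pixel at or below t
    r_obj = _roughness(obj_lower, obj_upper)
    r_bg = _roughness(bg_lower, bg_upper)
    return -(math.e / 2.0) * (_term(r_obj) + _term(r_bg))

def best_threshold(image, gm, gn, levels=256):
    """Return the first threshold T in 0..levels-2 that maximizes the rough entropy."""
    extremes = granule_extremes(image, gm, gn)
    return max(range(levels - 1), key=lambda t: rough_entropy(extremes, t))
```

Ties are resolved in favour of the smallest threshold, which is a choice of this sketch only; on real images the maximum is usually unique.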

The determination of T* by maximization of rough entropy or minimization of roughness depends on the granule size.

A choice of granule size can be made from gray level distribution of the image by selecting a value approximately equal to the minimum of half the width of base regions corresponding to all the peaks in the histogram.

Choice of Granule Size

This will allow the algorithm to take into account the local information (details) of all the regions, as indicated by different peaks in the histogram, and facilitate the detection of the smallest region.

Any granule larger (or smaller) than this may result in losing some desirable regions (or detection of spurious undesirable regions) by the decrease (or increase) in the value of T*, assuming that the regions of interest correspond to the lower side of the histogram.

Remarks

Given the max_gray and min_gray values, the computation of rough entropy (and hence the algorithm) requires only a single scan of pixels in the image, since $max\_granule_i$ and $min\_granule_i$ are computed exactly once for each i.

The computational complexity of the algorithm is the same as that of histogram computation.

Text Image

Histogram of TEXT image (flat valley): minimum estimated base width is 30, between gray levels 105 and 135; granule size = 15×15 (half of the smallest base width).

(Figure panels: (a) original; (b) threshold = 169, granule size = 15×15; (c) threshold = 171, granule size = 10×10 (over-segmentation); (d) threshold = 168, granule size = 19×19 (under-segmentation).)

• With a flat valley, a large change in granule size makes no significant change in the result

Calcutta Image

Histogram of CALCUTTA image (sharp valley): minimum estimated base width is 8, between gray levels 16 and 24; granule size = 4×4.

(Figure panels: (a) Calcutta image; (b) threshold = 30, granule size = 4×4; (c) threshold = 26, granule size = 6×6 (under-segmentation); (d) threshold = 33, granule size = 2×2 (over-segmentation).)

While all three output images are able to segment the water bodies (represented by the lower peak region in the histogram) from the rest of the objects, the increase in T* value to 33 introduces more spurious (undesirable) regions (Fig. d), whereas the decrease in T* value to 26 fails to detect some useful regions (e.g., airport runways, roads, canals) as object (Fig. c).

This justifies the selection of 30 as the more appropriate threshold, and hence the choice of granule size 4×4.

Since the valley is sharp, a small change in T* causes significant changes in the segmentation result

Summary

• Rough entropy of an image is defined using the concept of image granules and upper & lower approximations.

• Granules carry local information and reflect the inherent spatial relation of the image by treating pixels of a window as indiscernible or homogeneous.

• Maximization of homogeneity in both object and background regions during their partitioning is achieved through maximization of rough entropy, thereby providing optimum results for object-background classification.

• Extension of the algorithm to the multi-class segmentation problem constitutes a part of future investigation.

Rough-Fuzzy Clustering and Segmentation

Rough-Fuzzy Clustering

• Integrates the concepts of membership of fuzzy sets, and lower and upper approximations of rough sets, into hard clustering

• While fuzzy membership enables handling of overlapping partitions, rough sets deal with vagueness and incompleteness in class definition

Objective Function:

w and ŵ (= 1-w) : reflect relative importance of lower and boundary regions μij : membership of jth pattern in ith class

IEEE Trans. Knowledge Data Engg., 19(6), 1--14, 2007 (to appear)

According to rough sets, if $x_j$ belongs to the lower approximation of the ith cluster, it does not belong to the lower approximation of any other cluster

$x_j$ belongs to the ith cluster definitely

• Memberships of the objects in the lower approximation of a cluster should be independent of other clusters (& their centroids), and should not have any effect in computing their memberships for other clusters

• Objects in the lower approximation of a cluster should have similar influence on their own cluster prototype

If $x_j$ belongs to the boundary region of the ith cluster, it possibly belongs to the ith cluster and potentially belongs to other clusters

Objects in boundary regions should have (unlike the lower region) different influence on the cluster prototypes

In rough-fuzzy clustering, assign membership values $\mu_{ij}$ of objects in the lower region as 1, while those in the boundary region lie in [0, 1], as in conventional fuzzy clustering

Only objects in the boundary are fuzzified


Each cluster - represented by a Cluster prototype, a Crisp lower approximation, and a Fuzzy boundary

Rough-Fuzzy C-Means

Cluster Prototype (Mean):


How to Select Core and Boundary Regions ?

Compute fuzzy memberships of each object w.r.t. c centroids and compare the difference of its two highest memberships with a threshold δ.

Let $\mu_{ij}$ and $\mu_{kj}$ be the highest and second highest memberships of object $x_j$. If $(\mu_{ij} - \mu_{kj}) < \delta$, then $x_j$ belongs to the boundary regions of the ith and kth clusters and does not belong to the lower approximation of any cluster; otherwise $x_j$ belongs to the lower approximation of the ith cluster. δ determines the "core" and "overlapping" regions of each cluster.

Let

$$\delta = \frac{1}{n}\sum_{j=1}^{n}\left(\mu_{ij} - \mu_{kj}\right)$$

where n is the number of objects, so that δ represents the average difference of the two highest memberships of all the objects in the data set

δ implements the role of granules to define lower and upper approximations of rough sets
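The core/boundary assignment rule can be sketched as follows. This is a minimal illustration that assumes the fuzzy memberships have already been computed and are given as a list of rows, one row per object and one column per cluster; the function name `core_and_boundary` and the returned dictionaries are hypothetical. When no threshold is passed, δ defaults to the average difference of the two highest memberships, as described above.

```python
def core_and_boundary(memberships, delta=None):
    """Split objects into crisp cores and fuzzy boundaries.

    memberships[j][i] is the membership of object j in cluster i.
    Returns (core, boundary, delta), where core[i] and boundary[i]
    are lists of object indices for cluster i.
    """
    ranked = []
    for row in memberships:
        order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        first, second = order[0], order[1]
        ranked.append((first, second, row[first] - row[second]))

    if delta is None:
        # Average difference of the two highest memberships over all objects.
        delta = sum(gap for _, _, gap in ranked) / len(ranked)

    n_clusters = len(memberships[0])
    core = {i: [] for i in range(n_clusters)}
    boundary = {i: [] for i in range(n_clusters)}
    for j, (first, second, gap) in enumerate(ranked):
        if gap < delta:
            # Ambiguous object: boundary of its two best clusters, core of none.
            boundary[first].append(j)
            boundary[second].append(j)
        else:
            core[first].append(j)
    return core, boundary, delta
```

For example, with memberships [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]] the default δ is about 0.5, so the first and third objects fall in the cores of their clusters and the second object lies in the boundary of both.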

Results on Iris Data Set

(Figure: DB index of different c-means algorithms: FCM, FPCM, RCM, RFCM(MBP), RFCM.)

DB index: RCM: 0.225069, RFCM(MBP): 0.224806, RFCM: 0.224164

(Figure: Dunn index of different c-means algorithms.)

Dunn index: RCM: 6.755984, RFCM(MBP): 6.907512, RFCM: 6.936064

(Figure: execution time (in milliseconds) of different c-means algorithms; Pentium IV, 3.2 GHz, 1 MB cache, and 1 GB RAM.)

FCM: fuzzy c-means; FPCM: fuzzy-possibilistic c-means; RCM: rough c-means; RFCM(MBP): rough-fuzzy c-means of Mitra et al.; RFCM: rough-fuzzy c-means

IEEE Trans SMC (communicated)

Rough-fuzzy clustering is more effective for overlapping clusters.

Note: PCM generates coincident clusters, i.e., two of three final prototypes are identical even when three initial centroids were selected from three different classes

IMAGE-20497774: original and segmentation by HCM, FCM, RCM, RFCM(MBP), RFCM

Initial value of δ chosen = 0.145
Final value of δ obtained = 0.652

Brain MRI Image (IEEE Trans SMC, communicated)

c = 4: Background, White matter, Gray matter, and Cerebrospinal fluid

Results on Brain MR Images

(Figure panels: original, HCM, FCM, RCM, RFCM(MBP), RFCM.)


HCM: hard c-means; FCM: fuzzy c-means; RCM: rough c-means; RFCM(MBP): rough-fuzzy c-means of Mitra et al.; RFCM: rough-fuzzy c-means

(Figure: bar charts of DB index, Dunn index, β index and execution time (in milliseconds) of HCM, FCM, RCM, RFCM(MBP) and RFCM on four sample images; Pentium IV, 3.2 GHz, 1 MB cache, and 1 GB RAM.)

IEEE Trans SMC (communicated)

• Concept of crisp "core" (lower) and fuzzy "boundary" (overlapping) regions of a cluster is introduced in the notion of rough set theory

• Each cluster characterized by: {a cluster prototype, a crisp lower approximation, and a fuzzy boundary}

• Incorporating roughness over fuzzy clustering/segmentation makes it more effective for dealing with overlapping clusters

• Use of rough sets and fuzzy memberships adds a small computational load to the HCM algorithm; however, the corresponding integrated method (RFCM) shows a definite improvement in terms of β-index, Dunn index, and DB index

• Extension to Rough-Fuzzy C-medoids clustering for relational clustering and sequence clustering (e.g., Bioinformatics). Biobasis function: IEEE Trans. Knowledge Data Engg., 19(6), 1--14, 2007

Summary

Acknowledgement

Pabitra Mitra, B. Uma Shankar and Pradipta Maji

Center for Soft Computing Research: A National Facility

http://www.isical.ac.in/~scc

(Funded by DST under its IRHPA Scheme and partially supported by CIMPA, France)

Thank You!!
