Witold Pedrycz, Department of Electrical & Computer Engineering, University of Alberta, Edmonton, Canada, and Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland


Post on 14-Jan-2016


Page 1

Witold Pedrycz

Department of Electrical & Computer Engineering, University of Alberta, Edmonton, Canada

and

Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Page 2

Agenda

Introduction: human-centricity of intelligent systems and information granules

Conceptualization and realization of information granules

Information content and its characterization

Context-based information granules

Successive refinements of information granules

Information granule-based architectures

Conclusions

Page 3

Human-system interaction and system modeling

Perception and processing realized at a certain level of abstraction

Acceptance of granular (non-numeric) data

Effective two-way communication with the user at the level of information granules

Adjustment of level of detail (abstraction) dependent upon the needs of the individual user (personalization); avoidance of unnecessary details and focus on essentials

Page 4

Clustering and fuzzy clustering

Discovery of structures and relationships in data

Data analysis

Construction of fuzzy sets

Clustering in fuzzy modeling and modeling…

Page 5

Information granules: from conceptualization to realization

Implicit information granules (humans)

Explicit (operational) information granules (computer realizations)

Various points of view (models): fuzzy sets, rough sets, intervals (sets), shadowed sets, probability functions

Page 6

Development of information granules

Usage of all available experimental evidence (numeric and non-numeric; knowledge hints)

Information granules capturing existing domain knowledge (especially guidance: knowledge hints provided by the designer/user)

Results of information granulation dependent upon the underlying formalism of Granular Computing

Information granules are context-dependent; the ensuing design framework should incorporate this aspect in an explicit way

Page 7

Information granules: from conceptualization to realization

Data Information granules Construction of Intelligent systems

auxiliary guidance mechanisms

Content of information granules

Formalisms of sets, fuzzy sets, rough sets

Page 8

Clustering and fuzzy clustering

Clustering: 2,670,000

Fuzzy clustering: 443,000

Rough clustering: 268,000

Information granulation/granules: 158,000

Granular Computing: 104,000

(Google Scholar, October 29, 2014)

Page 9

Objective function-based clustering

Data {x1, x2, …, xN}

Objective function: minimize w.r.t. structure

Information granules G1, G2, …, Gc

Prototypes, medoids v1, v2, …, vc

Partition matrix U

Page 10

Fuzzy C-Means (FCM) as an example of fuzzy clustering

vi – prototypes

U – partition matrix

Page 11

FCM – representation of information granules (granules)

Partition matrix U

prototypes v1, v2, …, vc

$$0 < \sum_{k=1}^{N} u_{ik} < N, \quad i = 1, 2, \ldots, c$$

Page 12

Fuzzy Clustering: Fuzzy C-Means (FCM)

Given data x1, x2, …, xN, determine its structure by forming a collection of information granules (fuzzy sets)

Objective function

$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$

Minimize Q; structure in data (partition matrix and prototypes)

Page 13

FCM – flow of optimization

Minimize

$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$

subject to

(a) prototypes

(b) partition matrix
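The transcript does not reproduce the formulas for the two steps listed above. For reference, the standard FCM updates (assuming the Euclidean distance and a fuzzification coefficient m > 1, neither of which is stated on the slide) are usually written as:

$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad u_{ik} = \frac{1}{\sum_{l=1}^{c} \left( \dfrac{\|x_k - v_i\|}{\|x_k - v_l\|} \right)^{2/(m-1)}}$$

The two updates are applied alternately, starting from some initial partition matrix, until the partition matrix stops changing appreciably.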

Page 14

Construction of information granules: Fuzzy C-Means (FCM)

Data {x1, x2, …, xN}, xk in R^n

Performance index (objective function)

Construct information granules (clusters): fuzzy sets A1, A2, …, Ac organized as partition matrix U

Partition matrix

Prototype vi
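The construction above can be sketched in a few lines of plain Python. This is a minimal illustration, not the author's implementation: it assumes the Euclidean distance and the standard FCM updates, and the names `fcm` and `sq_dist`, the default m = 2, the fixed iteration count and the random initialization are all assumptions made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fcm(data, c, m=2.0, iters=100, seed=0):
    """Standard Fuzzy C-Means sketch; returns (U, V) with U[i][k] = u_ik."""
    rng = random.Random(seed)
    n_pts, dim = len(data), len(data[0])
    # Random initial partition matrix whose columns sum to 1.
    U = [[rng.random() + 1e-9 for _ in range(n_pts)] for _ in range(c)]
    for k in range(n_pts):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    V = []
    for _ in range(iters):
        # (a) prototypes: means of the data weighted by u_ik^m
        V = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n_pts)]
            total = sum(w)
            V.append([sum(w[k] * data[k][j] for k in range(n_pts)) / total
                      for j in range(dim)])
        # (b) partition matrix: squared distances, hence exponent 1/(m-1)
        for k in range(n_pts):
            d = [max(sq_dist(data[k], V[i]), 1e-12) for i in range(c)]
            for i in range(c):
                U[i][k] = 1.0 / sum((d[i] / d[l]) ** (1.0 / (m - 1.0))
                                    for l in range(c))
    return U, V
```

A fixed number of iterations keeps the sketch short; a practical version would instead stop when successive partition matrices differ by less than a tolerance.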

Page 15

Quality of clustering

Cluster validity indexes

Cluster content

Granulation-degranulation: reconstruction criterion

Page 16

Information content of clusters

Description of information content:

•Variability of data

•Classification content

•Variability with regard to auxiliary (output) data

Page 17

Variability of data

Description of data residing within a given ith cluster

Variability of data around the prototype vi

Variability in terms of membership grades of data
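The slide names the two ingredients (distance from the prototype and membership grades) but the transcript does not give a formula. One plausible way to combine them, shown here purely as an assumption, is a membership-weighted mean squared distance from the prototype; the function name `cluster_variability` and the use of the exponent m are hypothetical choices for this sketch.

```python
def cluster_variability(data, u_i, v_i, m=2.0):
    """Membership-weighted mean squared distance of the data from prototype v_i.

    data: list of points, u_i: memberships of those points in cluster i.
    Assumed measure for illustration; the slide does not specify one.
    """
    weights = [u ** m for u in u_i]
    total = sum(weights)
    weighted = sum(w * sum((x - v) ** 2 for x, v in zip(xk, v_i))
                   for w, xk in zip(weights, data))
    return weighted / total
```

Points with low membership in the cluster contribute little, so the value reflects the spread of the data that the cluster actually represents.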

Page 18

Classification content of clusters

Applied to classification problems.

Dominant class present in ith cluster

Classification content:

count index

cumulative membership grades of classes
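The two descriptors named above can be sketched as follows. The exact definitions are not in the transcript, so this is an assumed reading: the count index counts, per class, the patterns whose highest membership falls in cluster i, and the cumulative index sums the membership grades in cluster i per class. The function name `classification_content` is hypothetical.

```python
from collections import defaultdict

def classification_content(U, labels, i):
    """Per-class counts and cumulative memberships for cluster i.

    U[r][k] is the membership of pattern k in cluster r; labels[k] is its class.
    Returns (counts, cumulative, dominant_class).
    """
    c = len(U)
    counts = defaultdict(int)
    cumulative = defaultdict(float)
    for k, label in enumerate(labels):
        cumulative[label] += U[i][k]
        # Count pattern k for cluster i only if i is its best-matching cluster.
        if max(range(c), key=lambda r: U[r][k]) == i:
            counts[label] += 1
    dominant = max(cumulative, key=cumulative.get)
    return dict(counts), dict(cumulative), dominant
```

Here the dominant class is taken from the cumulative memberships; taking it from the counts instead is an equally reasonable choice.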

Page 19

Granular mapping: an architecture

Aggregation of contents of information granules and their activation levels

Page 20

Auxiliary variable content

Problems in which there are additional variables (say, an output variable y) whose values determine the content of the cluster.

Clustering realized for data in the multivariable input space

Page 21

Information granulation and degranulation: reconstruction criterion

v1 v2 vc

granulation

u1, u2, …, uc

v1 v2 vc

degranulation

Results of degranulation made more abstract (in the form of information granules): granular clustering
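The granulation and degranulation steps can be sketched as below. The formulas are not in the transcript; the sketch assumes the usual FCM-based scheme, in which granulation computes the membership grades of x with respect to the prototypes and degranulation returns the prototype mean weighted by u_i^m. The function names and the default m = 2 are assumptions.

```python
def granulate(x, prototypes, m=2.0):
    """Membership grades u_1..u_c of x with respect to the prototypes."""
    d = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in prototypes]
    if 0.0 in d:  # x coincides with one or more prototypes
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    # d holds squared distances, hence the exponent 1/(m-1)
    return [1.0 / sum((di / dl) ** (1.0 / (m - 1.0)) for dl in d) for di in d]

def degranulate(u, prototypes, m=2.0):
    """Reconstruct a point from membership grades and prototypes."""
    w = [ui ** m for ui in u]
    total = sum(w)
    dim = len(prototypes[0])
    return [sum(wi * v[j] for wi, v in zip(w, prototypes)) / total
            for j in range(dim)]

def reconstruction_error(data, prototypes, m=2.0):
    """Sum of squared differences between the data and their reconstructions."""
    err = 0.0
    for x in data:
        x_hat = degranulate(granulate(x, prototypes, m), prototypes, m)
        err += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return err
```

The reconstruction error is zero for points that coincide with a prototype and grows as the prototypes represent the data less faithfully, which is what makes it usable as a criterion of clustering quality.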

Page 22

Key challenges of clustering

Selection of distance function (geometry of clusters)

Number of clusters

Quality of clustering results

Page 23

Landscape of clustering

Graph-oriented and hierarchical (single linkage, complete linkage, average linkage, ...)

Objective function-based clustering

Variety of formalisms and optimization tools (e.g., methods of Evolutionary Computing)

Diversity

Commonality

Data-driven methods

Page 24

The dichotomy and a paradigm shift

Human-centricity

Guidance mechanisms

Page 25

Knowledge-based clustering

data

knowledge

Partial supervision

Context-based guidance

Proximity-based

Viewpoints

Page 26

Domain knowledge: categories of knowledge-oriented guidance

Context-based guidance: clustering realized in a certain context specified with regard to some attribute

Viewpoints: some structural information is provided

Partially labeled data: some data are provided with labels (classes)

Proximity knowledge: some pairs of data are quantified in terms of their proximity (resemblance, closeness)

Page 27

Context-based clustering

Clustering: construct clusters in input space X

Context-based clustering: construct clusters in input space X given some context expressed in output space Y

Active role of the designer [customization of processing]

Page 28

Context-based clustering: computational considerations

•computationally more efficient

•well-focused

•designer-guided clustering process

[Figure: clustering (data → structure) compared with context-based clustering (data and context → structure)]

Page 29

Context-based clustering: focus mechanism

Determine structure in input space given the output is high

Determine structure in input space given the output is medium

Determine structure in input space given the output is low

Input space (data)

Page 30

Context-based clustering: examples

Find a structure of customer data [clustering] (no context)

Find a structure of customer data considering customers making weekly purchases in the range [$1,000, $3,000] (context)

Find a structure of customer data considering customers making weekly purchases at the level of around $2,500 (context)

Find a structure of customer data considering customers making significant weekly purchases who are young (compound context)
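The contexts in these examples can be expressed as membership functions over the output variable. The sketch below is only an illustration: the triangular shape and the $500 spread for "around $2,500", and the use of the minimum to combine the two parts of a compound context, are assumptions rather than anything stated on the slide, and all function names are hypothetical.

```python
def interval_context(y, low=1000.0, high=3000.0):
    """Context 'weekly purchases in the range [low, high]' as a set (interval)."""
    return 1.0 if low <= y <= high else 0.0

def around_context(y, center=2500.0, spread=500.0):
    """Context 'around center' as a triangular fuzzy set (shape assumed)."""
    return max(0.0, 1.0 - abs(y - center) / spread)

def compound_context(w_purchases, w_age):
    """Compound context, combining two membership grades with the minimum."""
    return min(w_purchases, w_age)
```

Each function returns the grade w for one customer; these grades are what a context-based clustering algorithm then uses to weight that customer's influence on the clusters.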

Page 31

Context-based Fuzzy C-Means

data (xk, yk), k = 1, 2, …, N

contexts: fuzzy sets W1, W2, …, Wp defined in the output space

wjk = Wj(yk)

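Context-driven partition matrix

$$U(W_j) = \left\{ u_{ik} \in [0,1] \;\middle|\; \sum_{i=1}^{c} u_{ik} = w_{jk} \ \ \forall k \ \text{ and } \ 0 < \sum_{k=1}^{N} u_{ik} < N \ \ \forall i \right\}$$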

Page 32

Context-based clustering: the use of context

[Figure: for each data pair (xk, yk), the context Wj yields the membership grade Wj(yk), which is passed together with xk to context-based fuzzy clustering]

Page 33

Context-oriented FCM: optimization flow

Objective function

$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$

Subject to constraint U in U(Wj)

Iterative adjustment of partition matrix and prototypes

$$u_{ik} = \frac{w_{jk}}{\sum_{l=1}^{c} \left( \dfrac{\|x_k - v_i\|}{\|x_k - v_l\|} \right)^{2/(m-1)}}$$

$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}$$
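The optimization flow for a single context can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: Euclidean distance, m = 2 by default, a fixed number of iterations, random initialization, and at least one data point with a nonzero context grade. The name `context_fcm` and the argument `w` (the list of grades w_jk = Wj(yk) for one context Wj) are hypothetical.

```python
import random

def context_fcm(data, w, c, m=2.0, iters=100, seed=0):
    """Context-based FCM sketch for one context; columns of U sum to w[k]."""
    rng = random.Random(seed)
    n_pts, dim = len(data), len(data[0])
    # Random initial partition matrix satisfying sum_i u_ik = w[k].
    U = [[rng.random() + 1e-9 for _ in range(n_pts)] for _ in range(c)]
    for k in range(n_pts):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] *= w[k] / s
    V = []
    for _ in range(iters):
        # Prototypes: same weighted mean as in FCM.
        V = []
        for i in range(c):
            wt = [U[i][k] ** m for k in range(n_pts)]
            total = sum(wt)  # positive as long as some w[k] > 0
            V.append([sum(wt[k] * data[k][j] for k in range(n_pts)) / total
                      for j in range(dim)])
        # Partition matrix: FCM update scaled by the context grade w[k].
        for k in range(n_pts):
            d = [max(sum((a - b) ** 2 for a, b in zip(data[k], V[i])), 1e-12)
                 for i in range(c)]
            for i in range(c):
                U[i][k] = w[k] / sum((d[i] / d[l]) ** (1.0 / (m - 1.0))
                                     for l in range(c))
    return U, V
```

Data whose output lies outside the context (grade 0) receive zero membership everywhere and therefore do not influence the prototypes, which is the focusing effect of the context.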

Page 34

Successive refinements of information granules

Information granules constructed in a successive manner forming a hierarchy of refined constructs of higher specificity

The refinement applied to information granules based on their information content

Successive usage of context-based fuzzy clustering

Page 35

Successive refinements of information granules

information granule to be refined

membership function Ai [1] used as a context

refinement process

membership function Aj [2] used as a context

Page 36

Expansion formulas: context-based FCM

information granule to be refined

membership function Ai [1] used as context

refinement process

property of fuzzy partition

membership function Aj [2] used as context

Page 37

Expansion formulas: context-based FCM

information granule to be refined

membership function Ai [1] used as context

membership function Aj [2] used as context

Page 38

Successive fuzzy partitions

Page 39

Fuzzy clustering with viewpoints

Page 40

Viewpoints: definition

Description of an entity (concept) which is deemed essential in describing a phenomenon (system) and helpful in casting an overall analysis in a required setting

"external", "reinforced" clusters

Page 41

Viewpoints: examples

[Figure: two plots in the (x1, x2) plane showing viewpoint (a, b) and viewpoint (a, ?)]

Page 42

Viewpoints in fuzzy clustering

[Figure: viewpoint (a, b) in the (x1, x2) plane]

$$b_{ij} = \begin{cases} 1, & \text{if the } j\text{-th feature of the } i\text{-th row of } B \text{ is determined by the viewpoint} \\ 0, & \text{otherwise} \end{cases}$$

[Example matrices: Boolean matrix B and matrix F, where F holds the viewpoint coordinates a and b at the positions where B equals 1, and 0 elsewhere]

B: Boolean matrix characterizing structure: viewpoints and prototypes (induced by data)

Page 43

Viewpoints in fuzzy clustering

$$Q = \sum_{k=1}^{N} \sum_{i=1}^{c} u_{ik}^{m} \sum_{\substack{j=1 \\ j:\, b_{ij}=0}}^{n} (x_{kj} - v_{ij})^2 + \sum_{k=1}^{N} \sum_{i=1}^{c} u_{ik}^{m} \sum_{\substack{j=1 \\ j:\, b_{ij}=1}}^{n} (x_{kj} - f_{ij})^2$$

$$g_{ij} = \begin{cases} v_{ij}, & \text{if } b_{ij} = 0 \\ f_{ij}, & \text{if } b_{ij} = 1 \end{cases}$$

$$Q = \sum_{k=1}^{N} \sum_{i=1}^{c} u_{ik}^{m} \sum_{j=1}^{n} (x_{kj} - g_{ij})^2$$
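The role of the matrices B, F and G can be illustrated with a short sketch. It only merges data-driven prototypes with viewpoints and evaluates the objective function; it does not run the full optimization. The function names `merge_viewpoints` and `viewpoint_objective`, and the default m = 2, are assumptions made for the example.

```python
def merge_viewpoints(V, F, B):
    """Build G: g_ij = f_ij where b_ij == 1, otherwise the prototype entry v_ij."""
    return [[f if b == 1 else v for v, f, b in zip(v_row, f_row, b_row)]
            for v_row, f_row, b_row in zip(V, F, B)]

def viewpoint_objective(data, U, G, m=2.0):
    """Q = sum_k sum_i u_ik^m sum_j (x_kj - g_ij)^2, with U[i][k] = u_ik."""
    return sum(U[i][k] ** m * sum((x - g) ** 2 for x, g in zip(data[k], G[i]))
               for i in range(len(G)) for k in range(len(data)))
```

In an iterative scheme, only the entries of G with b_ij = 0 would be updated from the data, while the entries fixed by viewpoints stay as given in F.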

Page 44

Fuzzy clustering with proximity guidelines

Page 45

Proximity hints

Characterization in terms of proximity degrees:

Prox(k, l), k, l = 1, 2, …, N

and supervision indicator matrix B = [bkl], k, l = 1, 2, …, N

Prox(k,l)

Prox(s,t)

Page 46

Proximity measure

Properties of proximity:

(a) Prox(k, k) = 1

(b) Prox(k, l) = Prox(l, k)

Proximity induced by partition matrix U

Linkages with kernel functions K(xk, xl)
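The transcript does not show how the proximity is induced by the partition matrix. A definition commonly used in proximity-based fuzzy clustering, assumed here, sums over the clusters the minimum of the two membership grades; the function name `induced_proximity` is hypothetical.

```python
def induced_proximity(U, k, l):
    """Proximity of patterns k and l induced by partition matrix U (U[i][k] = u_ik).

    Assumed definition: sum over clusters of min(u_ik, u_il).
    """
    return sum(min(row[k], row[l]) for row in U)
```

With this definition, property (b) holds by symmetry of the minimum, and property (a) holds whenever the memberships of each pattern sum to 1.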

Page 47

Augmented objective function

> 0

Page 48

Granular fuzzy clustering

Page 49

Granular prototypes and reconstruction criterion

prototypes granular prototypes

Selection/construction of prototypes

Forming granular prototypes to capture existing structural variability and satisfying degranulation criterion

(a) information granules of prototypes built around prototypes

(b) optimization of allocation of granularity by minimizing reconstruction criterion

Page 50

Formation of granular (interval) membership grades – details

xVi

ui-(x)=min(w1(x), w2(x))

ui+(x)=max(w1(x), w2(x))

Page 51

Overview

Information granules Blueprint of model

content

Model development, refinements, augmentations

Page 52

Conclusions

Fuzzy clustering as a conceptual and algorithmic backbone of design of information granules

Human-centric (knowledge-oriented) design of information granules

Emergence of higher type granular constructs

Needs for further advancements in optimization frameworks of fuzzy clustering