Fuzzy Models for Pattern Recognition
1. Def.:
a) A field concerned with machine recognition of meaningful regularities in noisy or complex environments.
b) The search for structure in data.
2. Categories:
a) Numerical pattern recognition,
b) Syntactic pattern recognition.
The pattern primitives are themselves considered to be labels of fuzzy sets. (sharp, fair, gentle)
The structural relations among the subpatterns may be fuzzy, so that the formal grammar is fuzzified by weighted production rules.
3. Elements of a numerical pattern recognition system:
1) Process description: data space → pattern space
Data: drawn from any physical process or phenomenon.
Pattern space (structure): the manner in which this information can be organized so that relationships between the variables in the process can be identified.
2) Feature analysis: feature space
Feature space has a much lower dimension than the data space → essential for applying efficient pattern search techniques.
Searches for internal structure in data items, that is, for features or properties of the data which allow us to recognize and display their structure.
3) Cluster analysis: search for structure in data sets.
4) Classifier design: classification space.
Search for structure in data spaces. A classifier itself is a device, means, or algorithm by which the data space is partitioned into c decision regions.
4. Fuzzy Clustering
There is no universally optimal cluster criterion: distance, connectivity, intensity, …
Hierarchical clustering
1) Generate a hierarchy of partitions by means of a successive merging or splitting of clusters.
2) Can be represented by a dendrogram, which might be used to estimate an appropriate number of clusters for other clustering methods.
3) On each level of merging or splitting a locally optimal strategy can be used, without taking into consideration policies used on preceding levels.
4) The methods are not iterative; they cannot change the assignment of objects to clusters made on preceding levels.
5) Advantage: conceptual and computational simplicity.
6) Corresponds to the determination of similarity trees.
Graph-theoretic clustering
1) Based on some kind of connectivity of the nodes of a graph representing the data set.
2) The clustering strategy is often breaking edges in a minimum spanning tree to form subgraphs.
3) Fuzzy data set → fuzzy graph.
4) Let G = [V, R] be a symmetric fuzzy graph. Then the degree of a vertex v is defined as $d(v) = \sum_{u \ne v} \mu_R(u,v)$. The minimum degree of G is $\delta(G) = \min_{v \in V} \{d(v)\}$.
5) Let G be a symmetric fuzzy graph. G is said to be connected if, for each pair of vertices u and v in V, $\mu_R(u,v) > 0$. G is called τ-degree connected, for some $\tau \ge 0$, if $\delta(G) \ge \tau$ and G is connected.
6) Let G be a symmetric fuzzy graph. Clusters are then defined as maximal τ-degree connected subgraphs of G.
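The vertex degree d(v) and the minimum degree δ(G) can be illustrated with a short sketch. It assumes the symmetric fuzzy graph is stored as a square membership matrix; the matrix values and the function names `vertex_degree` and `min_degree` are hypothetical, chosen only for this example.

```python
def vertex_degree(mu, v):
    """d(v): sum of mu_R(u, v) over all vertices u different from v."""
    return sum(mu[u][v] for u in range(len(mu)) if u != v)


def min_degree(mu):
    """delta(G): the smallest vertex degree in the graph."""
    return min(vertex_degree(mu, v) for v in range(len(mu)))


# Hypothetical symmetric fuzzy relation on three vertices (illustrative values).
mu_R = [
    [0.0, 0.8, 0.3],
    [0.8, 0.0, 0.5],
    [0.3, 0.5, 0.0],
]

degrees = [vertex_degree(mu_R, v) for v in range(3)]
delta = min_degree(mu_R)
print(degrees, delta)
```

A subgraph can only be τ-degree connected if this minimum degree is at least τ; the connectivity test itself is not covered by the sketch.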
Objective-function clustering
1) The most precise formulation of the clustering criterion.
2) Local extrema of the objective function are defined as optimal clusterings.
3) Bezdek’s c-means algorithm.
4) Butterfly example.
5) Similarity measure: distance between two objects, $d: X \times X \to R^+$, which satisfies
$d(x_k, x_l) = d_{kl} \ge 0$
$d_{kl} = 0 \Leftrightarrow x_k = x_l$
$d_{kl} = d_{lk}$
($x_k$, $x_l$ are points in the p-dimensional space.)
Clustering:
1) Each partition of the set X into crisp or fuzzy subsets $S_i$ (i = 1,…,c) can be fully described by an indicator function $u_{S_i}$:
$$u_{S_i}: X \to \{0,1\} \quad \text{or} \quad u_{S_i}: X \to [0,1]$$
2) Let $X = \{x_1,\dots,x_n\}$ be any finite set. $V_{cn}$ is the set of all real c × n matrices, and $2 \le c \le n$ is an integer. The matrix $U = [u_{ik}] \in V_{cn}$ is called a crisp c-partition if it satisfies the following conditions:
(a) $u_{ik} \in \{0,1\}$, $1 \le i \le c$, $1 \le k \le n$
(b) $\sum_{i=1}^{c} u_{ik} = 1$, $1 \le k \le n$
(c) $0 < \sum_{k=1}^{n} u_{ik} < n$, $1 \le i \le c$
The set of all matrices that satisfy these conditions is called $M_c$.
[Example: 2 × 3 matrices over $X = [x_1, x_2, x_3]$ (columns $x_1, x_2, x_3$): $U_1, U_2, U_3, U_4$ with entries in {0, 1}, and $\tilde{U}_1, \tilde{U}_2, \tilde{U}_3$ with entries in [0, 1].]
3) Let $X = \{x_1,\dots,x_n\}$ be any finite set. $V_{cn}$ is the set of all real c × n matrices, and $2 \le c \le n$ is an integer. The matrix $U = [u_{ik}] \in V_{cn}$ is called a fuzzy c-partition if it satisfies the following conditions:
(a) $u_{ik} \in [0,1]$, $1 \le i \le c$, $1 \le k \le n$
(b) $\sum_{i=1}^{c} u_{ik} = 1$, $1 \le k \le n$
(c) $0 < \sum_{k=1}^{n} u_{ik} < n$, $1 \le i \le c$
The set of all matrices that satisfy these conditions is called $M_{fc}$.
4) Cluster center: $v_i = (v_{i1},\dots,v_{ip})$ represents the location of cluster i. The vector of all cluster centers is $v = (v_1,\dots,v_c)$.
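The three partition conditions (entries in the allowed range, columns summing to 1, no empty and no all-inclusive cluster) can be checked mechanically. The following is a minimal sketch; the function name `is_c_partition` and the test matrices are hypothetical and are not taken from the slides.

```python
def is_c_partition(U, fuzzy, tol=1e-9):
    """Check conditions (a)-(c) for a c x n matrix U given as a list of rows.

    fuzzy=False tests membership in M_c, fuzzy=True tests membership in M_fc.
    """
    c, n = len(U), len(U[0])
    # (a) every entry lies in {0, 1} (crisp) or in [0, 1] (fuzzy)
    for row in U:
        for u in row:
            if fuzzy and not (0.0 <= u <= 1.0):
                return False
            if not fuzzy and u not in (0, 1):
                return False
    # (b) the memberships of each object x_k sum to 1 over all clusters
    for k in range(n):
        if abs(sum(U[i][k] for i in range(c)) - 1.0) > tol:
            return False
    # (c) no cluster is empty and no cluster absorbs all objects
    for i in range(c):
        if not (0 < sum(U[i]) < n):
            return False
    return True


# Illustrative matrices for three objects and two clusters.
crisp_U = [[1, 1, 0], [0, 0, 1]]
fuzzy_U = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
empty_cluster = [[1, 1, 1], [0, 0, 0]]

print(is_c_partition(crisp_U, fuzzy=False))
print(is_c_partition(fuzzy_U, fuzzy=True))
```

Every crisp c-partition also passes the fuzzy test, which reflects that $M_c$ is contained in $M_{fc}$.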
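5) Variance criterion: measures the dissimilarity between the points in a cluster and its cluster center by the Euclidean distance
$$d_{ik} = \left[\sum_{j=1}^{p} (x_{kj} - v_{ij})^2\right]^{1/2}$$
Minimize the sum of the variances of all variables j in each cluster i (sum of the squared Euclidean distances):
$$\min z(S_1,\dots,S_c; v) = \sum_{i=1}^{c} \sum_{x_k \in S_i} d_{ik}^2 \quad \text{such that} \quad v_i = \frac{1}{|S_i|} \sum_{x_k \in S_i} x_k$$
For crisp c-partition:
$$\min \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}\, d_{ik}^2 \quad \text{such that} \quad v_i = \frac{1}{\sum_{k=1}^{n} u_{ik}} \sum_{k=1}^{n} u_{ik}\, x_k$$
For fuzzy c-partition:
$$\min \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m d_{ik}^2$$
$$\text{such that} \quad v_i = \frac{1}{\sum_{k=1}^{n} (u_{ik})^m} \sum_{k=1}^{n} (u_{ik})^m x_k, \quad m > 1 \quad \text{(Eq. 1)}$$
$$u_{ik} = \frac{(1/d_{ik}^2)^{1/(m-1)}}{\sum_{j=1}^{c} (1/d_{jk}^2)^{1/(m-1)}}, \quad i = 1,\dots,c, \; k = 1,\dots,n \quad \text{(Eq. 2)}$$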
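6) Fuzzy c-means algorithm
Step 1: Choose c and m. Initialize $U^{(0)} \in M_{fc}$, set r = 0.
Step 2: Calculate the c fuzzy cluster centers $\{v_i^{(r)}\}$ by using $U^{(r)}$ from Eq. 1.
Step 3: Calculate the new membership matrix $U^{(r+1)}$ by using $\{v_i^{(r)}\}$ from Eq. 2 if $x_k \ne v_i^{(r)}$. Else set $u_{jk} = 1$ for j = i and $u_{jk} = 0$ for j ≠ i.
Step 4: Calculate $\Delta = \|U^{(r+1)} - U^{(r)}\|$. If $\Delta > \varepsilon$, set r = r + 1 and go to Step 2. If $\Delta \le \varepsilon$, stop.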
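The four steps of the fuzzy c-means algorithm can be sketched in plain Python. This is an illustration under stated assumptions, not a reference implementation: the function name `fuzzy_c_means`, the random initialization, the maximum-norm stopping test, the iteration cap and the sample data are all choices made for this example.

```python
import random


def fuzzy_c_means(X, c, m=2.0, eps=1e-5, max_iter=200, seed=0):
    """Return (U, V): c x n membership matrix and c cluster centers."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    # Step 1: random initial memberships, normalized so each column sums to 1.
    U = [[rng.random() + 1e-3 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    V = []
    for _ in range(max_iter):
        # Step 2: cluster centers as weighted means (Eq. 1).
        V = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            V.append([sum(w[k] * X[k][j] for k in range(n)) / total
                      for j in range(p)])
        # Step 3: new memberships from squared distances (Eq. 2).
        U_new = [[0.0] * n for _ in range(c)]
        for k in range(n):
            d2 = [sum((X[k][j] - V[i][j]) ** 2 for j in range(p))
                  for i in range(c)]
            hits = [i for i in range(c) if d2[i] == 0.0]
            if hits:
                U_new[hits[0]][k] = 1.0  # x_k coincides with a center
            else:
                inv = [(1.0 / d2[i]) ** (1.0 / (m - 1.0)) for i in range(c)]
                s = sum(inv)
                for i in range(c):
                    U_new[i][k] = inv[i] / s
        # Step 4: stop once the largest membership change is at most eps.
        delta = max(abs(U_new[i][k] - U[i][k])
                    for i in range(c) for k in range(n))
        U = U_new
        if delta <= eps:
            break
    return U, V


# Two well-separated groups of points (illustrative data).
X_demo = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
U, V = fuzzy_c_means(X_demo, c=2)
labels = [max(range(2), key=lambda i: U[i][k]) for k in range(len(X_demo))]
print(labels)
```

The exponent m controls how fuzzy the result is: values close to 1 push the memberships toward a crisp partition, larger values flatten them.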
Decision Making
1. Characterized by
1) A set of decision alternatives (decision space; constraints);
2) A set of states of nature (state space);
3) Utility (objective) function: orders the results according to their desirability.
2. Fuzzy decision model: Bellman and Zadeh [1970]
Consider a situation of decision making under certainty, in which the objective function as well as the constraints are fuzzy.
The decision can be viewed as the intersection of fuzzy constraints and fuzzy objective function.
The relationship between constraints and objective functions in a fuzzy environment is therefore fully symmetric, that is, there is no longer a difference between the former and the latter.
The interpretation of the intersection depends on the context.
• Intersection (min): no positive compensation (trade-off) between the membership degrees of the fuzzy sets in question.
• Union (max): leads to a full compensation for lower membership degrees.
Decision = Confluence of Goals and Constraints.
Neither the noncompensatory “and” (min, product, Yager-conjunction) nor the fully compensatory “or” (max, algebraic sum, Yager-disjunction) are appropriate to model the aggregation of fuzzy sets representing managerial decisions.
Def: Let $\mu_{C_i}(x)$, i = 1,…,m, x ∈ X, be membership functions of constraints, defining the decision space, and $\mu_{G_j}(x)$, j = 1,…,n, x ∈ X, the membership functions of objective functions or goals. A decision is then defined by its membership function
$$\mu_D(x) = \circledast_i\, \mu_{C_i}(x) * \circledast_j\, \mu_{G_j}(x), \quad i = 1,\dots,m, \; j = 1,\dots,n$$
where $*$, $\circledast_i$, $\circledast_j$ denote appropriate, possibly context-dependent aggregators.
Individual decision making
• G: goal; a: action; C: constraint
$$G_i'(a) = G_i(g_i(a))$$
$$C_j'(a) = C_j(c_j(a))$$
$$D(a) = \min\left[\inf_{i \in N_n} G_i'(a), \; \inf_{j \in N_m} C_j'(a)\right]$$
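For a finite set of actions the infimum reduces to a minimum, so the decision D(a) is simply the smallest membership degree an action attains over all goals and constraints. A minimal sketch follows, with hypothetical membership tables (`goal_1`, `constraint_1`) and a hypothetical function name `fuzzy_decision`:

```python
def fuzzy_decision(actions, goals, constraints):
    """D(a) = min over all goal and constraint memberships of action a."""
    return {a: min([g[a] for g in goals] + [c[a] for c in constraints])
            for a in actions}


# Illustrative memberships G'(a) and C'(a) for three candidate actions.
actions = ["a1", "a2", "a3"]
goal_1 = {"a1": 0.9, "a2": 0.6, "a3": 0.3}
constraint_1 = {"a1": 0.2, "a2": 0.7, "a3": 1.0}

D = fuzzy_decision(actions, [goal_1], [constraint_1])
best_action = max(actions, key=lambda a: D[a])
print(D, best_action)
```

Because min allows no compensation, action a1 is rejected despite its high goal membership: its weak constraint membership alone determines D(a1).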
Multiperson decision making
• Differences from individual decision making:
– Each person places a different ordering on the alternatives.
– Each person has access to different information.
• n-person game theories: both
• Team theories: the second
• Group decision theories: the first.
• Individual preference ordering: $P_k$, $k \in N_n$
• Social choice function: $S: X \times X \to [0,1]$
– $S(x_i, x_j)$: the degree of group preference of $x_i$ over $x_j$
• Procedure to arrive at the unique crisp ordering that constitutes the group choice.
3. Fuzzy Linear Programming
Classical model:
maximize $f(x) = c^T x$
such that $Ax \le b$, $x \ge 0$
with $c, x \in R^n$, $b \in R^m$, $A \in R^{m \times n}$.
Modification for fuzzy LP:
1) Do not maximize or minimize the objective function; might want to reach some aspiration levels which might not even be definable crisply, e.g., “improve the present cost situation considerably.”
2) The constraints might be vague: coefficients, relations.
3) Might accept small violations of constraints but might also attach different degrees of importance to violations of different constraints.
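4. Symmetric fuzzy LP:
Find x such that
$c^T x \gtrsim z$ (aspiration level)
$Ax \lesssim b$
$x \ge 0$
With $B = \begin{pmatrix} -c^T \\ A \end{pmatrix}$ and $d = \begin{pmatrix} -z \\ b \end{pmatrix}$ this becomes: find x such that $Bx \lesssim d$, $x \ge 0$.
The membership function of the fuzzy set “decision” of the above model is
$$\mu_D(x) = \min_i \{\mu_i(x)\}$$
$\mu_i(x)$ can be interpreted as the degree to which x satisfies the fuzzy inequality $B_i x \lesssim d_i$.
Crisp optimal solution:
$$\max_{x \ge 0} \mu_D(x) = \max_{x \ge 0} \min_i \{\mu_i(x)\}$$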
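Membership function:
$$\mu_i(x) = \begin{cases} 1 & \text{if } B_i x \le d_i \\ \in [0,1] & \text{if } d_i < B_i x \le d_i + p_i \\ 0 & \text{if } B_i x > d_i + p_i \end{cases} \qquad i = 1,\dots,m+1$$
e.g.,
$$\mu_i(x) = \begin{cases} 1 & \text{if } B_i x \le d_i \\ 1 - (B_i x - d_i)/p_i & \text{if } d_i < B_i x \le d_i + p_i \\ 0 & \text{if } B_i x > d_i + p_i \end{cases} \qquad i = 1,\dots,m+1$$
Optimal solution:
$$\max_{x \ge 0} \min_i \left(1 - \frac{B_i x - d_i}{p_i}\right)$$
that is,
maximize λ
such that $\lambda p_i + B_i x \le d_i + p_i$, i = 1,…,m+1
$x \ge 0$
→ $(\lambda, x_0)$ → the maximum solution can be found by solving one crisp LP with only one more variable and one more constraint.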
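To make the linear membership function and the max-min solution concrete, the sketch below evaluates μ_D on a grid for a made-up one-variable problem. It reads p_i as the tolerated violation of row i, which is an interpretation the slides do not spell out. The grid search is only for illustration; in practice the equivalent crisp LP in (λ, x) would be handed to an LP solver. All names and numbers are hypothetical.

```python
def mu_i(B_i, d_i, p_i, x):
    """Linear membership of the fuzzy inequality B_i x <~ d_i, tolerance p_i."""
    lhs = sum(b * xj for b, xj in zip(B_i, x))
    if lhs <= d_i:
        return 1.0
    if lhs >= d_i + p_i:
        return 0.0
    return 1.0 - (lhs - d_i) / p_i


def mu_D(B, d, p, x):
    """Membership of x in the fuzzy set 'decision': minimum over all rows."""
    return min(mu_i(B_i, d_i, p_i, x) for B_i, d_i, p_i in zip(B, d, p))


# Hypothetical problem: goal x >~ 10 (written as -x <~ -10), constraint x <~ 8,
# both with tolerance 4.
B = [[-1.0], [1.0]]
d = [-10.0, 8.0]
p = [4.0, 4.0]

candidates = [[i / 100.0] for i in range(0, 1501)]  # grid on 0 <= x <= 15
x_best = max(candidates, key=lambda x: mu_D(B, d, p, x))
lam = mu_D(B, d, p, x_best)
print(x_best, lam)
```

The best grid point is where the falling membership of the constraint meets the rising membership of the goal, which is exactly the λ that the crisp LP maximizes.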
Multistage Decision Making
• Task-oriented control belongs to this kind of decision-making problem.
• Fuzzy decision making → fuzzy dynamic programming → a decision problem regarding a fuzzy finite-state automaton
– State-transition relation is crisp
– Next internal state is also utilized as output.
$$A = (X, Z, f)$$
$$f: Z \times X \to Z \quad \text{(state-transition function)}$$
$$z^{t+1} = f(z^t, x^t)$$
[Figure: block diagrams of the automaton with one-time storage S: crisp version with input $x^t$ and states $z^t$, $z^{t+1}$; fuzzy version with $A^t$, $C^t$, $C^{t+1}$.]
• Fuzzy input states as constraints: A0, A1
• Fuzzy internal state as goal: CN
• Principle of optimality: An optimal decision sequence has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with the state resulting from the first decision.
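$$D = \tilde{A}^0 \cap \tilde{A}^1 \cap \dots \cap \tilde{A}^{N-1} \cap \tilde{C}^N$$
$$D(x^0,\dots,x^{N-1}) = \min[A^0(x^0), A^1(x^1), \dots, A^{N-1}(x^{N-1}), C^N(z^N)]$$
$$D(\hat{x}^0,\dots,\hat{x}^{N-1}) = \max_{x^0,\dots,x^{N-1}} D(x^0,\dots,x^{N-1})$$
$$= \max_{x^0,\dots,x^{N-2}} \left( \max_{x^{N-1}} \min[A^0(x^0), \dots, A^{N-1}(x^{N-1}), C^N(f(z^{N-1}, x^{N-1}))] \right)$$
$$= \max_{x^0,\dots,x^{N-2}} \min[A^0(x^0), \dots, A^{N-2}(x^{N-2}), C^{N-1}(z^{N-1})]$$
where
$$C^{N-1}(z^{N-1}) = \max_{x^{N-1}} \min[A^{N-1}(x^{N-1}), C^N(z^N)]$$
and in general
$$C^{N-k}(z^{N-k}) = \max_{x^{N-k}} \min[A^{N-k}(x^{N-k}), C^{N-k+1}(z^{N-k+1})]$$
$$z^{N-k+1} = f(z^{N-k}, x^{N-k})$$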
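The backward recursion for the stage goals can be sketched for a small finite automaton and checked against brute-force enumeration of all input sequences. The automaton, the membership values and the function names below are made up for illustration and are not taken from the slides.

```python
from itertools import product


def backward_induction(states, inputs, f, A, C_N):
    """Compute C^t(z) for t = N-1, ..., 0 and a maximizing input per state."""
    N = len(A)
    C = [None] * N + [dict(C_N)]
    policy = [None] * N
    for t in range(N - 1, -1, -1):
        C[t], policy[t] = {}, {}
        for z in states:
            # Best input: maximize min(constraint at t, goal of next state).
            x_best = max(inputs,
                         key=lambda x: min(A[t][x], C[t + 1][f[(z, x)]]))
            policy[t][z] = x_best
            C[t][z] = min(A[t][x_best], C[t + 1][f[(z, x_best)]])
    return C, policy


def brute_force(z0, inputs, f, A, C_N):
    """Maximize D over every input sequence of length N, starting in z0."""
    best = 0.0
    for xs in product(inputs, repeat=len(A)):
        z = z0
        for x in xs:
            z = f[(z, x)]
        best = max(best, min([A[t][x] for t, x in enumerate(xs)] + [C_N[z]]))
    return best


# Hypothetical automaton with three states, two inputs and N = 2 stages.
states = ["z1", "z2", "z3"]
inputs = ["x1", "x2"]
f = {("z1", "x1"): "z1", ("z1", "x2"): "z2",
     ("z2", "x1"): "z3", ("z2", "x2"): "z1",
     ("z3", "x1"): "z1", ("z3", "x2"): "z3"}
A = [{"x1": 0.7, "x2": 1.0}, {"x1": 1.0, "x2": 0.6}]  # fuzzy constraints A^0, A^1
C_N = {"z1": 0.3, "z2": 1.0, "z3": 0.8}               # fuzzy goal C^N

C, policy = backward_induction(states, inputs, f, A, C_N)
print(C[0], policy[0])
```

Both routines give the same optimal membership for every initial state, which is what the principle of optimality promises; the recursion needs only N passes over the state set instead of enumerating all input sequences.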
5. Fuzzy LP with crisp objective function
Constraints: define the decision space in a crisp or fuzzy way.
Objective function: induces an order of the decision alternatives.
Problem: the determination of an extremum of a crisp function over a fuzzy domain.
Approaches:
1) The determination of the fuzzy set “decision.”
2) The determination of a crisp “maximizing decision” by aggregating the objective function, after appropriate transformations, with the constraints.
Fuzzy “decision”
1) Decision space is (partially) fuzzy.
2) Compute the corresponding optimal values of the objective function for all α-level sets of the decision space.
3) Consider as the fuzzy set “decision” the optimal values of the objective functions with the degree of membership equal to the corresponding α-level of the solution space.
Crisp maximizing decision.
6. Fuzzy Multi Criteria Analysis
Problems that cannot be handled by using a single criterion or a single objective function.
Multi Objective Decision Making (MODM): concentrates on continuous decision spaces.
Multi Attribute Decision Making (MADM): focuses on problems with discrete decision spaces.
MODM: also called the vector-maximum problem
Def.: maximize {Z(x) | x ∈ X}
where Z(x) = (z_1(x),…,z_k(x)) is a vector-valued function of x ∈ R^n into R^k and X is the “solution space.”
Stages in vector-maximum optimization:
1) The determination of efficient solutions
2) The determination of an optimal compromise solution
Efficient solution: x_a is an efficient solution if there is no x_b ∈ X such that z_i(x_b) ≥ z_i(x_a), i = 1,…,k, and z_i(x_b) > z_i(x_a) for at least one i = 1,…,k.
Complete solution: the set of all efficient solutions.
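The definition of an efficient solution translates directly into a dominance test over a finite list of candidates. The following sketch uses made-up objective vectors; the names `dominates` and `efficient_solutions` are hypothetical.

```python
def dominates(z_b, z_a):
    """True if z_b is at least as good as z_a everywhere and better somewhere."""
    return (all(b >= a for a, b in zip(z_a, z_b))
            and any(b > a for a, b in zip(z_a, z_b)))


def efficient_solutions(Z):
    """Return the names of all candidates that no other candidate dominates."""
    return [name for name, z in Z.items()
            if not any(dominates(other, z)
                       for other_name, other in Z.items()
                       if other_name != name)]


# Illustrative objective vectors (z_1, z_2) for four candidate solutions.
Z = {"xa": (3, 1), "xb": (2, 2), "xc": (1, 1), "xd": (1, 3)}
complete_solution = efficient_solutions(Z)
print(complete_solution)
```

Here xc is dominated by xa, while xa, xb and xd are mutually incomparable and together form the complete solution; choosing among them is the second stage, the search for a compromise.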
MADM:
Def.: Let X = {x_i | i = 1,…,n} be a set of decision alternatives and G = {g_j | j = 1,…,m} a set of goals according to which the desirability of an action is judged. Determine the optimal alternative x_0 with the highest degree of desirability with respect to all relevant goals g_j.
Stages:
1) The aggregation of the judgments with respect to all goals and per decision alternative.
2) The rank ordering of the decision alternatives according to the aggregated judgments.
Fuzzy MADM:
Yager model:
Let X = {x_i | i = 1,…,n} be a set of decision alternatives. The goals are represented by the fuzzy sets G_j, j = 1,…,m. The importance (weight) of goal j is expressed by w_j. The attainment of goal G_j by alternative x_i is expressed by the degree of membership μ_Gj(x_i). The decision is defined as the intersection of all fuzzy goals, that is, D = G_1 ∩ G_2 ∩ … ∩ G_m.
The optimal alternative is defined as that achieving the highest degree of membership in D.
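A small sketch of the ranking step follows. It assumes that the weights enter as exponents on the membership degrees, so that the decision membership of x_i is the minimum over j of μ_Gj(x_i) raised to w_j; the slides only state that w_j expresses importance, so this weighting scheme is an assumption of the example, as are all names and numbers.

```python
def yager_decision(memberships, weights):
    """Map each alternative to min_j (mu_Gj(x_i) ** w_j)."""
    return {x: min(mu ** w for mu, w in zip(mus, weights))
            for x, mus in memberships.items()}


# Illustrative attainment degrees of two goals by three alternatives.
memberships = {"x1": (0.7, 0.3), "x2": (0.5, 0.8), "x3": (0.4, 0.6)}

D_equal = yager_decision(memberships, weights=(1, 1))
D_weighted = yager_decision(memberships, weights=(2, 1))
best_equal = max(D_equal, key=D_equal.get)
best_weighted = max(D_weighted, key=D_weighted.get)
print(best_equal, best_weighted)
```

Raising a membership below 1 to a power above 1 lowers it, so a heavily weighted goal becomes harder to satisfy and is more likely to determine the minimum. In this example the extra weight on the first goal moves the choice from x2 to x1.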
FUZZY IMAGE TRANSFORM CODING
1. Transform coding: a transformation, perhaps an energy-preserving transform such as the discrete cosine transform (DCT), converts an image to uncorrelated data, (keep the transform coefficients with high energy and discard the coefficients with low energy, and thus compress the image data.)
2. High-definition television (HDTV) systems have reinvigorated the image-coding field. (TV images correlate more highly in the time domain than in the spatial domain. Such time correlation permits even higher compression than we can achieve with still-image coding.)
3. Adaptive cosine transform coding [Chen, 1977] produces high-quality compressed images at less than 1 bit/pixel.
1) Classifies subimages into four classes according to their AC energy level and encodes each class with a different bit map.
2) Assigns more bits to a subimage if the subimage contains much detail (large AC energy), and fewer bits if it contains less detail (small AC energy).
3) DC energy refers to the constant background intensity in an image and behaves as an average.
4) AC energy measures intensity deviations about the background DC average. So the AC energy behaves as a sample-variance statistic.
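The distinction between DC and AC energy can be illustrated directly on pixel values. The sketch below is a pixel-domain stand-in: the coding scheme itself works on DCT coefficients, but for an orthonormal transform the sum of squared deviations from the mean equals the energy of all coefficients other than the DC term. The function names and the two sample blocks are hypothetical.

```python
def dc_level(block):
    """Average intensity of a subimage (the constant background level)."""
    pixels = [v for row in block for v in row]
    return sum(pixels) / len(pixels)


def ac_energy(block):
    """Sum of squared deviations of the pixels from the DC level."""
    mean = dc_level(block)
    return sum((v - mean) ** 2 for row in block for v in row)


# Two illustrative 2 x 2 subimages with the same DC level.
flat_block = [[10, 10], [10, 10]]      # no detail
detailed_block = [[0, 20], [20, 0]]    # strong detail

print(dc_level(flat_block), ac_energy(flat_block))
print(dc_level(detailed_block), ac_energy(detailed_block))
```

Both blocks share the same average, yet only the second has nonzero AC energy, so an adaptive coder would spend more bits on it.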
Figure 10.1 Block diagram of adaptive cosine transform coding: X → DCT → coding (with subimage classification) → decoding → DCT^-1 → X'.
4. Selection of quantizing fuzzy-set values
1) Use percentage-scaled values of Ti and Li, scaled by the maximum possible AC power value.
2) Compute the maximum AC power Tmax from the DCT coefficients of the subimage filled with random numbers from 0 to 255.
3) Calculate the arithmetic average AC powers Tj and Lj for each class.
ADAPTIVE FAM SYSTEMS FOR TRANSFORM CODING
1. Classified subimages into four fuzzy classes B: HI, MH, ML, LO. (Encode the HI subimage with more bits and the LO subimage with fewer bits.)
2. The four fuzzy sets BG, MD, SL, and VS quantized the total AC power T of a subimage.
3. L (low-frequency AC power): assumed only the two fuzzy-set values SM and LG.
4. Fuzzy transform image coding uses common-sense fuzzy rules for subimage classification.
1) Fuzzy associative memory (FAM) rules encode structured knowledge as fuzzy associations.
2) The fuzzy association (Ai, Bi) represents the linguistic rule “IF X is Ai, THEN Y is Bi.”
3) In fuzzy transform image coding, Ai represents the AC energy distribution of a subimage, and Bi denotes its class membership.
4) Product-space clustering estimates FAM rules from training data generated by the Chen system.
5) The resulting FAM system estimates the nonlinear subimage classification function f: E→m, where E denotes the AC energy distribution of a subimage, and m denotes the class membership of a subimage.
6) We added a FAM rule to the FAM system if a DCL-trained synaptic vector fell in the FAM cell. (DCL-based product-space clustering estimated the five FAM rules (1, 2, 6, 7, and 8). We added three common-sense FAM rules (3, 4, and 5) to cover the whole input space.)
7) FAM rule 1 (BG, LG; HI) represents the association: IF the total AC power T is BG AND the low-frequency AC power L is LG, THEN encode the subimage with the class B corresponding to HI.
8) The Chen system sorts subimages according to their AC-energy content to produce the subimage-classification mapping. (This requires comparatively heavy computation.)
9) The FAM system does not sort subimages. Once we have trained the FAM system, it classifies subimages with almost no computation. (FAM only adds and multiplies comparatively few real numbers.)
5. Product-Space Clustering to Estimate FAM Rules
1) Product-space clustering with competitive learning adaptively quantizes pattern clusters in the input-output product space Rn.
2) Stochastic competitive learning systems are neural adaptive vector quantization (AVQ) systems.
• p neurons compete for the activation induced by randomly sampled input-output patterns.
• The corresponding synaptic fan-in vectors mj adaptively quantize the pattern space Rn.
• The p synaptic vectors mj define the p columns of a synaptic connection matrix M.
3) Fuzzy rules (Ti, Li; Bi) define cluster or FAM cells in the input-output product-space R3.
4) Define FAM-cell edges with the nonoverlapping intervals of the fuzzy-set values. (There are 32 possible FAM cells in total and thus 32 possible FAM rules.)
5) Differential competitive learning (DCL) classified each of the 256 input-output data vectors generated from the Chen system into one of the 32 FAM cells.
6. Simulation: Lenna image → F-16 image
1) FAM also performed well for the F-16 image.
2) When we encode multiple images with fixed bit maps, we cannot optimize or tune the bit maps to a specific image.
3) FAM encoding performed slightly better (had a larger signal-to-noise ratio) than did Chen encoding and maintained a slightly higher compression ratio (fewer bits/pixel).
4) FAM reduces side information and uses only 8 FAM rules to achieve 16-to-1 image compression.
5) If a system leaves numerical I/O footprints in the data, an AFAM system can leave similar footprints in similar contexts. Judicious fuzzy engineering can then refine the system and sharpen the footprints.