TS Modeling Based on GMDH and Its Application
Changzheng He
Dept. of Management Science, Sichuan University, P.R. China
Fuzzy modeling
☆ Two main types of fuzzy modeling
——Mamdani Type
——TS Type
Self-organizing Fuzzy Rule Induction
GMDH + Mamdani-type fuzzy model = FRI
[Figure: GMDH algorithm network with an initial organisation and layers 1, 2 and 3; legend: selected neuron, not selected neuron, best models.]
Self-organizing Fuzzy Rule Induction
J.A. Mueller, F. Lemke
fuzzification
FRI in marketing
☆ Extract features from data automatically
☆ Form fuzzy models similar to natural language
TS model
Takagi-Sugeno fuzzy model
☆ Proposed by the Japanese researchers Takagi and Sugeno in 1985.
☆ Widely used in control and prediction
Basic form of TS model
☆ Consists of several if-then rules; each rule has the following form:
$$R^k:\ \text{If } x_1 \text{ is } A_1^k \text{ and } \dots \text{ and } x_m \text{ is } A_m^k,\ \text{then } y = C_0^k + C_1^k x_1 + \dots + C_m^k x_m$$
where $x_i$ and $y$ are the input and output variables, and $A_i^k$ is a fuzzy set defined on the input variable $x_i$.
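A minimal sketch of how a TS model of this form could be evaluated. The slides do not state the inference formula at this point, so the sketch assumes the usual choices: "and" is taken as a product of membership grades, and the rule outputs are combined by a firing-strength-weighted average. The helper names `bell` and `ts_predict` and the bell parameterization are illustrative assumptions, not taken from the source.

```python
import numpy as np

def bell(center, width=1.0, slope=2.0):
    """Return a bell-shaped membership function (assumed parameterization)."""
    return lambda x: 1.0 / (1.0 + abs((x - center) / width) ** (2 * slope))

def ts_predict(x, rules):
    """Evaluate a TS model at one input vector x.

    Each rule is a pair (sets, coef):
      sets : one membership function per input variable (the A_i^k)
      coef : [C_0, C_1, ..., C_m], the linear consequent of the rule
    """
    x = np.asarray(x, dtype=float)
    strengths, outputs = [], []
    for sets, coef in rules:
        grades = [mf(xi) for mf, xi in zip(sets, x)]
        strengths.append(np.prod(grades))        # "and" assumed to be a product
        coef = np.asarray(coef, dtype=float)
        outputs.append(coef[0] + coef[1:] @ x)   # C_0 + C_1 x_1 + ... + C_m x_m
    strengths = np.asarray(strengths)
    # Weighted average of the rule outputs (assumed defuzzification)
    return float(strengths @ np.asarray(outputs) / strengths.sum())

# Two one-input rules placed symmetrically around x = 0
rules = [
    ([bell(-1.0)], [0.0, 1.0]),   # if x is "low"  then y = 0 + 1*x
    ([bell(1.0)], [2.0, 1.0]),    # if x is "high" then y = 2 + 1*x
]
print(ts_predict([0.0], rules))  # 1.0
```

At x = 0 both rules fire equally, so the prediction is the midpoint of the two rule outputs.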
TS fuzzy model
Advantage of TS model
☆ Approximates complex nonlinear systems with fewer rules and high modeling accuracy
TS-GMDH
GMDH + TS-type fuzzy model = TS-GMDH
Steps of algorithm
(1) Fuzzification of variables and data division
The data are divided into training set A, test set B and validation set N.
Bell-shaped membership functions are used.
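The fuzzification step can be sketched as follows. The slides only say that the membership functions are bell-shaped; the generalized bell form, the evenly spaced centers and the parameter names `a`, `b`, `c`, `m` below are assumptions made for illustration.

```python
import numpy as np

def bell_mf(x, a, b, c):
    """Generalized bell membership function: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + np.abs((np.asarray(x, dtype=float) - c) / a) ** (2 * b))

def fuzzify(x, m=3, b=2.0):
    """Map one variable to m membership columns with evenly spaced centers.

    Centers span [min(x), max(x)] (m >= 2 is required); the half-width a is
    half the spacing between neighbouring centers, so adjacent sets cross
    at grade 0.5.
    """
    x = np.asarray(x, dtype=float)
    centers = np.linspace(x.min(), x.max(), m)
    a = (centers[1] - centers[0]) / 2.0
    return np.column_stack([bell_mf(x, a, b, c) for c in centers])

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
grades = fuzzify(x, m=3)
print(grades.shape)  # (5, 3)
```

Each input variable becomes m columns of membership grades, one per fuzzy set, which are the inputs that the later steps combine in pairs.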
Steps of algorithm
(2) Forming the first-generation TS models
Input fuzzy sets are combined in pairs to form the first-generation TS models:
$$R_l^1:\ \begin{cases}\text{if } x_i \text{ is } A_i^{s_i} \text{ then } y_1^1 = a_0^1 + a_i^1 x_i \\ \text{if } x_j \text{ is } A_j^{s_j} \text{ then } y_2^1 = b_0^1 + b_j^1 x_j\end{cases}$$
……
Steps of algorithm
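In each TS fuzzy rule, the parameters a and b are estimated by ordinary least squares on the training set A.
$$R_l^1:\ \begin{cases}\text{if } x_i \text{ is } A_i^{s_i} \text{ then } y_1^1 = a_0^1 + a_i^1 x_i \\ \text{if } x_j \text{ is } A_j^{s_j} \text{ then } y_2^1 = b_0^1 + b_j^1 x_j\end{cases}$$
$$i, j = 1, 2, \dots, n,\ i \neq j;\quad l = 1, 2, \dots, C_{mn}^2$$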
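One way the least-squares estimation could look in practice. The slides do not say how the two rules of a pair are fitted, so this sketch assumes a joint fit in which the model output is the firing-strength-weighted average of the two rule outputs; that expression is linear in the parameters, so ordinary least squares applies. The function name `fit_ts_pair` and its arguments are hypothetical.

```python
import numpy as np

def fit_ts_pair(xi, xj, g1, g2, y):
    """Least-squares fit of a two-rule TS model on the training set.

    xi, xj : values of the two input variables
    g1, g2 : firing strengths of the two rules (membership grades)
    y      : observed output
    Returns (a, b) with a = [a_0, a_i] and b = [b_0, b_j].
    """
    w1 = g1 / (g1 + g2)            # normalized firing strengths
    w2 = g2 / (g1 + g2)
    # Model output: w1*(a_0 + a_i*xi) + w2*(b_0 + b_j*xj), linear in a and b
    design = np.column_stack([w1, w1 * xi, w2, w2 * xj])
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta[:2], theta[2:]
```

Fitting each rule separately on the samples where it fires most strongly would be another reasonable reading of the slide; the joint fit is chosen here only because it keeps the sketch short.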
Steps of algorithm
(3) Model selection
The F best TS models are selected on the test set B by the regularity criterion
$$\sum_{B}\left(y - \left(\frac{G_1}{G_1 + G_2}\,\hat{y}_1^1 + \frac{G_2}{G_1 + G_2}\,\hat{y}_2^1\right)\right)^2$$
where $G_1 = A_i^{s_i}(x_i)$ and $G_2 = A_j^{s_j}(x_j)$ are the firing strengths of the two rules, and $\hat{y}_1^1$ and $\hat{y}_2^1$ are the predicted outputs of the two rules.
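The selection step can be sketched as below: the criterion is the squared error, on test set B, of the firing-strength-weighted prediction, and the F candidates with the smallest values are kept. The function names `regularity` and `select_best` are illustrative, and the arrays are assumed to hold one entry per sample of B.

```python
import numpy as np

def regularity(y, g1, g2, y1_hat, y2_hat):
    """Squared error on test set B of the firing-strength-weighted prediction."""
    y_hat = (g1 * y1_hat + g2 * y2_hat) / (g1 + g2)
    return float(np.sum((y - y_hat) ** 2))

def select_best(scores, F):
    """Indices of the F candidates with the smallest criterion values."""
    return np.argsort(scores)[:F].tolist()

# Toy check: equal firing strengths average the two rule outputs
y = np.array([1.0, 2.0])
g = np.array([1.0, 1.0])
print(regularity(y, g, g, np.array([0.0, 2.0]), np.array([2.0, 2.0])))  # 0.0
print(select_best([0.3, 0.1, 0.2], F=2))  # [1, 2]
```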
Steps of algorithm
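(4) Rules fusion
The F best TS models are merged into F rules. Each selected pair
$$R_l^1:\ \begin{cases}\text{if } x_i \text{ is } A_i^{s_i} \text{ then } y_1^1 = a_0^1 + a_i^1 x_i \\ \text{if } x_j \text{ is } A_j^{s_j} \text{ then } y_2^1 = b_0^1 + b_j^1 x_j\end{cases}$$
$$i, j = 1, 2, \dots, n,\ i \neq j;\quad l = 1, 2, \dots, C_{mn}^2$$
is merged into a single rule
$$R_l^1:\ \text{if } x_i \text{ is } A_i^{s_i} \text{ and } x_j \text{ is } A_j^{s_j},\ \text{then } y_l^1 = a_0^1 + a_i^1 x_i + a_j^1 x_j,\quad l = 1, \dots, F$$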
Steps of algorithm
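(5) Forming the second-generation TS models
The F best rules
$$R_l^1:\ \text{if } x_i \text{ is } A_i^{s_i} \text{ and } x_j \text{ is } A_j^{s_j},\ \text{then } y_l^1 = a_0^1 + a_i^1 x_i + a_j^1 x_j,\quad l = 1, \dots, F$$
are combined in pairs to form the second-generation models:
$$R_l^2:\ \begin{cases}\text{if } x_i \text{ is } A_i^{s_i} \text{ and } x_j \text{ is } A_j^{s_j} \text{ then } y_1^2 = a_0^2 + a_i^2 x_i + a_j^2 x_j \\ \text{if } x_k \text{ is } A_k^{s_k} \text{ and } x_h \text{ is } A_h^{s_h} \text{ then } y_2^2 = b_0^2 + b_k^2 x_k + b_h^2 x_h\end{cases}$$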
Steps of algorithm
(6) Iteration of the algorithm
The generations are repeated until the external criterion indicates that the algorithm should stop.
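A skeleton of the outer loop, under the assumption that the stopping rule is the usual GMDH one: stop as soon as the best external-criterion value of a generation no longer improves on the previous generation. The slide only shows "External Criterion" and "stop", so that rule, the `max_generations` cap and all names here are assumptions.

```python
def run_generations(first_generation, next_generation, criterion, F, max_generations=20):
    """Iterate select -> recombine until the external criterion stops improving.

    first_generation : list of candidate models
    next_generation  : callable mapping the F survivors to new candidates
    criterion        : callable returning the external criterion (lower is better)
    """
    candidates = list(first_generation)
    best_model, best_score = None, float("inf")
    for _ in range(max_generations):
        survivors = sorted(candidates, key=criterion)[:F]
        if not survivors:                 # nothing left to combine
            break
        score = criterion(survivors[0])
        if score >= best_score:           # no improvement: stop
            break
        best_model, best_score = survivors[0], score
        candidates = next_generation(survivors)
    return best_model, best_score
```

In a TS-GMDH setting, `criterion` would be the regularity criterion on test set B and `next_generation` would pair the surviving rules; both are passed in as callables so that the loop itself stays independent of those details.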
Network of TS-GMDH modeling
[Figure: network of TS-GMDH modeling. The initial inputs x_1, ..., x_i, ..., x_n are fuzzified into x_11, ..., x_m1, ..., x_1i, ..., x_mi, ..., x_1n, ..., x_mn. These feed the 1st-generation TS models, followed by selection of the 1st generation, the 2nd-generation TS models and their selection, and so on up to the kth-generation TS models and their selection. A weighted average gives the output y.]
Simulation Experiment
12 benchmark data sets from UCI

Data set          Number of samples   Number of attributes
Credit            1000                20
Pima              768                 8
Haberman          306                 3
Endgame           958                 9
Echocardiogram    132                 12
Hepatitis         155                 19
MAGIC             19020               10
monks-1.train     432                 7
monks-1.test      432                 7
Mass              961                 5
Breast Cancer_D   569                 31
Breast Cancer_P   198                 33
Experiment Results
Data set          FRI      TS-GMDH
Credit            70.18%   72.42%
Pima              71.33%   75.30%
Haberman          52.58%   73.53%
Endgame           67.64%   69.13%
Echocardiogram    87.27%   96.12%
Hepatitis         84.50%   84.52%
MAGIC             67.79%   79.38%
monks-1.train     74.42%   80.08%
monks-1.test      74.36%   80.57%
Mass              79.58%   81.76%
Breast Cancer_D   90.56%   94.48%
Breast Cancer_P   76.24%   75.19%
Simulation Experiment
Conclusion:
☆ TS-GMDH has better accuracy on 11 of the 12 data sets;
☆ In the one exception the difference is not statistically significant, which means TS-GMDH is not worse than FRI
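The count in the conclusion can be checked directly against the accuracies reported in the results table. The snippet below only re-tallies those reported numbers; it does not rerun any experiment and says nothing about statistical significance.

```python
# Accuracy (%) copied from the results table: (FRI, TS-GMDH)
results = {
    "Credit": (70.18, 72.42),
    "Pima": (71.33, 75.30),
    "Haberman": (52.58, 73.53),
    "Endgame": (67.64, 69.13),
    "Echocardiogram": (87.27, 96.12),
    "Hepatitis": (84.50, 84.52),
    "MAGIC": (67.79, 79.38),
    "monks-1.train": (74.42, 80.08),
    "monks-1.test": (74.36, 80.57),
    "Mass": (79.58, 81.76),
    "Breast Cancer_D": (90.56, 94.48),
    "Breast Cancer_P": (76.24, 75.19),
}

wins = [name for name, (fri, ts) in results.items() if ts > fri]
losses = [name for name in results if name not in wins]
print(len(wins), "of", len(results))  # 11 of 12
print(losses)                         # ['Breast Cancer_P']
```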
Empirical research
Feature extraction of cigarette market
Problem description
Extract features of two segments: heavy smokers and mild smokers
Data size: 150 samples and 50 variables
Empirical research
Method     Heavy smoker          Mild smoker
           N        M (A+B)      N        M (A+B)
FRI        83.33%   70%          83.33%   70%
TS-GMDH    94%      93.33%       89.33%   87.33%

TS-GMDH has better accuracy on both the modeling set M and the validation set N.