Fuzzy Decision Trees — Soft Decision Trees
Crisp Decision Tree
• Data preprocessing: attribute values and classes are crisp; continuous-valued attributes must be discretized; each attribute value (term) is a classical set over the attribute space.
• Tree generation: each node of the decision tree is a classical subset of the attribute space; each path from the root to a leaf corresponds to one crisp rule; nodes on the same level have empty pairwise intersections.
• Matching: a test example matches exactly one path.
• Applicability: suited to small and medium-sized databases with symbolic attribute values, clear-cut classes, and little noise.
Fuzzy Decision Tree
• Data preprocessing: attribute values and classes are fuzzy; continuous-valued attributes must be fuzzified; each attribute value (term) is a fuzzy set over the attribute space.
• Tree generation: each node of the decision tree is a fuzzy subset of the attribute space; each path from the root to a leaf corresponds to one fuzzy rule; nodes on the same level generally have non-empty intersections.
• Matching: a test example can approximately match several paths.
• Applicability: suited to databases of all kinds, especially those whose attributes and classes are strongly fuzzy, or which contain noise.
Fuzzy set theory
• A fuzzy set describes the concept of a vague object. The basic idea is to relax the absolute (all-or-nothing) membership relation of classical sets into a graded one.
• Let U be a collection of objects denoted generically by {u}. U is called the universe of discourse and u represents the generic element of U.
• Definition 1. A fuzzy set A in a universe of discourse U is characterized by a membership function μA which takes values in the interval [0, 1]. For u ∈ U, μA(u) = 1 means that u is definitely a member of A, μA(u) = 0 means that u is definitely not a member of A, and 0 < μA(u) < 1 means that u is partially a member of A. If either μA(u) = 0 or μA(u) = 1 for all u ∈ U, A is a crisp set.
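As a concrete illustration of Definition 1 (the fuzzy set "tall" and its thresholds are hypothetical, chosen only for this sketch), a piecewise-linear membership function over heights:

```python
def mu_tall(height_cm: float) -> float:
    """Membership degree of a height in the hypothetical fuzzy set "tall".

    Below 160 cm: definitely not a member (0); above 185 cm: definitely a
    member (1); in between: partial membership, rising linearly.
    """
    if height_cm <= 160.0:
        return 0.0
    if height_cm >= 185.0:
        return 1.0
    return (height_cm - 160.0) / (185.0 - 160.0)

print(mu_tall(150))  # 0.0 -> definitely not a member
print(mu_tall(190))  # 1.0 -> definitely a member
print(mu_tall(170))  # 0.4 -> partially a member
```

Replacing the linear ramp by a step at 160 cm would make every degree 0 or 1, i.e. a crisp set.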
Soft Decision Tree
A variant of classical decision tree inductive learning using fuzzy set theory
Soft decision trees vs. crisp regression trees
[Figure: example path T1 → T2 → D5 through the tree]
Crisp regression tree
• Each test node uses a single threshold and has two possible answers: yes or no (left or right).
• The objects are split into two (in our case of binary trees) non-overlapping subregions.
An object reaching leaf L4 with membership degree 0.43 and leaf L5 with membership degree 0.57 (leaf labels 0.44 and 1.00) obtains the aggregated output

1.0 · 0.43 · label(L4) + 1.0 · 0.57 · label(L5) = 1.0 · 0.43 · 0.44 + 1.0 · 0.57 · 1.00 = 0.76
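The aggregation above (membership degrees 0.43/0.57 and leaf labels 0.44/1.00, as in the slide's example) can be reproduced directly:

```python
# Each reached leaf contributes its label weighted by the object's
# membership degree at that leaf; the sum is the defuzzified output.
leaves = [
    (0.43, 0.44),  # (membership degree at L4, label of L4)
    (0.57, 1.00),  # (membership degree at L5, label of L5)
]
output = sum(mu * label for mu, label in leaves)
print(round(output, 2))  # 0.76
```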
Soft decision tree
• Discriminator function
– piecewise linear (widely used)
– two parameters:
• α: corresponds to the split threshold in a test node of a decision or regression tree
• β: the width, i.e. the degree of spread that defines the transition region on the attribute chosen in that node
– the objects are split (fuzzy partitioned) into two overlapping subregions
– an object reaches multiple terminal nodes; the output estimations given by all these terminal nodes are aggregated through some defuzzification scheme in order to obtain the final estimated membership degree to the target class.
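A minimal sketch of a piecewise-linear discriminator with threshold α and transition width β (the function name and the convention that the left branch gets value 1 below the threshold are mine):

```python
def discriminator(a: float, alpha: float, beta: float) -> float:
    """Piecewise-linear discriminator v(a, alpha, beta) -> [0, 1].

    Returns the degree to which attribute value a belongs to the left
    subregion: 1 well below the threshold alpha, 0 well above it, with
    a linear transition of width beta centred on alpha.
    """
    if beta == 0.0:                      # degenerate case: crisp split
        return 1.0 if a <= alpha else 0.0
    if a <= alpha - beta / 2.0:
        return 1.0
    if a >= alpha + beta / 2.0:
        return 0.0
    return (alpha + beta / 2.0 - a) / beta

# Objects inside the transition region get strictly positive degrees
# for both successors, so they are propagated down both branches.
print(discriminator(5.0, alpha=5.0, beta=2.0))  # 0.5: exactly at the threshold
```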
Building a soft decision tree
GS: growing set; PS: pruning set; LS: learning set; TS: test set. LS = GS ∪ PS.
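The partition LS = GS ∪ PS can be sketched as a random split (the 70/30 ratio is an illustrative choice, not prescribed by the slides):

```python
import random

def split_learning_set(ls, growing_fraction=0.7, seed=0):
    """Partition the learning set LS into GS (growing) and PS (pruning).

    The two parts are disjoint and their union is LS.
    """
    rng = random.Random(seed)
    shuffled = ls[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * growing_fraction)
    return shuffled[:cut], shuffled[cut:]   # GS, PS

gs, ps = split_learning_set(list(range(10)))
assert sorted(gs + ps) == list(range(10))   # LS = GS ∪ PS
```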
Soft tree semantics
• A fuzzy set S is split into two fuzzy subsets: SL, the left one, and SR, the right one.
• Discriminator function: v(a(o), α, β, γ…) → [0, 1]
• Membership degree of an object o to the left successor's subset SL:
– μSL(o) = μS(o) · v(a(o), α, β)
– the object is propagated to SL only if this degree is strictly positive
• Membership degree of an object o to the right successor's subset SR:
– μSR(o) = μS(o) · (1 − v(a(o), α, β))
– the object is propagated to SR only if this degree is strictly positive
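The two propagation rules above can be written directly (the function name is mine; v stands for the discriminator value v(a(o), α, β)):

```python
def propagate(mu_S: float, v: float):
    """Split an object's membership mu_S at a node between the successors.

    v is the discriminator value v(a(o), alpha, beta) in [0, 1].
    The two resulting degrees always sum back to mu_S.
    """
    mu_SL = mu_S * v            # left successor
    mu_SR = mu_S * (1.0 - v)    # right successor
    return mu_SL, mu_SR

left, right = propagate(1.0, 0.43)
print(left, right)  # 0.43 0.57
assert abs(left + right - 1.0) < 1e-12
```

When 0 < v < 1, both degrees are strictly positive and the object travels down both branches, which is exactly why a test object can reach several leaves.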
• j: a node of the tree
• Lj: the numerical value (or label) attached to node j
• Sj: the fuzzy subset corresponding to this node
SDT growing
• a method to select a (fuzzy) split at every new node of the tree
• a rule for determining when a node should be considered terminal
• a rule for assigning a label to every identified terminal node
Automatic fuzzy partitioning of a node
Objective: given S, a fuzzy set in a soft decision tree, choose an attribute a(·), a threshold α and a width β, together with the successor labels LL and LR, so as to minimize the squared error function.
Strategy:
• Searching for the attribute and split location. With β fixed to 0 (crisp split), we search among all attributes for the attribute a(·) yielding the smallest crisp error E_S, its optimal crisp split threshold α, and the corresponding (provisional) successor labels LL and LR, by using crisp heuristics adapted from CART regression trees.
• Fuzzification and labeling. With the optimal attribute a(·) and threshold α kept frozen, we search for the optimal width β by Fibonacci search; for every candidate value of β, the two successor labels LL and LR are automatically updated by explicit linear regression formulas.
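The width search is a one-dimensional minimisation over β. As a sketch, the snippet below uses golden-section search (a close relative of the Fibonacci search named above, chosen here for brevity) on a stand-in error curve; in the real algorithm f(β) would be the squared error after relabeling the successors:

```python
import math

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                  # minimum lies in [a, d_old]
            c = b - phi * (b - a)
        else:
            a, c = c, d                  # minimum lies in [c_old, b]
            d = a + phi * (b - a)
    return (a + b) / 2.0

# Stand-in squared-error curve over beta, with its minimum at beta = 1.5.
best_beta = golden_section_min(lambda beta: (beta - 1.5) ** 2, 0.0, 5.0)
print(round(best_beta, 3))  # 1.5
```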
SDT pruning
• Objective: given a complete SDT and a pruning sample of objects PS, find the subtree of the given SDT with the best mean absolute error (MAE) on the pruning set, among all subtrees that could be generated from the complete SDT.
Strategy:
• Subtree sequence generation. At each step, the first node in the list of candidate nodes is removed and contracted, and the resulting tree is stored in the tree sequence. Finally, we obtain a sequence of trees in decreasing order of complexity.
SDT pruning
Strategy:
• Best subtree selection.
– Use the "one-standard-error rule" to select a tree from the pruning sequence.
– Use the PS to get an unbiased estimate of the MAE, together with its standard error estimate.
– Select among the trees not the one of minimal MAE, but rather the smallest tree in the sequence whose MAE is within one standard error of the minimum.
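The one-standard-error rule can be sketched as follows (tree sizes, MAE and standard-error values are made up for illustration):

```python
def one_standard_error_rule(trees):
    """trees: list of (n_nodes, mae, se) tuples from the pruning sequence.

    Pick the least complex tree whose MAE is within one standard error
    of the minimal MAE observed on the pruning set.
    """
    best_mae, best_se = min((mae, se) for _, mae, se in trees)
    threshold = best_mae + best_se
    # Among trees under the threshold, return the one with fewest nodes.
    return min((t for t in trees if t[1] <= threshold), key=lambda t: t[0])

sequence = [(31, 0.100, 0.01), (15, 0.095, 0.01), (7, 0.102, 0.01), (3, 0.140, 0.01)]
print(one_standard_error_rule(sequence))  # (7, 0.102, 0.01)
```

Here the 7-node tree wins even though the 15-node tree has the lowest MAE, because 0.102 ≤ 0.095 + 0.01.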
SDT tuning
• Refitting
– optimizes only the terminal-node parameters
– based on linear least squares
• Backfitting
– optimizes all free parameters of the model
– based on a Levenberg-Marquardt non-linear optimization technique
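Refitting via linear least squares can be sketched as follows: with the tree structure and the membership degrees frozen, the terminal-node labels are the solution of an ordinary least-squares problem. The data below is synthetic, constructed so that the true labels are 0.44 and 1.00:

```python
import numpy as np

# Rows: objects; columns: leaves.
# M[i, j] = membership degree of object i in leaf j (frozen during refitting).
M = np.array([
    [0.43, 0.57],
    [0.90, 0.10],
    [0.20, 0.80],
])
# Target membership degrees to the class, here generated as M @ [0.44, 1.00].
y = np.array([0.7592, 0.496, 0.888])

# Leaf labels minimising sum_i (y_i - sum_j M[i, j] * L_j)^2.
labels, *_ = np.linalg.lstsq(M, y, rcond=None)
print(np.round(labels, 2))
```

Because the aggregated output is linear in the leaf labels, this step has a closed-form solution; backfitting, which also moves the α and β parameters inside the tree, makes the problem non-linear and hence needs Levenberg-Marquardt.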
Empirical results
References
• Cristina Olaru, Louis Wehenkel. A complete fuzzy decision tree technique. Fuzzy Sets and Systems 138 (2003) 221–254.
• Yufei Yuan, Michael J. Shaw. Induction of fuzzy decision trees. Fuzzy Sets and Systems 69 (1995) 125–139.
• Wang Xizhao, Sun Juan, Yang Hongwei, Zhao Minghua. A comparative study of fuzzy and crisp decision tree algorithms. Computer Engineering and Applications, 2003(21).