![Page 1: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/1.jpg)
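CCF Academic Salon on the Application and Development of Bayesian Networks in China
Bayesian Network Theory Research and Applications at the Hong Kong University of Science and Technology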
2012-05-22
![Page 2: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/2.jpg)
Overview
Early Work (1992-2002)
Inference: Variable Elimination
Inference: Local Structures
Others: Learning, Decision Making, Book
Latent Tree Models (2000 - )
Theory and Algorithms
Applications: Multidimensional Clustering, Density Estimation, Latent Structure
Survey Data, Documents, Business Data
Traditional Chinese Medicine (TCM)
Extensions
![Page 3: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/3.jpg)
Bayesian Networks
![Page 4: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/4.jpg)
Variable Elimination
Papers:
N. L. Zhang and D. Poole (1994). A simple approach to Bayesian network computations. In Proc. of the 10th Canadian Conference on Artificial Intelligence, Banff, Alberta, Canada, May 16-22.
N. L. Zhang and D. Poole (1996). Exploiting causal independence in Bayesian network inference. Journal of Artificial Intelligence Research, 5: 301-328.
Idea
![Page 5: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/5.jpg)
Variable Elimination
![Page 6: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/6.jpg)
Variable Elimination
![Page 7: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/7.jpg)
Variable Elimination
First BN inference algorithm presented in Russell & Norvig's textbook
Russell & Norvig wrote on page 529: “The algorithm we describe is closest to that developed by Zhang and Poole (1994, 1996)”
Koller and Friedman wrote: “… the variable elimination algorithm, as presented here, first described by Zhang and Poole (1994), …”
The K&F book cites 7 of our papers
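The figures that illustrated the idea are not preserved in this transcript. As a rough sketch of variable elimination in general (not the authors' implementation), the code below multiplies the factors that mention a variable, sums that variable out, and repeats. The factor representation, the function names, and the chain A -> B -> C with its numbers are all assumptions made for illustration.

```python
from itertools import product

# A factor is (variables, table): `variables` is a tuple of names and
# `table` maps each tuple of 0/1 values (one per variable) to a number.

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out_tab = {}
    for values in product((0, 1), repeat=len(out_vars)):
        row = dict(zip(out_vars, values))
        out_tab[values] = (f_tab[tuple(row[v] for v in f_vars)]
                           * g_tab[tuple(row[v] for v in g_vars)])
    return out_vars, out_tab

def sum_out(var, f):
    """Marginalize `var` out of factor f."""
    f_vars, f_tab = f
    keep = tuple(v for v in f_vars if v != var)
    out_tab = {}
    for values, p in f_tab.items():
        key = tuple(x for v, x in zip(f_vars, values) if v != var)
        out_tab[key] = out_tab.get(key, 0.0) + p
    return keep, out_tab

def eliminate(factors, order):
    """Eliminate variables in `order`; return the product of what remains."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        rest = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(var, prod)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Hypothetical chain A -> B -> C with made-up conditional probability tables.
p_a = (("A",), {(0,): 0.6, (1,): 0.4})
p_b_given_a = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
p_c_given_b = (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5})

# P(C) is obtained by eliminating A, then B.
p_c = eliminate([p_a, p_b_given_a, p_c_given_b], order=["A", "B"])
```

Each elimination step only touches the factors that mention the eliminated variable, which is why the full joint table is never built.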
![Page 8: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/8.jpg)
Local Structure
![Page 9: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/9.jpg)
Local Structures: Causal Independence
Papers:
N. L. Zhang and D. Poole (1996). Exploiting causal independence in Bayesian network inference. Journal of Artificial Intelligence Research, 5: 301-328.
N. L. Zhang and D. Poole (1994). Intercausal independence and heterogeneous factorization. In Proc. of the 10th Conference on Uncertainty in Artificial Intelligence, Seattle, USA, July 29-31.
![Page 10: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/10.jpg)
Local Structures: Causal Independence
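The slide content here was a figure that the transcript does not preserve. As a hedged illustration of what causal independence buys, the sketch below uses the noisy-OR gate, a familiar special case: each cause contributes to the effect independently, so a conditional probability table with 2^n rows is determined by only n numbers. The function name and the activation probabilities are assumptions made for illustration.

```python
from itertools import product

def noisy_or(activation, causes_present):
    """P(effect = 1 | causes) under a noisy-OR gate.
    activation[i] is the probability that cause i alone produces the effect."""
    p_all_fail = 1.0
    for q, present in zip(activation, causes_present):
        if present:
            p_all_fail *= 1.0 - q
    return 1.0 - p_all_fail

activation = [0.8, 0.6, 0.3]   # made-up numbers: 3 parameters in total
# The full table over 3 binary causes has 8 rows, all derived from 3 numbers.
full_cpt = {c: noisy_or(activation, c) for c in product((0, 1), repeat=3)}
```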
![Page 11: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/11.jpg)
Local Structure: Context Specific Independence
Papers:
N. L. Zhang and D. Poole (1999). On the role of context-specific independence in probabilistic reasoning. IJCAI-99, 1288-1293.
D. Poole and N. L. Zhang (2003). Exploiting contextual independence in probabilistic inference. Journal of Artificial Intelligence Research, 18: 263-313.
![Page 12: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/12.jpg)
Other Works
Parameter Learning
N. L. Zhang (1996). Irrelevance and parameter learning in Bayesian networks. Artificial Intelligence, An International Journal, 88: 359-373.
Decision Making
N. L. Zhang (1998). Probabilistic inference in influence diagrams. Computational Intelligence, 14(4): 475-497.
N. L. Zhang, R. Qi and D. Poole (1994). A computational theory of decision networks. International Journal of Approximate Reasoning, 11(2): 83-158. (PhD thesis)
![Page 13: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/13.jpg)
Other Works
![Page 14: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/14.jpg)
![Page 15: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/15.jpg)
Latent Tree Models: Overview
Concept first mentioned by Pearl (1988)
We were the first to conduct systematic research on LTMs.
N. L. Zhang (2002). Hierarchical latent class models for cluster analysis. AAAI-02, 230-237.
N. L. Zhang (2004). Hierarchical latent class models for cluster analysis. Journal of Machine Learning Research, 5(6): 697-723.
Earlier followers:
Aalborg University (Denmark), Norwegian University of Science and Technology
Recent papers from:
MIT, CMU, USC, Georgia Tech, Edinburgh
![Page 16: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/16.jpg)
Latent Tree Models
Recent survey by French researcher:
![Page 17: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/17.jpg)
Latent Tree Models (LTM)
Bayesian networks with
Rooted tree structure
Discrete random variables
Leaves observed (manifest variables)
Internal nodes latent (latent variables)
Also known as hierarchical latent class (HLC) models
Parameters: P(Y1), P(Y2|Y1), P(X1|Y2), P(X2|Y2), …
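To make the factorization concrete, here is a minimal sketch of a tiny LTM with the distributions named on the slide: P(Y1), P(Y2|Y1), P(X1|Y2) and P(X2|Y2). All variables are assumed binary and every number is made up. The distribution over the manifest variables is obtained by summing the joint over the latent variables.

```python
from itertools import product

# Made-up parameters for a tiny LTM: Y1 -> Y2 -> {X1, X2}, all binary.
p_y1 = [0.7, 0.3]                          # P(Y1)
p_y2_given_y1 = [[0.9, 0.1], [0.2, 0.8]]   # P(Y2 | Y1), rows indexed by Y1
p_x1_given_y2 = [[0.8, 0.2], [0.3, 0.7]]   # P(X1 | Y2), rows indexed by Y2
p_x2_given_y2 = [[0.6, 0.4], [0.1, 0.9]]   # P(X2 | Y2), rows indexed by Y2

def joint(y1, y2, x1, x2):
    """Joint probability as the product of the conditional distributions."""
    return (p_y1[y1] * p_y2_given_y1[y1][y2]
            * p_x1_given_y2[y2][x1] * p_x2_given_y2[y2][x2])

def manifest_marginal(x1, x2):
    """P(X1, X2): sum the joint over the latent variables Y1 and Y2."""
    return sum(joint(y1, y2, x1, x2) for y1, y2 in product((0, 1), repeat=2))

total = sum(manifest_marginal(x1, x2) for x1, x2 in product((0, 1), repeat=2))
```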
![Page 18: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/18.jpg)
Example
Manifest variables
Math Grade, Science Grade, Literature Grade, History Grade
Latent variables
Analytic Skill, Literal Skill, Intelligence
![Page 19: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/19.jpg)
Theory: Root Walking and Model Equivalence
M1: root walks to X2; M2: root walks to X3
Root walking leads to equivalent models on manifest variables
Implications:
Cannot determine edge orientation from data
Can only learn unrooted models
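The equivalence can be checked numerically on a small case. The sketch below is an illustration under assumed, made-up parameters, not the models M1 and M2 from the slide's figure: it moves the root from latent node Y1 to its latent neighbor Y2 by reversing the edge with Bayes' rule, then compares the two distributions over the manifest variables.

```python
from itertools import product

# Original model: root Y1, with Y1 -> Y2, Y1 -> X1, Y2 -> X2 (all binary).
p_y1 = [0.7, 0.3]
p_y2_given_y1 = [[0.9, 0.1], [0.2, 0.8]]
p_x1_given_y1 = [[0.8, 0.2], [0.3, 0.7]]
p_x2_given_y2 = [[0.6, 0.4], [0.1, 0.9]]

# Root walking: make Y2 the root by reversing the edge Y1 -> Y2 with Bayes' rule.
p_y2 = [sum(p_y1[a] * p_y2_given_y1[a][b] for a in (0, 1)) for b in (0, 1)]
p_y1_given_y2 = [[p_y1[a] * p_y2_given_y1[a][b] / p_y2[b] for a in (0, 1)]
                 for b in (0, 1)]   # rows indexed by Y2

def marginal_original(x1, x2):
    return sum(p_y1[a] * p_y2_given_y1[a][b]
               * p_x1_given_y1[a][x1] * p_x2_given_y2[b][x2]
               for a, b in product((0, 1), repeat=2))

def marginal_rerooted(x1, x2):
    return sum(p_y2[b] * p_y1_given_y2[b][a]
               * p_x1_given_y1[a][x1] * p_x2_given_y2[b][x2]
               for a, b in product((0, 1), repeat=2))

# Largest difference between the two manifest distributions.
max_gap = max(abs(marginal_original(x1, x2) - marginal_rerooted(x1, x2))
              for x1, x2 in product((0, 1), repeat=2))
```

Because the two models assign the same probability to every observable record, no amount of data can favor one root over the other.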
![Page 20: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/20.jpg)
Regularity
Regular latent tree models: For any latent node Z with neighbors X1, X2, …,
Can focus on regular models only
Irregular models can be made regular
Regularized models better than irregular models
The set of all regular models is finite.
![Page 21: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/21.jpg)
Effective Dimension
Standard dimension:
Number of free parameters
Effective dimension
X1, X2, …, Xn: observed variables
P(X1, X2, …, Xn) is a point in a high-D space for each value of the parameter
Spans a manifold as parameter value varies.
Effective dimension: dimension of the manifold.
Parsimonious model:
Standard dimension = effective dimension
Open question: How to test parsimony?
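One common way to estimate an effective dimension numerically is to take the rank of the Jacobian of the map from parameters to the manifest distribution at a generic parameter value. The sketch below does this for a latent class model with one binary latent variable and two binary manifest variables. The model, the parameter values, and the tolerance are assumptions for illustration, and the rank at a single point is only an estimate.

```python
def manifest_distribution(theta):
    """theta = (P(Y=1), P(X1=1|Y=0), P(X1=1|Y=1), P(X2=1|Y=0), P(X2=1|Y=1))."""
    py1, a0, a1, b0, b1 = theta
    dist = []
    for x1 in (0, 1):
        for x2 in (0, 1):
            p = 0.0
            for y, py in ((0, 1.0 - py1), (1, py1)):
                a = (a0, a1)[y]
                b = (b0, b1)[y]
                p += py * (a if x1 else 1.0 - a) * (b if x2 else 1.0 - b)
            dist.append(p)
    return dist

def jacobian(f, theta, eps=1e-6):
    """Central-difference Jacobian; rows are outputs, columns are parameters."""
    cols = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += eps
        down = list(theta)
        down[i] -= eps
        cols.append([(u - d) / (2 * eps) for u, d in zip(f(up), f(down))])
    return [list(row) for row in zip(*cols)]

def rank(matrix, tol=1e-6):
    """Rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

theta = (0.3, 0.2, 0.7, 0.4, 0.9)      # made-up generic parameter value
standard_dimension = len(theta)         # 5 free parameters
effective_dimension = rank(jacobian(manifest_distribution, theta))
```

Here the joint over two binary manifest variables has only three free probabilities, so the five-parameter model cannot be parsimonious.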
![Page 22: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/22.jpg)
Effective Dimension
Paper: N. L. Zhang and Tomas Kocka (2004). Effective dimensions of hierarchical latent class models. Journal of Artificial Intelligence Research, 21: 1-17.
Open question: Effective dimension of LTMs with one latent variable
![Page 23: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/23.jpg)
Learning Latent Tree Models
Determine:
Number of latent variables
Cardinality of each latent variable
Model Structure
Conditional probability distributions
![Page 24: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/24.jpg)
Search-Based Learning: Model Selection
Bayesian score: posterior probability P(m|D)
P(m|D) = P(m) ∫ P(D|m, θ) P(θ|m) dθ / P(D)
BIC score: large-sample approximation
BIC(m|D) = log P(D|m, θ*) - (d/2) log N
BICe score: BIC with the standard dimension d replaced by the effective dimension de
BICe(m|D) = log P(D|m, θ*) - (de/2) log N
Effective dimensions are difficult to compute
BICe not realistic
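As a small worked example of the BIC score, the sketch below compares two candidate models, where θ* is the maximum likelihood estimate, d the number of free parameters and N the sample size. The log-likelihoods and dimensions are hypothetical numbers chosen for illustration.

```python
import math

def bic(loglik, d, n):
    """BIC(m|D) = log P(D|m, theta*) - (d / 2) * log N."""
    return loglik - 0.5 * d * math.log(n)

# Hypothetical numbers: two candidate models fitted to N = 1200 samples.
n_samples = 1200
simple = bic(loglik=-5400.0, d=20, n=n_samples)
complex_ = bic(loglik=-5380.0, d=45, n=n_samples)
best = "simple" if simple > complex_ else "complex"
```

The larger model fits slightly better, but its penalty term grows with d, so the smaller model gets the higher score in this made-up case.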
![Page 25: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/25.jpg)
Search Algorithms
Papers:
T. Chen, N. L. Zhang, T. F. Liu, Y. Wang, L. K. M. Poon (2011). Model-based multidimensional clustering of categorical data. Artificial Intelligence, 176(1): 2246-2269.
N. L. Zhang and T. Kocka (2004). Efficient learning of hierarchical latent class models. ICTAI-2004.
Double hill climbing (DHC), 2002: 7 manifest variables
Single hill climbing (SHC), 2004: 12 manifest variables
Heuristic SHC (HSHC), 2004: 50 manifest variables
EAST, 2011: 100+ manifest variables
Recent fast algorithms for specific applications.
![Page 26: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/26.jpg)
Illustration of the search process
![Page 27: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/27.jpg)
Algorithms by Others
Variable clustering method
S. Harmeling and C. K. I. Williams (2011). Greedy learning of binary latent trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(6): 1087-1097.
Raphaël Mourad, Christine Sinoquet, Philippe Leray (2011). A hierarchical Bayesian network approach for linkage disequilibrium modeling and data-dimensionality reduction prior to genome-wide association studies. BMC Bioinformatics, 12:16. doi:10.1186/1471-2105-12-16.
Fast, but model quality may be poor
Adaptation of evolutionary tree algorithms
Myung Jin Choi, Vincent Y. F. Tan, Animashree Anandkumar, and Alan S. Willsky (2011). Learning latent tree graphical models. Journal of Machine Learning Research, 12: 1771-1812.
Fast, has a consistency proof, for special LTMs only
![Page 28: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/28.jpg)
![Page 29: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/29.jpg)
Density Estimation
Characteristics of LTMs:
Are computationally very simple to work with.
Can represent complex relationships among manifest variables.
Useful tool for density estimation.
![Page 30: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/30.jpg)
Density Estimation
New approximate inference algorithm for Bayesian networks (Wang, Zhang and Chen, AAAI 08, Exceptional Paper)
[Figure: pipeline with steps labeled "Sample" and "LTAB Algo"; the networks shown are labeled sparse or dense]
![Page 31: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/31.jpg)
Multidimensional Clustering
Paper: T. Chen, N. L. Zhang, T. F. Liu, Y. Wang, L. K. M. Poon (2011). Model-based multidimensional clustering of categorical data. Artificial Intelligence, 176(1): 2246-2269.
Cluster Analysis
Grouping of objects into clusters so that objects in the same cluster are similar in some sense
![Page 32: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/32.jpg)
How to Cluster Those?
![Page 33: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/33.jpg)
How to Cluster Those?
Style of picture
![Page 34: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/34.jpg)
How to Cluster Those?
Type of object in picture
![Page 35: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/35.jpg)
How to Cluster Those?
Multidimensional clustering / Multi-Clustering
How to partition data in multiple ways?
Latent tree models
![Page 36: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/36.jpg)
Latent Tree Models & Multidimensional Clustering
Model relationship between
Observed / manifest variables: Math Grade, Science Grade, Literature Grade, History Grade
Latent variables: Analytic Skill, Literal Skill, Intelligence
Each latent variable gives a partition
Intelligence: low, medium, high
Analytic Skill: low, medium, high
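The following sketch shows how each latent variable can induce its own partition of the same records. It assumes a hypothetical binary LTM (Y1 -> Y2, Y1 -> Y3, Y2 -> {X1, X2}, Y3 -> {X3, X4}) with made-up parameters, computes the posterior of each latent variable by brute-force enumeration, and assigns each record to its most probable state along each facet.

```python
from itertools import product

# Hypothetical LTM: Y1 -> Y2, Y1 -> Y3, Y2 -> {X1, X2}, Y3 -> {X3, X4}; all binary.
P_Y1 = [0.5, 0.5]
P_Y2_Y1 = [[0.8, 0.2], [0.3, 0.7]]
P_Y3_Y1 = [[0.7, 0.3], [0.4, 0.6]]
P_X_Y = [[0.9, 0.1], [0.2, 0.8]]   # shared P(X | parent) for every leaf

def joint(y1, y2, y3, xs):
    p = P_Y1[y1] * P_Y2_Y1[y1][y2] * P_Y3_Y1[y1][y3]
    for parent, x in zip((y2, y2, y3, y3), xs):
        p *= P_X_Y[parent][x]
    return p

def posterior(latent_index, xs):
    """P(Y_latent | xs) by brute-force enumeration (fine for tiny models)."""
    weights = [0.0, 0.0]
    for ys in product((0, 1), repeat=3):
        weights[ys[latent_index]] += joint(*ys, xs)
    total = sum(weights)
    return [w / total for w in weights]

def partitions(xs):
    """Hard cluster label of one record along the Y2 facet and the Y3 facet."""
    labels = []
    for latent_index in (1, 2):
        post = posterior(latent_index, xs)
        labels.append(post.index(max(post)))
    return tuple(labels)

labels_a = partitions((0, 0, 1, 1))
labels_b = partitions((1, 1, 0, 0))
```

The same record gets one label per latent variable, so the data set is partitioned in several ways at once rather than along a single dimension.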
![Page 37: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/37.jpg)
ICAC Data
// 31 variables, 1200 samples
C_City: s0 s1 s2 s3 // very common, quite common, uncommon, ..
C_Gov: s0 s1 s2 s3
C_Bus: s0 s1 s2 s3
Tolerance_C_Gov: s0 s1 s2 s3 //totally intolerable, intolerable, tolerable,...
Tolerance_C_Bus: s0 s1 s2 s3
WillingReport_C: s0 s1 s2 // yes, no, depends
LeaveContactInfo: s0 s1 // yes, no
I_EncourageReport: s0 s1 s2 s3 s4 // very sufficient, sufficient, average, ...
I_Effectiveness: s0 s1 s2 s3 s4 //very e, e, a, in-e, very in-e
I_Deterrence: s0 s1 s2 s3 s4 // very sufficient, sufficient, average, ...
…..
-1 -1 -1 0 0 -1 -1 -1 -1 -1 -1 0 -1 -1 -1 0 1 1 -1 -1 2 0 2 2 1 3 1 1 4 1 0 1.0
-1 -1 -1 0 0 -1 -1 1 1 -1 -1 0 0 -1 1 -1 1 3 2 2 0 0 0 2 1 2 0 0 2 1 0 1.0
-1 -1 -1 0 0 -1 -1 2 1 2 0 0 0 2 -1 -1 1 1 1 0 2 0 1 2 -1 2 0 1 2 1 0 1.0
….
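A minimal sketch of reading one of the data rows above. It assumes, without confirmation from the slide, that the first 31 fields are state indices for the 31 variables, that -1 marks a missing answer, and that the last field is a record weight. The function name `parse_record` is hypothetical.

```python
def parse_record(line, n_variables=31, missing_code=-1):
    """Split one data row into state indices (None if missing) and a weight.
    Assumes the last field is a record weight and -1 marks a missing answer."""
    fields = line.split()
    if len(fields) != n_variables + 1:
        raise ValueError(f"expected {n_variables + 1} fields, got {len(fields)}")
    states = [None if int(tok) == missing_code else int(tok)
              for tok in fields[:n_variables]]
    weight = float(fields[-1])
    return states, weight

# First data row shown on the slide.
row = ("-1 -1 -1 0 0 -1 -1 -1 -1 -1 -1 0 -1 -1 -1 0 1 1 -1 -1 "
       "2 0 2 2 1 3 1 1 4 1 0 1.0")
states, weight = parse_record(row)
n_missing = sum(s is None for s in states)
```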
![Page 38: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/38.jpg)
Latent Structure Discovery
Y2: Demographic info; Y3: Tolerance toward corruption
Y4: ICAC performance; Y7: ICAC accountability
Y5: Change in level of corruption; Y6: Level of corruption
![Page 39: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/39.jpg)
Multidimensional Clustering
Y2=s0: Low income youngsters; Y2=s1: Women with no/low income
Y2=s2: people with good education and good income;
Y2=s3: people with poor education and average income
![Page 40: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/40.jpg)
Multidimensional Clustering
Y3=s0: people who find corruption totally intolerable; 57%
Y3=s1: people who find corruption intolerable; 27%
Y3=s2: people who find corruption tolerable; 15%
Interesting finding:
Y3=s2: 29+19=48% find C-Gov totally intolerable or intolerable; 5% for C-Bus
Y3=s1: 54% find C-Gov totally intolerable; 2% for C-Bus
Y3=s0: Same attitude toward C-Gov and C-Bus
People who are tough on corruption are equally tough toward C-Gov and C-Bus.
People who are relaxed about corruption are more relaxed toward C-Bus than C-Gov
![Page 41: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/41.jpg)
Multidimensional Clustering
Interesting finding: Relationship between background and tolerance toward corruption
Y2=s2: (good education and good income) the least tolerant. 4% tolerable
Y2=s3: (poor education and average income) the most tolerant. 32% tolerable
The other two classes are in between.
![Page 42: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/42.jpg)
Marketing Data
![Page 43: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/43.jpg)
Latent Tree Analysis of Text Data
The WebKB Data Set
1041 web pages collected from 4 CS departments in 1997
336 words
![Page 44: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/44.jpg)
Latent Tree Model for WebKB Data by BI Algorithm
89 latent variables
![Page 45: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/45.jpg)
Latent Tree Models for WebKB Data
![Page 46: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/46.jpg)
![Page 47: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/47.jpg)
![Page 48: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/48.jpg)
LTM for Topic Detection
Topic
A latent state
A collection of documents
A document can belong 100% to multiple topics
![Page 49: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/49.jpg)
LTM vs LDA for Topic Detection
LTM
Topic: A latent state
A collection of documents
A document can belong 100% to multiple topics
LDA
Topic: Distribution over the entire vocabulary.
The probabilities of the words add to one.
Document:
Distribution over topics.
If a document contains more of one topic, then it contains less of other topics.
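The contrast can be illustrated with two toy document representations. This is only a sketch with made-up numbers and hypothetical topic names: in the LDA-style view the topic proportions of a document sum to one, while in the LTM-style view each topic has its own membership probability, so several can be close to 100% at once.

```python
# LDA-style document: topic proportions live on a simplex.
lda_doc = {"sports": 0.6, "politics": 0.3, "finance": 0.1}   # made-up numbers
lda_total = sum(lda_doc.values())   # sums to 1: more of one topic means less of others

# LTM-style document: one membership probability per latent variable,
# each in [0, 1] and not tied to the others.
ltm_doc = {"sports": 0.98, "politics": 0.95, "finance": 0.02}  # made-up numbers
ltm_topics = [t for t, p in ltm_doc.items() if p > 0.5]
```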
![Page 50: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/50.jpg)
Latent Tree Analysis Summary
Finds meaningful facets of data
Identifies natural clusters along each facet.
Gives clear picture of what is in data.
![Page 51: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/51.jpg)
LTM for Spectral Clustering
Original Data Set
Eigenvectors of Laplacian Matrix
Rounding: Eigenvectors to final partition
![Page 52: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/52.jpg)
LTM for Spectral Clustering
Rounding:
Determine number of clusters
Determine the final partition
No good method available
LTM Method:
![Page 53: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/53.jpg)
![Page 54: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/54.jpg)
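LTM and TCM
Papers:
N. L. Zhang, S. H. Yuan, T. Chen and Y. Wang (2008). Latent tree models and diagnosis in traditional Chinese medicine. Artificial Intelligence in Medicine, 42: 229-245. (Took 8 years.)
N. L. Zhang, S. H. Yuan, T. Chen and Y. Wang (2008). Statistical validation of TCM theories. Journal of Alternative and Complementary Medicine, 14(5): 583-7. (Featured at TCM Wiki.)
Zhang Lianwen, Yuan Shihong, Wang Tianfang, Zhao Yan, et al. (2011). Latent structure analysis and syndrome-differentiation subtyping of Western-medicine diseases (I): Basic principles. World Science and Technology: Modernization of Traditional Chinese Medicine, 13(3): 498-502. (In Chinese.)
Zhang Lianwen, Xu Zhaoxia, Wang Yiqin, Liu Tengfei, et al. (2012). Latent structure analysis and syndrome-differentiation subtyping of Western-medicine diseases (II): Joint clustering. World Science and Technology: Modernization of Traditional Chinese Medicine, 14(2). (In Chinese.)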
![Page 55: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/55.jpg)
LTM and TCM: Objectives
Statistical validation of TCM postulates
[Review of a recent paper]
I am very interested in what these authors are trying to do. They are dealing
with an important epistemological problem.
To go from the many symptoms and signs that patients present, to construct a
consistent and other-observer identifiable constellation, is a core task of the medical
practitioner. A kind of feedback occurs between what a practitioner is taught/finds listed
in books, and what that practitioner encounters in the clinic. The better the constellation
is understood, the more accurate the clustering of symptoms, the more consistent is the
identification of syndromes among practitioners and through time. While these
constellations have been worked into widely-accepted ‘disease constructs’ for
biomedicine for some time which are widely accepted as ‘real,’ this is not quite as true
for TCM constellations. This latent variable study is interesting not only in itself, but also
as providing evidence that what TCM ‘says’ is so, shows up during analysis as
demonstrably so.
![Page 56: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/56.jpg)
LTM and TCM: Objectives
TCM postulates explain occurrence of symptoms:
When KIDNEY YANG is in deficiency, it cannot warm the body and the patient feels cold, resulting in intolerance to cold, cold limbs, …
Manifest variables: directly observed (feel cold, cold limbs)
Latent variable: not directly observed (Kidney Yang deficiency)
Latent structure: relationships between latent variables and manifest variables
Statistical validation of TCM postulates
![Page 57: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/57.jpg)
Latent Tree Analysis of Symptom Data
Similar to WebKB data
Web page containing words
Patient having symptoms
What will be the result of latent tree analysis?
Different facets of data revealed
Natural clusters along each facet identified
Each facet involves a few symptoms
May correspond to a syndrome
Providing validation to TCM postulates
Providing evidence for syndrome differentiation
![Page 58: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/58.jpg)
Latent Tree Model for Kidney Data
Latent structure matches relevant TCM postulate
Providing validation to TCM postulate
![Page 59: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/59.jpg)
Latent Tree Model for Kidney Data
Work reported in N. L. Zhang, S. H. Yuan, T. Chen and Y. Wang (2008). Latent tree models and diagnosis in traditional Chinese medicine. Artificial Intelligence in Medicine, 42: 229-245.
Email from:
Bridie Andrews, Bentley University, Boston
Dominique Haughton, Bentley University, Fellow of the American Statistical Association
Lisa Conboy, Harvard Medical School
“We are very interested in your paper on “Latent tree models and diagnosis in traditional Chinese medicine”, and are planning to repeat your method using some data we have here on about 270 cases of “irritable bowel syndrome” and their differing TCM diagnoses.”
![Page 60: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/60.jpg)
Results on Many Data Sets from 973 Project
![Page 61: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/61.jpg)
Providing Evidence of Syndrome Differentiation
![Page 62: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/62.jpg)
Providing Evidence of Syndrome Differentiation
How to produce evidence for TCM syndrome diagnosis using latent structure analysis?
![Page 63: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/63.jpg)
Providing Evidence of Syndrome Differentiation
Imagine sub-typing a Western medicine (WM) disease D from the TCM perspective
Expected conclusion: several syndromes among D patients
Also providing a basis for distinguishing syndrome Z patients from other D patients
![Page 64: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/64.jpg)
![Page 65: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/65.jpg)
Example: Model for Depression Data
![Page 66: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/66.jpg)
Example: Model for Depression Data
Evidence provided by Y8 for syndrome classification:
Two classes: with impaired qi movement in the chest and diaphragm (有胸膈气机不畅), without it (无胸膈气机不畅)
Sizes of the classes: 48%, 52%
Symptoms important for distinguishing between the two classes (descending order of importance): suffocating sensation (憋气), shortness of breath (气短), chest tightness (胸闷) and sighing (太息). Others play little role.
![Page 67: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/67.jpg)
Latent Tree Analysis of Prescription Data
Data: Guang'anmen Hospital
1287 formulae prescribed for patients with Disharmony between Liver and Spleen
![Page 68: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/68.jpg)
Latent Tree Model
![Page 69: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/69.jpg)
Some Partitions Obtained
![Page 70: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/70.jpg)
![Page 71: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/71.jpg)
Pouch Latent Tree Models (PLTMs)
Probabilistic graphical model with continuous observed variables (X’s) and discrete latent variables (Y’s).
Tree-structured Bayesian network, except that several observed variables can appear in the same node, called a pouch.
![Page 72: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/72.jpg)
Pouch Latent Tree Models (PLTM)
One possible PLTM for the transcript data
PLTM generalizes the Gaussian mixture model (GMM), which is a PLTM with a single pouch and a single latent variable
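The special case can be written down directly. The sketch below evaluates the density of a PLTM with one latent variable and one pouch holding two continuous variables, which is a two-dimensional Gaussian mixture with a full covariance matrix per component. The function names and all parameter values are assumptions for illustration.

```python
import math

def gaussian2d_pdf(x, mean, cov):
    """Density of a bivariate normal with a full 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form dx^T cov^{-1} dx using the closed-form 2x2 inverse.
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def single_pouch_density(x, p_y, means, covs):
    """p(x) = sum_y P(y) * N(x | mean_y, cov_y): one latent variable, one pouch."""
    return sum(p * gaussian2d_pdf(x, m, s) for p, m, s in zip(p_y, means, covs))

# Made-up two-component example.
p_y = [0.5, 0.5]
means = [(0.0, 0.0), (3.0, 3.0)]
covs = [((1.0, 0.0), (0.0, 1.0)), ((1.0, 0.5), (0.5, 2.0))]
density_at_origin = single_pouch_density((0.0, 0.0), p_y, means, covs)
```

A general PLTM adds more latent variables and more pouches, so different groups of attributes can be clustered along different latent variables.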
![Page 73: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/73.jpg)
The UCI Image Data
Each instance represents a 3x3 pixel region of an image, with 18 attributes, labeled.
Class labels were first removed and the remaining data analyzed using PLTM.
Pouches capture natural facets well. From left to right: Line-Density, Edge, Color, Coordinates
Latent variables represent clusterings
![Page 74: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/74.jpg)
The UCI Image Data
Feature curve: normalized MI between a latent variable and the attributes
Y1 matches the true class partition well.
Y1 represents a partition along the color facet
Y2 represents a partition along the line-density facet
Y3: partition based on edge and color
Y4: partition based on centroid.col
Interesting finding: Y1 strongly correlated with centroid.row
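A feature curve plots one normalized mutual information (MI) value per attribute. The sketch below computes such a value for two discrete variables from their joint probability table. The slide does not say which normalization was used, so dividing by the geometric mean of the two entropies is an assumption, and the attributes in the image data are continuous, so this discrete version is only an illustration.

```python
import math

def normalized_mi(joint):
    """NMI of two discrete variables from their joint table joint[y][x].
    Normalization by sqrt(H(Y) * H(X)) is an assumption made here."""
    p_y = [sum(row) for row in joint]
    p_x = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log(p / (p_y[i] * p_x[j]))

    def entropy(dist):
        return -sum(p * math.log(p) for p in dist if p > 0.0)

    return mi / math.sqrt(entropy(p_y) * entropy(p_x))

independent = normalized_mi([[0.25, 0.25], [0.25, 0.25]])   # no shared information
identical = normalized_mi([[0.5, 0.0], [0.0, 0.5]])         # attribute determines the latent state
```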
![Page 75: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/75.jpg)
![Page 76: CCF 贝叶斯网络在中国的应用和发展学术沙龙](https://reader036.vdocuments.net/reader036/viewer/2022081415/56814664550346895db3843e/html5/thumbnails/76.jpg)
Thank you!