![Page 1: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/1.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Unsupervised Learning of Probabilistic Context-Free Grammar
Using Iterative Biclustering
Kewei Tu and Vasant HonavarArtificial Intelligence Research Laboratory
Department of Computer ScienceIowa State University
www.cs.iastate.edu/~honavar/aigroup.htmlwww.cild.iastate.edu
![Page 2: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/2.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Unsupervised Learning of Probabilistic Context-Free Grammar
• Greedy search to maximize the posterior of the grammar given the corpus
• Iterative (distributional) biclustering• Competitive experimental results
![Page 3: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/3.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Outline
• Introduction• Probabilistic Context Free Grammars (PCFG)• The Algorithm based on Iterative Biclustering (PCFG-BCL)• Experimental results
![Page 4: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/4.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Motivation
• Probabilistic Context-Free Grammar (PCFG) find applications in many areas including:• Natural Language Processing• Bioinformatics
• Important to learn PCFG from data (training corpus)• Labeled corpus not always available• Hence the need for unsupervised learning
![Page 5: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/5.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Task
• Unsupervised learning of a PCFG from a positive corpus
a square is above the triangle the square rolls a triangle rolls the square rolls a triangle is above the square a circle touches a square the triangle covers the circle ……
S NP VPNP Det NVP Vt NP (0.3) | Vi PP (0.2) | rolls (0.2) | bounces (0.1)……
![Page 6: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/6.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
PCFG
• Context-free Grammar (CFG)• G = (N, Σ, R, S)
• N: non-terminals• Σ: terminals• R: rules• SN : the start symbol
• Probabilistic CFG• Probabilities on grammar rules
![Page 7: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/7.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
P-CNF
• Probabilistic Chomsky normal form (P-CNF)• Two types of rules:
• ABC• Aa
![Page 8: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/8.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
The AND-OR form
• P-CNF in the AND-OR form• Two types of non-terminals: AND, OR
• AND OR1 OR2• OR A1 | A2 | a1 | a2 | ……
• with probabilities
![Page 9: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/9.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
The AND-OR form
• P-CNF in the AND-OR form
![Page 10: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/10.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
The AND-OR form
• P-CNF in the AND-OR form can be divided into two parts• Start rules
• S…
• A set of AND-OR groups• Each group: AND OR1 OR2• Bijection between ANDs and groups• An OR may appear in multiple groups
![Page 11: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/11.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
The AND-OR form
• P-CNF in the AND-OR form can be divided into two parts
![Page 12: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/12.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Outline
• Introduction• Probabilistic Context Free Grammars (PCFG)• The Algorithm based on Iterative Biclustering (PCFG-
BCL)• Experimental results
![Page 13: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/13.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
PCFG-BCL: Outline
• Start with only the terminals• Repeat the two steps
• Learn a new AND-OR group by biclustering
• Attach the new AND to existing ORs
• Post-processing: add start rules
• In principle, these steps are sufficient for learning any CNF grammar
![Page 14: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/14.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
PCFG-BCL: Outline
• Find new rules that yield the greatest increase in the posterior of the grammar given the corpus
• Local search, with the posterior as the objective function• Use a prior that favors simpler grammars to avoid
overfitting
![Page 15: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/15.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
PCFG-BCL
• Repeat the two steps• Learn a new AND-OR group by biclustering
• Attach the new AND to existing ORs
• Post-processing: add start rules
![Page 16: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/16.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Intuition
• Construct a table T• Index the rows and columns by symbols appearing in the
corpus
• The cell at row x and column y records the number of times the pair xy appears in the corpus
![Page 17: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/17.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
An AND-OR group corresponds to a bicluster
![Page 18: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/18.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
The bicluster is multiplicatively coherent
for any two rows i,j and two columns k,l
![Page 19: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/19.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Expression-context matrix of a bicluster
• Each row: a symbol pair contained in the bicluster• Each column: a context in which the symbol pairs appear
in the corpus
It’s also multiplicatively coherent.
![Page 20: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/20.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Intuition
• If there’s a bicluster that is multiplicatively coherent and has a multiplicatively coherent expression-context matrix
• Then an AND-OR group can be learned from the bicluster
![Page 21: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/21.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Probabilistic Justification
• Change in likelihood as a result of adding an AND-OR group to a PCFG
Bicluster multiplicative coherence
Expression-context matrix multiplicative coherence
![Page 22: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/22.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Prior
• To prevent overfitting, use a prior that favors simpler grammars
• P(G) 2DL(G)
• DL(G) is the description length of the grammar
![Page 23: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/23.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Learning a new AND-OR group by biclustering
• find in the table T a bicluster that leads to the maximal posterior gain
• create a new AND-OR group from the bicluster• reduce the corpus using the new rules
• E.g., “the circle” is rewritten to the new AND symbol• update T
• A new row and column are added for the new AND symbol
![Page 24: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/24.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
PCFG-BCL
• Repeat the two steps• Learn a new AND-OR group by biclustering
• Attach the new AND to existing ORs
• Post-processing: add start rules
![Page 25: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/25.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Attaching the new AND under existing ORs
• For the new AND symbol N …• There may exist OR symbols in the learned grammar, s.t.
ON is in the target grammar
• Such rules can't be learned in the biclustering step• When learning O, N doesn’t exist• When learning N, only learn NAB
• We need an additional step to find such rules• Recursion is learned in this step
![Page 26: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/26.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Intuition
• Adding rule ON
= adding a new row/column to the bicluster• If ON is true, then
• the expanded bicluster is multiplicatively coherent
• the expanded expression-context matrix is multiplicatively coherent
• If we find an OR symbol s.t. the expanded bicluster has this property
• Then a new rule ON can be added to the grammar
![Page 27: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/27.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Probabilistic Justification
• Likelihood gain
is an approximation of the expanded bicluster
• To prevent overfitting, the prior is also considered
![Page 28: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/28.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Attaching the new AND under existing ORs
• Try to find OR symbols that lead to large posterior gain• When found
• add the new rule ON to the grammar
• do a maximal reduction of the corpus
• update the table T
![Page 29: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/29.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
PCFG-BCL
• Repeat the two steps• Learn a new AND-OR group by biclustering
• Attach the new AND to existing ORs
• Post-processing: add start rules
![Page 30: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/30.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Postprocessing
• For each sentence in the corpus:• If it’s fully reduced to a single symbol x, then add Sx
• If not, a few options…
• Return the grammar
![Page 31: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/31.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Outline
• Introduction• Probabilistic Context Free Grammars (PCFG)• The Algorithm based on Iterative Biclustering (PCFG-
BCL)• Experimental results
![Page 32: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/32.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Experiments
• Measurements• weak generative capacity
• precision, recall, F-score
• Test data• artificial, English-like CFGs
![Page 33: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/33.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Experiment results
P=Precision, R=Recall, F=F-score
Number in the parentheses: standard deviation
• PCFG-BCL outperforms EMILE and ADIOS• with lower standard deviations
[Adriaans, et al., 2000] [Solan, et al., 2005]
![Page 34: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/34.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Summary
• An unsupervised PCFG-learning algorithm • It acquires new grammar rules by iterative biclustering on a
table of symbol pairs
• In each step it tries to maximize the increase of the posterior of the grammar
• Competitive experimental results
![Page 35: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/35.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Work in progress
• Alternative strategies for optimizing the objective function• Evaluation on and adaptation to real world applications
(e.g., natural language), wrt. both weak and strong generative capacity
![Page 36: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/36.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Thank you~
![Page 37: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/37.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Backup…
![Page 38: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/38.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Step 1
Bicluster multiplicative coherence
E-C matrix multiplicative coherence
Prior gain (bias towards large BC)
Likelihood Gain
Posterior gain:
![Page 39: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/39.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Step 2 Intuition
• Remember O is learned by extracting a bicluster• adding rule ON
= adding a new row/column to the bicluster
![Page 40: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/40.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Expanding the bicluster
The expanded bicluster should still be multiplicatively coherent
![Page 41: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/41.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Step 2 Intuition
• Expression-context matrix• adding rule ON
= adding a set of new rows to the E-C matrix
![Page 42: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/42.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Expanding the expression-context matrix
The expanded expression-context matrix should still be multiplicatively coherent.
![Page 43: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/43.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Step 2
• Likelihood gain:
: the expected numbers of appearance of the symbol pairs when applying the current grammar to expand the current partially reduced corpus.
![Page 44: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/44.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Grammar selection/averaging
• Run the algorithm for multiple times to get multiple grammars
• Use the posterior of the grammars to do model selection/averaging
• Experimental results:• Improved the performance
• Decreased the standard deviations
![Page 45: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/45.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Time Complexity
• N: # of ANDs• k: average # of rules headed by an OR• c: average column# of Expr-Cont Matrix• h: average # of ORs that produce an AND or terminal• d: a recursion depth limit• ω: sentence# in the corpus• m: average sentence length
)( 2122 mchckNO d
![Page 46: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/46.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
biclustering vs. distributional clustering
V1 makes | likesV2 likes | is
Figure from [Adriaans, et al., 2000]
![Page 47: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/47.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
biclustering vs. substitutability heuristic
N1 tea | coffeeN2 eating
Figure from [Adriaans, et al., 2000]
![Page 48: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/48.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
![Page 49: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/49.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
A set of multiplicatively coherent biclusters, which represent a set of AND-OR groups in the grammar.
![Page 50: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/50.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Related work
• Unsupervised CFG learning• EMILE [Adriaans et al., 2000]
• ABL [Zaanen, 2000]
• [Clark, 2001; 2007]
• ADIOS [Solan et al., 2005]
• Main difference• Distributional biclustering
• A unified method for different types of rules
![Page 51: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/51.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Related work
• Unsupervised PCFG learning• Inside-outside
• [Stolcke&Omohundro, 1994]
• [Chen 1995]
• [Kurihara&Sato, 2004; 2006]
• [Liang et al., 2007]
• Main difference• Different prior
• Structure search method
![Page 52: Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering](https://reader033.vdocuments.net/reader033/viewer/2022051401/56812cd8550346895d9194a1/html5/thumbnails/52.jpg)
Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar
Iowa State UniversityDepartment of Computer Science, Iowa State UniversityArtificial Intelligence Research Laboratory
Center for Computational Intelligence, Learning, and Discovery CCILD
Related work
• Unsupervised parsing (not CFG)• [Klein&Manning, 2002; 2004]
• U-DOP [Bod, 2006]