

Relational Random Regression for Bayes Nets

Oliver Schulte, Hassan Khosravi, Tianxiang Gao and Yuke Zhu
School of Computing Science, Simon Fraser University, Vancouver, Canada
Project Website: http://www.cs.sfu.ca/~oschulte/jbn/

Random Regression: Example

Target Query: P(gender(sam) = F)? Sam is friends with Bob and Anna. The unnormalized probability is computed in the Closed Form panel below.

Bayes Net Inference: The Cyclicity Problem

In the presence of recursive dependencies (autocorrelations), a ground first-order Bayes net template may contain cycles (Schulte et al. 2012, Domingos and Richardson 2007).
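A toy grounding that makes the cycle concrete (our own sketch; the predicates and individuals follow the running example, where friendship is symmetric):

```python
# Template edge: gender(X) -> gender(Y) whenever Friend(X,Y) holds.
# Grounding it over symmetric friendships yields two-cycles in the
# ground graph, e.g. gender(sam) <-> gender(anna).
friends = [("sam", "anna"), ("anna", "sam"),
           ("sam", "bob"), ("bob", "sam")]
edges = {(f"gender({x})", f"gender({y})") for x, y in friends}
two_cycles = sorted((u, v) for (u, v) in edges if (v, u) in edges and u < v)
print(two_cycles)
# [('gender(anna)', 'gender(sam)'), ('gender(bob)', 'gender(sam)')]
```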

Overview

How to define Bayes net relational inference with ground cycles?
1. Define Markov blanket probabilities: the probability of a target node value given its Markov blanket.
2. Random regression: compute the expected probability over a random instantiation of the Markov blanket.
3. Closed-form result: equivalent to a log-linear regression model that uses Markov blanket feature frequencies rather than counts.
4. Random regression works well with Bayes net parameters, giving very fast parameter estimation.

References

1. H. Khosravi, O. Schulte, T. Man, X. Xu, and B. Bina. Structure learning for Markov logic networks with many descriptive attributes. In AAAI, 2010, pp. 487–493.
2. O. Schulte and H. Khosravi. Learning graphical models for relational data via lattice search. Machine Learning, 88(3):331–368, 2012.
3. O. Schulte, H. Khosravi, and T. Man. Learning directed relational models with recursive dependencies. Machine Learning, 2012, forthcoming.
4. P. Domingos and M. Richardson. Markov logic: A unifying framework for statistical relational learning. In Statistical Relational Learning, 2007.

Evaluation

• Use the Learn-and-Join Algorithm for Bayes net structure learning (Khosravi et al. 2010, Schulte and Khosravi 2012).
• MBN: Convert the Bayes net to a Markov net, use Alchemy to learn weights in a log-linear model with Markov blanket feature counts.
• CP+Count: Use log-conditional probabilities as weights in a log-linear model with Markov blanket feature counts.
• CP+Frequency: Use log-conditional probabilities as weights in a log-linear model with Markov blanket feature frequencies (= random regression).

[Charts: Learning Times and Conditional Log-likelihood per method, with a Quick Summary Plot of average performance over the databases. Smaller learning time is better; bigger CLL is better.]

Conclusion

• Random regression: a principled way to define relational Bayes net inferences even with ground cycles.
• Closed-form evaluation: a log-linear model with feature frequencies.
• Bayes net conditional probabilities are fast to compute, interpretable, and local.
• Using feature frequencies rather than counts addresses the balancing problem: in the count model, features with more groundings carry exponentially more weight (see the sketch below).
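A toy numeric illustration of the balancing problem (our own sketch, not from the poster; the 0.6 parameter is the friend-feature probability from the running example):

```python
from math import exp, log

w = log(0.6)  # log-conditional-probability weight of one friend feature

for n in (1, 2, 5, 10):        # number of groundings of that feature
    count_score = exp(n * w)    # count model: effective weight grows with n
    freq_score = exp(1.0 * w)   # frequency model: the family's total mass stays 1
    print(n, round(count_score, 4), round(freq_score, 4))
# The count score decays exponentially in n; the frequency score is stable.
```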

Closed Form

Proposition: The random regression value can be obtained by multiplying the probability associated with each Markov blanket state, raised to the frequency of the state. Example:

P(g(sam) = F | mb)
  = α · P(cd(sam) = T | g(sam) = F)
      × [P(g(sam) = F | g(anna) = F, Fr(sam,anna) = T)
       × P(g(sam) = F | g(bob) = M, Fr(sam,bob) = T)]^(1/2)
  = 70% × [60% × 40%]^(1/2) ≈ 0.34 = e^(-1.07)
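A minimal Python sketch of this closed form (the function name and the (probability, frequency) data layout are our own; the numbers come from the example above):

```python
from math import exp, log

def random_regression(factors):
    """Closed-form random regression: multiply each Markov blanket
    state's conditional probability raised to that state's frequency.
    `factors` is a list of (probability, frequency) pairs."""
    return exp(sum(freq * log(p) for p, freq in factors))

# P(gender(sam) = F | mb), unnormalized: one coffee_dr factor with
# frequency 1, plus two friend factors that ground the same
# first-order family, so each carries frequency 1/2.
score = random_regression([(0.70, 1.0), (0.60, 0.5), (0.40, 0.5)])
print(round(score, 2), round(log(score), 2))  # 0.34 -1.07
```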

Relational regression in graphical models

• Bayes net → dependency net, using the geometric mean as the combining rule = log-linear model with frequencies = random regression.
• Bayes net → Markov net, using standard Markov network regression = log-linear model with counts. Example:

P(g(sam) = F | mb) = α · 70% × 60% × 40% = 0.168
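Both scoring rules fit one log-linear template; a sketch under the same assumptions as above (the helper is our own, with weights set to log conditional probabilities as in CP+Count and CP+Frequency):

```python
from math import exp, log

def log_linear(weights, features):
    """Log-linear score: exp(sum_i weight_i * feature_value_i)."""
    return exp(sum(w * x for w, x in zip(weights, features)))

weights = [log(p) for p in (0.70, 0.60, 0.40)]  # log conditional probabilities

counts = [1, 1, 1]       # Markov net regression: one unit per grounding
freqs = [1.0, 0.5, 0.5]  # random regression: a family's groundings share mass 1

print(round(log_linear(weights, counts), 3))  # 0.168 (count model)
print(round(log_linear(weights, freqs), 2))   # 0.34  (frequency model)
```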

Methods Compared

[Diagram: a first-order Bayes net template over coffee_dr(X), Friend(X,Y), gender(X), gender(Y) is converted either to a Markov net, combined by product, giving the log-linear model with counts, or to a dependency net, combined by geometric mean, giving the log-linear model with frequencies (random regression). The regression graph for the target query grounds the template to coffee_dr(sam), Friend(sam,Y), gender(sam), gender(Y).]

Conditional probability parameters:

P(g(X) = F | g(Y) = F, F(X,Y) = T) = .6
P(g(X) = M | g(Y) = M, F(X,Y) = T) = .6
...
P(cd(X) = T | g(X) = F) = .7
P(cd(X) = T | g(X) = M) = .3

People table:

Name | Gender | Coffee Drinker
Anna | F      | T
Sam  | ?      | F
Bob  | M      | F
