Practical Probabilistic Relational Learning Sriraam Natarajan


Page 1: Practical Probabilistic Relational Learning Sriraam Natarajan

Practical Probabilistic Relational Learning

Sriraam Natarajan

Page 2: Practical Probabilistic Relational Learning Sriraam Natarajan

Take-Away Message

Learn from rich, highly structured data!

Page 3: Practical Probabilistic Relational Learning Sriraam Natarajan

Traditional Learning

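Data is i.i.d.

Attributes (features): B = Burglary, E = Earthquake, A = Alarm, M = MaryCalls, J = JohnCalls

Data:

B E A M J
1 0 1 1 0
0 0 0 0 1
. . .
0 1 1 0 1

+

[Figure: Bayesian network over the nodes Earthquake, Burglary, Alarm, MaryCalls, and JohnCalls.]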

Page 4: Practical Probabilistic Relational Learning Sriraam Natarajan

Learning

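[Figure: Bayesian network over Earthquake, Burglary, Alarm, MaryCalls, and JohnCalls, each node annotated with its learned conditional probability table. Probability pairs as extracted (their assignment to nodes is not recoverable from the transcript): (0.08, 0.92), (0.01, 0.99), (0.1, 0.9), (0.55, 0.45), (0.6, 0.4), (0.95, 0.05), (0.3, 0.7), (0.8, 0.2), (0.1, 0.9), (0.9, 0.1).]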
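As a rough illustration of what "learning" means for the alarm network, the sketch below estimates one conditional probability table, P(A | B, E), by counting rows of i.i.d. data. The rows, the helper name `estimate_cpt`, and the Laplace smoothing are assumptions made for this example; they are not the data or the procedure behind the numbers on the slide.

```python
from collections import Counter
from itertools import product

# Hypothetical i.i.d. rows over (B, E, A, M, J); not the slide's data set.
rows = [
    (1, 0, 1, 1, 0),
    (0, 0, 0, 0, 1),
    (0, 1, 1, 0, 1),
    (0, 0, 0, 0, 0),
    (1, 0, 1, 1, 1),
    (0, 0, 0, 1, 0),
]

B, E, A, M, J = range(5)  # column indices


def estimate_cpt(rows, child, parents, alpha=1.0):
    """Estimate P(child=1 | parents) by counting, with Laplace smoothing alpha."""
    ones = Counter()    # rows where the child is 1, per parent configuration
    totals = Counter()  # all rows, per parent configuration
    for row in rows:
        key = tuple(row[p] for p in parents)
        totals[key] += 1
        ones[key] += row[child]
    return {
        key: (ones[key] + alpha) / (totals[key] + 2 * alpha)
        for key in product((0, 1), repeat=len(parents))
    }


cpt_alarm = estimate_cpt(rows, child=A, parents=(B, E))
for (b, e), p in sorted(cpt_alarm.items()):
    print(f"P(A=1 | B={b}, E={e}) = {p:.2f}")
```

Because every row is an independent sample with the same fixed set of attributes, each table can be filled in by simple counting. That assumption is what breaks down in the relational setting discussed next.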

Page 5: Practical Probabilistic Relational Learning Sriraam Natarajan

Real-World Problem: Predicting Adverse Drug Reactions

Patient Table

PatientID  Gender  Birthdate
P1         M       3/22/63

Visit Table

PatientID  Date    Physician  Symptoms      Diagnosis
P1         1/1/01  Smith      palpitations  hypoglycemic
P1         2/1/03  Jones      fever, aches  influenza

Lab Tests

PatientID  Date    Lab Test       Result
P1         1/1/01  blood glucose  42
P1         1/9/01  blood glucose  45

SNP Table

PatientID  SNP1  SNP2  …  SNP500K
P1         AA    AB    …  BB
P2         AB    BB    …  AA

Prescriptions

PatientID  Date Prescribed  Date Filled  Physician  Medication  Dose  Duration
P1         5/17/98          5/18/98      Jones      prilosec    10mg  3 months
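One way to see why these tables do not fit a single i.i.d. attribute table is to collect everything recorded about one patient. The sketch below uses only rows shown on this slide, stored as plain Python tuples; the helper name `facts_for` and the table keys are hypothetical.

```python
# Rows copied from the slide; each table is a list of tuples keyed by PatientID.
patients = [("P1", "M", "3/22/63")]
visits = [
    ("P1", "1/1/01", "Smith", "palpitations", "hypoglycemic"),
    ("P1", "2/1/03", "Jones", "fever, aches", "influenza"),
]
lab_tests = [
    ("P1", "1/1/01", "blood glucose", 42),
    ("P1", "1/9/01", "blood glucose", 45),
]
prescriptions = [
    ("P1", "5/17/98", "5/18/98", "Jones", "prilosec", "10mg", "3 months"),
]


def facts_for(patient_id):
    """Collect every row that mentions one patient, grouped by table."""
    tables = {
        "patient": patients,
        "visit": visits,
        "lab_test": lab_tests,
        "prescription": prescriptions,
    }
    return {
        name: [row for row in rows if row[0] == patient_id]
        for name, rows in tables.items()
    }


p1 = facts_for("P1")
# One patient maps to a variable number of rows per table, so there is no
# fixed-length feature vector without aggregating or discarding information.
print({name: len(rows) for name, rows in p1.items()})
```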

Page 6: Practical Probabilistic Relational Learning Sriraam Natarajan

Logic + Probability = Probabilistic Logic aka Statistical Relational Learning Models

Logic → (add probabilities) → Statistical Relational Learning (SRL)

Probabilities → (add relations) → Statistical Relational Learning (SRL)

• Several previous SRL workshops in the past decade
• This year: StaRAI @ AAAI 2013

Page 7: Practical Probabilistic Relational Learning Sriraam Natarajan

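[Figure: cube with three axes: Prop vs. FO, Deterministic vs. Stochastic, No Learning vs. Learning. Its corners are listed below.]

                             Prop                        FO
Deterministic, No Learning   Propositional Logic         First Order Logic
Stochastic, No Learning      Probability Theory          Probabilistic Logic
Deterministic, Learning      Prop Rule Learning          Inductive Logic Programming
Stochastic, Learning         Classical Machine Learning  Statistical Relational Learning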

Page 8: Practical Probabilistic Relational Learning Sriraam Natarajan

Costs and Benefits of the SRL soup

Benefits
• Rich pool of different languages
• Very likely that there is a language that fits your task at hand well
• A lot of research remains to be done ;-)

Costs
• "Learning" SRL is much harder
• Not all frameworks support all kinds of inference and learning settings

How do we actually learn relational models from data?

Page 9: Practical Probabilistic Relational Learning Sriraam Natarajan

Why is this problem hard?

• Non-convex problem
• Repeated search of parameters for every step in induction of the model
• First-order logic allows for different levels of generalization
• Repeated inference for every step of parameter learning
• Inference is #P-complete

How can we scale this?

Page 10: Practical Probabilistic Relational Learning Sriraam Natarajan

Relational Probability Trees

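• Each conditional probability distribution can be learned as a tree
• Leaves are probabilities
• The final model is the set of the RRTs

[Blockeel & De Raedt ’98]

[Figure: relational tree to predict heartAttack(X). Each inner node is a first-order test with a yes branch and a no branch; the branch layout is not recoverable from the transcript.]

Inner-node tests:
male(X)
chol(X,Y,L), Y>40, L>200
diag(X,Hypertension,Z), Z>55
bmi(X,W,55), W>30

Leaf probabilities: 0.8, 0.77, 0.05, 0.3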
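To make the idea of a tree with first-order tests concrete, here is a minimal sketch of how such a tree could be evaluated against a set of ground facts. The branch layout, the assignment of leaf values to branches, the facts, and all function names are assumptions for illustration: the transcript does not preserve the slide's tree structure. The bmi test is left out because only four leaf values survive.

```python
# Hypothetical ground facts; the patients p1, p2, p3 are not from the slides.
facts = {
    ("male", "p1"),
    ("chol", "p1", 45, 230),             # chol(X, Y, L)
    ("diag", "p2", "Hypertension", 60),  # diag(X, Hypertension, Z)
}


def is_male(x):
    # male(X)
    return ("male", x) in facts


def high_chol(x):
    # chol(X, Y, L), Y > 40, L > 200: true if some matching fact exists
    return any(f[0] == "chol" and f[1] == x and f[2] > 40 and f[3] > 200
               for f in facts)


def late_hypertension(x):
    # diag(X, Hypertension, Z), Z > 55
    return any(f[0] == "diag" and f[1] == x and f[2] == "Hypertension"
               and f[3] > 55 for f in facts)


# A node is (test, yes_subtree, no_subtree); a leaf is a probability.
# ASSUMED layout: the slide's actual structure is not recoverable.
tree = (is_male,
        (high_chol, 0.8, 0.77),
        (late_hypertension, 0.3, 0.05))


def predict(node, x):
    """Walk the tree for individual x and return the leaf probability."""
    while not isinstance(node, float):
        test, yes_branch, no_branch = node
        node = yes_branch if test(x) else no_branch
    return node


for person in ("p1", "p2", "p3"):
    print(person, predict(tree, person))
```

Each test is existentially quantified over its free variables (Y, L, Z), which is the main difference from a propositional decision tree, where a test inspects one fixed attribute.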

Page 11: Practical Probabilistic Relational Learning Sriraam Natarajan

Gradient (Tree) Boosting [Friedman Annals of Statistics 29(5):1189-1232, 2001]

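• Model = weighted combination of a large number of small trees (models)
• Intuition: generate an additive model by sequentially fitting small trees to pseudo-residuals from a regression at each iteration…

[Figure: boosting loop. Start from an initial model and compute its predictions on the data. Residuals = Data - Predictions, as given by the loss function. Induce a small tree on the residuals, add it to the model, and iterate. Final model = initial model + all induced trees.]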
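The loop in the figure can be sketched in a few lines. This is a propositional toy version under stated assumptions: squared loss (so the pseudo-residuals are simply data minus predictions), one-split regression stumps in place of relational trees, made-up one-dimensional data, and a hypothetical `shrinkage` parameter. It is not the relational boosting algorithm from the talk.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not right:  # the largest threshold gives no split
            continue
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = (sum((r - lv) ** 2 for r in left)
               + sum((r - rv) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def stump_value(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv


def boost(xs, ys, n_trees=20, shrinkage=0.5):
    """Fit an additive model: initial constant plus a sum of small stumps."""
    initial = sum(ys) / len(ys)
    preds = [initial] * len(xs)
    stumps = []
    for _ in range(n_trees):
        # Pseudo-residuals for squared loss: data minus current predictions.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + shrinkage * stump_value(stump, x)
                 for p, x in zip(preds, xs)]
    return initial, shrinkage, stumps


def predict(model, x):
    initial, shrinkage, stumps = model
    return initial + shrinkage * sum(stump_value(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = list(range(10))             # toy inputs (assumed)
ys = [x * x for x in xs]         # toy targets (assumed)
baseline = (sum(ys) / len(ys), 0.5, [])  # initial model only
model = boost(xs, ys)
print(mse(baseline, xs, ys), mse(model, xs, ys))
```

Each stump on its own is a weak model; the training error falls because every new stump is fitted to what the current sum still gets wrong.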

Page 12: Practical Probabilistic Relational Learning Sriraam Natarajan

Boosting Results – MLJ 11

Algo      Likelihood  AUC-ROC  AUC-PR  Time
Boosting  0.810       0.961    0.930   9s
MLN       0.730       0.535    0.621   93 hrs

Tasks: Predicting the advisor for a student, Movie Recommendation, Citation Analysis, Machine Reading

Page 13: Practical Probabilistic Relational Learning Sriraam Natarajan

Other Applications

Similar results in several other problems:
• Imitation Learning: learning how to act from demonstrations (Natarajan et al., IJCAI ’11). Domains: Robocup, a grid world domain, a traffic signal domain, and blocksworld
• Prediction of CAC levels: predicting cardiovascular risks in young adults (Natarajan et al., IAAI 13)
• Prediction of heart attacks (Weiss et al., IAAI 12, AI Magazine 12)
• Prediction of onset of Alzheimer’s (Natarajan et al., ICMLA ’12; Natarajan et al., IJMLC 2013)

Page 14: Practical Probabilistic Relational Learning Sriraam Natarajan

Parallel Lifted Learning

Page 15: Practical Probabilistic Relational Learning Sriraam Natarajan

Stochastic ML: scales well, stochastic gradients, online learning, …

Statistical Relational: symmetries, compact models, lifted inference, …

Parallel

Page 16: Practical Probabilistic Relational Learning Sriraam Natarajan

Symmetry based inference

Page 17: Practical Probabilistic Relational Learning Sriraam Natarajan

[Figure: ground network over the atoms P(Anna), HI(Bob), P(Bob), and HI(Anna), with numbered nodes; the layout is not recoverable from the transcript.]

root clause
P(Anna) !P(Bob)

neighboring clauses
P(Anna) => !HI(Bob)
P(Anna) => HI(Anna)
P(Bob) => HI(Bob)
P(Bob) => !HI(Anna)

Tree (set of clauses)
P(Anna) !P(Bob)
P(Bob) => HI(Bob)
P(Bob) => !HI(Anna)

Variabilized tree
P(X) !P(Y)
P(Y) => HI(Y)
P(Y) => !HI(X)
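The step from the ground tree to the variabilized tree can be sketched as a consistent renaming of constants. The function name `variabilize` and the string encoding of clauses are assumptions for this example, and the sketch handles only unary atoms such as P(Anna); clauses are copied as they appear in the transcript, including the missing connective in the root clause.

```python
import re


def variabilize(clauses, variables=("X", "Y", "Z", "W")):
    """Replace each constant by a variable, in order of first appearance."""
    mapping = {}

    def rename(match):
        constant = match.group(1)
        if constant not in mapping:
            mapping[constant] = variables[len(mapping)]
        return "(" + mapping[constant] + ")"

    # The same mapping is shared across clauses, so Anna is X everywhere.
    lifted = [re.sub(r"\((\w+)\)", rename, clause) for clause in clauses]
    return lifted, mapping


tree = ["P(Anna) !P(Bob)", "P(Bob) => HI(Bob)", "P(Bob) => !HI(Anna)"]
lifted_tree, constant_to_variable = variabilize(tree)
for clause in lifted_tree:
    print(clause)
```

Trees that become identical after this renaming describe the same pattern, which is the kind of symmetry a lifted method can exploit by treating them as one.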

Page 18: Practical Probabilistic Relational Learning Sriraam Natarajan

Lifted Training

• Generate initial tree pieces and variabilize their arguments.
• Generate tree pieces from corresponding patterns.
• Randomly draw mini-batches.
• Compute gradient using lifted BP.
• Update covariance matrix C or some low-rank variant.
• Update parameter vector and the corresponding equations.

Page 19: Practical Probabilistic Relational Learning Sriraam Natarajan

Challenges

• Message schedules
• Iterative Map-reduce?
• How do we take this idea to learning the models?
• How can we more efficiently parallelize symmetry identification?
• What are the compelling problems? Vision, NLP, …

Page 20: Practical Probabilistic Relational Learning Sriraam Natarajan

Conclusion

• The world is inherently relational and uncertain
• SRL has developed into an exciting field in the past decade
  • Several previous SRL workshops
• Boosting relational models has promising initial results
  • Applied to several different problems
  • First scalable relational learning algorithm
• How can we parallelize/scale this algorithm?
• Can this benefit from an inference algorithm like Belief Propagation that can be parallelized easily?