1 statistical predicate invention stanley kok dept. of computer science and eng. university of...

38
1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

Post on 15-Jan-2016

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

1

Statistical Predicate Invention

Stanley KokDept. of Computer Science and Eng.

University of Washington

Joint work with Pedro Domingos

Page 2: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

2

Overview

Motivation Background Multiple Relational Clusterings Experiments Future Work

Page 3: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

3

Motivation

Statistical Learning• able to handle noisy data

Relational Learning (ILP)• able to handle non-i.i.d. data

Statistical Relational Learning

Page 4: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

4

Latent Variable Discovery[Elidan & Friedman, 2005; Elidan et al.,2001; etc.]

Predicate Invention[Wogulis & Langley, 1989; Muggleton & Buntine, 1988; etc.]

Motivation

Statistical Learning• able to handle noisy data

Relational Learning (ILP)• able to handle non-i.i.d. data

Statistical Relational LearningDiscovery of new concepts, properties, and relations

from data

Statistical Predicate Invention

Page 5: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

5

SPI Benefits

More compact and comprehensible models Improve accuracy by representing

unobserved aspects of domain Model more complex phenomena

Page 6: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

6

State of the Art

Few approaches combine statistical and relational learning Only cluster objects [Roy et al., 2006; Long et al., 2005; Xu et

al., 2005; Neville & Jensen, 2005; Popescul & Ungar 2004; etc.]

Only predict single target predicate [Davis et al., 2007; Craven & Slattery, 2001]

Infinite Relational Model [Kemp et al., 2006; Xu et al., 2006]

Clusters objects and relations simultaneously Multiple types of objects Relations can be of any arity #Clusters need not be specified in advance

Page 7: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

7

Multiple Relational Clusterings

Clusters objects and relations simultaneously Multiple types of objects Relations can be of any arity #Clusters need not be specified in advance Learns multiple cross-cutting clusterings Finite second-order Markov logic First step towards general framework for SPI

Page 8: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

8

Overview

Motivation Background Multiple Relational Clusterings Experiments Future Work

Page 9: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

9

Markov Logic Networks (MLNs)

A logical KB is a set of hard constraintson the set of possible worlds

Let’s make them soft constraints:When a world violates a formula,it becomes less probable, not impossible

Give each formula a weight(Higher weight Stronger constraint)

satisfiesit formulas of weightsexpP(world)

Page 10: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

10

Markov Logic Networks (MLNs)

Vector of truth assignments to ground atoms

Partition function. Sums over all possibletruth assignments to ground atoms

Weight of ith formula

#true groundings of ith formula

Page 11: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

11

Overview

Motivation Background Multiple Relational Clusterings Experiments Future Work

Page 12: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

12

Multiple Relational Clusterings

Invent unary predicate = Cluster Multiple cross-cutting clusterings Cluster relations by objects they relate

and vice versa Cluster objects of same type Cluster relations with same arity and

argument types

Page 13: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

13

Example of Multiple Clusterings

BobBill

AliceAnna

CarolCathy

EddieElise

DavidDarren

FelixFaye

HalHebe

GeraldGigi

IdaIris

Friends

Friends

Friends

Predictiveof hobbies

Co-workers Co-workers Co-workers

Predictive of skillsSome are friendsSome are co-workers

Page 14: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

14

Second-Order Markov Logic

Finite, function-free Variables range over relations (predicates)

and objects (constants) Ground atoms with all possible predicate

symbols and constant symbols Represent some models more compactly

than first-order Markov logic Specify how predicate symbols are

clustered

Page 15: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

15

Symbols

Cluster: Clustering: Atom: ,

Cluster combination:

Page 16: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

16

MRC Rules

Each symbol belongs to at least one cluster

Symbol cannot belong to >1 cluster in same clustering

Each atom appears in exactly one combination of clusters

Page 17: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

17

MRC Rules

Atom prediction rule: Truth value of atom is determined by cluster combination it belongs to

Exponential prior on number of clusters

Page 18: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

18

Learning MRC Model

Learning consists of finding Cluster assignment assignment

of truth values to all and atoms

Weights of atom prediction rules

Vector of truth assignments to all observed ground atoms

that maximize log-posterior probability

Page 19: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

19

Learning MRC Model

Three hard rules + Exponential prior rule

Page 20: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

20

Learning MRC Model

Atom prediction rules

Smoothing parameter

Wt of rule is log-odds of atomin its cluster combination being true

Can be computed in closed form

#true & #false atomsin cluster combination

Page 21: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

21

Search Algorithm

Approximation: Hard assignment of symbols to clusters

Greedy with restarts Top-down divisive refinement algorithm Two levels

Top-level finds clusterings Bottom-level finds clusters

Page 22: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

22

PQ

RS

T

W

Search Algorithm

VU

PQ

RS

T

W

a

bc d

hg

fe

Inputs: sets ofpredicate symbols

constantsymbols

Greedy search with restarts

Outputs: Clustering of each set of symbols

Page 23: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

23

PQ

RS

T

WVU

PQ

RS

T

W

a

bc d

hg

fe

predicate symbols

constantsymbols

Greedy search with restarts

Outputs: Clustering of each set of symbols

PQ

RS

VU

T

W

PQ

RS

VU

T

W

a

bc d

hg

fe

a

bc d

hg

fe

Recurse for every cluster combination

Search AlgorithmInputs: sets of

Page 24: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

24

PQ

RS

T

WVU

PQ

RS

T

W

a

bc d

hg

fe

PQ

RS

VU

T

W

PQ

RS

VU

T

W

a

bc d

hg

fe

a

bc d

hg

fe

Recurse for every cluster combination

PQ

RS

T

WVU

PQ

RS

T

W

a

bc d

hg

fe

PQ

RS

VU

T

W

PQ

RS

VU

T

W

a

bc d

hg

fe

a

bc d

hg

fe

PQ

RS

a

bc d

PQ R

SP

QR

Sa b

c d ab

c d

Search Algorithm

hg

fe

Q

R

P

S

Q

R

P

S

hg

fe

hg

fe

Terminate when no refinement improves MAP score

predicate symbols

constantsymbolsInputs: sets of

Page 25: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

25

PQ

RS

T

WVU

PQ

RS

T

W

a

bc d

hg

fe

PQ

RS

VU

T

W

PQ

RS

VU

T

W

a

bc d

hg

fe

a

bc d

hg

fe

PQ

RS

T

WVU

PQ

RS

T

W

a

bc d

hg

fe

PQ

RS

VU

T

W

PQ

RS

VU

T

W

a

bc d

hg

fe

a

bc d

hg

fe

PQ

RS

a

bc d

PQ R

SP

QR

Sa b

c d ab

c d

Search Algorithm

hg

fe

Q

R

P

S

Q

R

P

S

hg

fe

hg

fe

Leaf ≡ atom prediction rule Return leaves8r, x r 2 r Æ x 2 x ) r(x)

Page 26: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

26

PQ

RS

T

WVU

PQ

RS

T

W

a

bc d

hg

fe

PQ

RS

VU

T

W

PQ

RS

VU

T

W

a

bc d

hg

fe

a

bc d

hg

fe

PQ

RS

T

WVU

PQ

RS

T

W

a

bc d

hg

fe

PQ

RS

VU

T

W

PQ

RS

VU

T

W

a

bc d

hg

fe

a

bc d

hg

fe

PQ

RS

a

bc d

PQ R

SP

QR

Sa b

c d ab

c d

Search Algorithm

hg

fe

Q

R

P

S

Q

R

P

S

hg

fe

hg

fe

: Multiple clusterings

Search enforces hard rules Limitation: High-level clusters constrain lower ones

Page 27: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

27

Overview

Motivation Background Multiple Relational Clusterings Experiments Future Work

Page 28: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

28

Datasets

Animals Sets of animals and their features, e.g., Fast(Leopard) 50 animals, 85 features 4250 ground atoms; 1562 true ones

Unified Medical Language System (UMLS) Biomedical ontology Binary predicates, e.g., Treats(Antibiotic,Disease) 49 relations, 135 concepts 893,025 ground atoms; 6529 true ones

Page 29: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

29

Datasets

Kinship Kinship relations between members of an

Australian tribe: Kinship(Person,Person) 26 kinship terms, 104 persons 281,216 ground atoms; 10,686 true ones

Nations Set of relations among nations,

e.g.,ExportsTo(USA,Canada) Set of nation features, e.g., Monarchy(UK) 14 nations, 56 relations, 111 features 12,530 ground atoms; 2565 true ones

Page 30: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

30

Methodology

Randomly divided ground atoms into ten folds 10-fold cross validation Evaluation measures

Average conditional log-likelihood of test ground atoms (CLL)

Area under precision-recall curve of test ground atoms (AUC)

Page 31: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

31

Methodology

Compared with IRM [Kemp et al., 2006] and MLN structure learning (MSL) [Kok & Domingos, 2005]

Used default IRM parameters; run for 10 hrs MRC parameters and both set to 1 (no tuning) MRC run for 10 hrs for first level of clustering MRC subsequent levels permitted 100 steps

(3-10 mins) MSL run for 24 hours; parameter settings in online

appendix

Page 32: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

32

Results

-0.43 -0.43 -0.42

-0.54

-0.60

-0.50

-0.40

-0.30

-0.011

-0.004

-0.017

-0.025

-0.030

-0.020

-0.010

0.000

-0.06

-0.05

-0.08

-0.07

-0.10

-0.08

-0.06

-0.04

-0.31 -0.31-0.33 -0.33

-0.50

-0.40

-0.30

-0.20

0.79 0.80 0.80

0.68

0.00

0.20

0.40

0.60

0.80

1.00

0.80

0.97

0.64

0.47

0.00

0.20

0.40

0.60

0.80

1.00

0.68

0.85

0.49

0.60

0.00

0.20

0.40

0.60

0.80

1.00

0.75 0.75 0.730.77

0.00

0.20

0.40

0.60

0.80

1.00

CL

L CL

L CL

L CL

L

AU

C AU

C AU

C AU

C

Animals UMLS Kinship Nations

IRM MRC MSLInit IRM MRC MSLInit IRM MRC MSLInit IRM MRC MSLInit

Animals UMLS Kinship Nations

IRM MRC MSLInit IRM MRC MSLInit IRM MRC MSLInit IRM MRC MSLInit

Page 33: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

33

Multiple Clusterings Learned

VirusFungus

BacteriumRickettsia

Invertebrate

AlgaPlant

Archaeon

AmphibianBirdFish

HumanMammalReptile

VertebrateAnimal

Bioactive SubstanceBiogenic Amine

Immunologic FactorReceptor

Found In

¬ Found In

DiseaseCell Dysfunction

Neoplastic Process

Causes

¬ Causes

Page 34: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

34

Multiple Clusterings Learned

VirusFungus

BacteriumRickettsia

Invertebrate

AlgaPlant

Archaeon

AmphibianBirdFish

HumanMammalReptile

VertebrateAnimal

Is A

¬ Is A

DiseaseCell Dysfunction

Neoplastic Process

Causes

¬ Causes

Page 35: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

35

Multiple Clusterings Learned

VirusFungus

BacteriumRickettsia

Invertebrate

AlgaPlant

Archaeon

AmphibianBirdFish

HumanMammalReptile

VertebrateAnimal

Bioactive SubstanceBiogenic Amine

Immunologic FactorReceptor

Found In

¬ Found In

DiseaseCell Dysfunction

Neoplastic Process

Causes

¬ Causes

Is A

¬ Is A

Page 36: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

36

Overview

Motivation Background Multiple Relational Clusterings Experiments Future Work

Page 37: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

37

Future Work

Experiment on larger datasets, e.g., ontology induction from web text

Use clusters learned as primitives in structure learning

Learn a hierarchy of multiple clusterings and performing shrinkage

Cluster predicates with different arities and argument types

Speculation: all relational structure learning can be accomplished with SPI alone

Page 38: 1 Statistical Predicate Invention Stanley Kok Dept. of Computer Science and Eng. University of Washington Joint work with Pedro Domingos

38

Conclusion

Statistical Predicate Invention: key problem for statistical relational learning

Multiple Relational Clusterings First step towards general framework for SPI Based on finite second-order Markov logic Creates multiple relational clusterings of the

symbols in data Empirical comparison with MLN structure learning

and IRM shows promise