Markov Logic Networks: A Step Towards a Unified Theory of Learning and Cognition

TRANSCRIPT

Page 1

Markov Logic Networks: A Step Towards a Unified Theory of Learning and Cognition

Pedro Domingos, Dept. of Computer Science & Eng.

University of Washington

Joint work with Jesse Davis, Stanley Kok, Daniel Lowd, Hoifung Poon, Matt Richardson, Parag Singla, Marc Sumner

Page 2

One Algorithm

Observation: The cortex has the same architecture throughout.
Hypothesis: A single learning/inference algorithm underlies all cognition.
Let's discover it!

Page 3

The Neuroscience Approach

Map the brain and figure out how it works.
Problem: We don't know nearly enough.

Page 4

The Engineering Approach

Pick a task (e.g., object recognition).
Figure out how to solve it.
Generalize to other tasks.
Problem: Any one task is too impoverished.

Page 5

The Foundational Approach

Consider all tasks the brain does.
Figure out what they have in common.
Formalize, test, and repeat.
Advantage: Plenty of clues.
Where to start?

The grand aim of science is to cover the greatest number of experimental facts by logical deduction from the smallest number of hypotheses or axioms.

Albert Einstein

Page 6

Recurring Themes

Noise, uncertainty, incomplete information

→ Probability / Graphical models

Complexity, many objects & relations

→ First-order logic

Page 7

Examples

Field                       | Logical approach             | Statistical approach
Knowledge representation    | First-order logic            | Graphical models
Automated reasoning         | Satisfiability testing       | Markov chain Monte Carlo
Machine learning            | Inductive logic programming  | Neural networks
Planning                    | Classical planning           | Markov decision processes
Natural language processing | Definite clause grammars     | Prob. context-free grammars

Page 8

Research Plan

Unify graphical models and first-order logic.
Develop learning and inference algorithms.
Apply to a wide variety of problems.
This talk: Markov logic networks.

Page 9

Markov Logic Networks

MLN = set of first-order formulas with weights.
Formula = feature template (variables → objects).
E.g., Ising model: Up(x) ^ Neighbor(x,y) => Up(y)

$$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i \, n_i(x) \right)$$

where $w_i$ is the weight of formula $i$ and $n_i(x)$ is the number of true instances of formula $i$ in $x$.

Most graphical models are special cases.
First-order logic is the infinite-weight limit.
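As a concrete illustration of the formula above, here is a minimal sketch that enumerates every world of a tiny Ising-style MLN and computes P(x) exactly; the three-node chain, the weight value, and all names are made up for illustration, not taken from the talk.

```python
# Minimal sketch of the MLN distribution P(x) = exp(sum_i w_i n_i(x)) / Z for the
# Ising-style formula above. Groundings where Neighbor(x,y) is false are skipped:
# the implication is trivially true there, so they only shift the exponent by a
# constant that cancels in Z.
from itertools import product
from math import exp

nodes = ["A", "B", "C"]
neighbors = [("A", "B"), ("B", "C")]   # the fixed Neighbor(x,y) relation
w = 1.5                                # weight of Up(x) ^ Neighbor(x,y) => Up(y)

def n_true(up):
    """Number of true groundings of the formula in world `up`."""
    return sum(1 for x, y in neighbors if not (up[x] and not up[y]))

# Enumerate every world (truth assignment to the Up atoms) and normalize.
worlds = [dict(zip(nodes, vals)) for vals in product([False, True], repeat=len(nodes))]
scores = [exp(w * n_true(up)) for up in worlds]
Z = sum(scores)                        # partition function

for up, s in zip(worlds, scores):
    print({k: int(v) for k, v in up.items()}, "P =", round(s / Z, 4))
```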

Page 10

MLN Algorithms: The First Three Generations

Problem            | First generation        | Second generation | Third generation
MAP inference      | Weighted satisfiability | Lazy inference    | Cutting planes
Marginal inference | Gibbs sampling          | MC-SAT            | Lifted inference
Weight learning    | Pseudo-likelihood       | Voted perceptron  | Scaled conj. gradient
Structure learning | Inductive logic progr.  | ILP + PL (etc.)   | Clustering + pathfinding

Page 11

Weighted Satisfiability

SAT: Find a truth assignment that makes all formulas (clauses) true.
Huge amount of research on this problem.
State of the art: millions of vars/clauses in minutes.
MaxSAT: Make as many clauses true as possible.
Weighted MaxSAT: Clauses have weights; maximize the total satisfied weight.
MAP inference in MLNs is just weighted MaxSAT.
Best current solver: MaxWalkSAT (sketched below).
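The following is a compact sketch of the MaxWalkSAT idea, a random walk that, with probability p, flips a random atom of a random unsatisfied clause and otherwise flips the atom in that clause whose flip most reduces unsatisfied weight. The clause representation and parameters are illustrative, not Alchemy's actual implementation.

```python
# Sketch of MaxWalkSAT for weighted MaxSAT. A clause is (weight, literals);
# a literal is (atom_index, is_positive). State is one boolean per ground atom.
import random

def satisfied(lits, state):
    return any(state[a] == pos for a, pos in lits)

def max_walk_sat(clauses, n_atoms, max_flips=10000, p=0.5):
    state = [random.random() < 0.5 for _ in range(n_atoms)]
    cost = lambda s: sum(w for w, lits in clauses if not satisfied(lits, s))
    best, best_cost = state[:], cost(state)
    for _ in range(max_flips):
        unsat = [lits for w, lits in clauses if not satisfied(lits, state)]
        if not unsat:
            break                                  # everything satisfied: done
        lits = random.choice(unsat)                # pick a random unsatisfied clause
        if random.random() < p:                    # random step: any atom in the clause
            atom = random.choice(lits)[0]
        else:                                      # greedy step: best flip in the clause
            def cost_if_flipped(a):
                state[a] = not state[a]
                c = cost(state)
                state[a] = not state[a]
                return c
            atom = min((a for a, _ in lits), key=cost_if_flipped)
        state[atom] = not state[atom]
        if cost(state) < best_cost:
            best, best_cost = state[:], cost(state)
    return best, best_cost
```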

Page 12

MC-SAT

Deterministic dependences break MCMC; in practice, even strong probabilistic ones do.
Swendsen-Wang: Introduce auxiliary variables u to represent constraints among x; alternately sample u | x and x | u.
But Swendsen-Wang only works for Ising models.
MC-SAT: Generalize S-W to arbitrary clauses.
Uses a SAT solver to sample x | u (sketched below).
Orders of magnitude faster than Gibbs sampling, etc.
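A sketch of the MC-SAT loop: each currently satisfied clause joins the constraint set M with probability 1 − e^(−w), and the next state is drawn near-uniformly from M's satisfying assignments. The uniform SAT sampler (SampleSAT in the real system) is assumed here, not implemented; `satisfied` is from the sketch above.

```python
# Sketch of the MC-SAT slice-sampling loop. uniform_sat_sample(clauses, n_atoms)
# stands in for a SampleSAT-style near-uniform sampler over satisfying assignments.
import math, random

def mc_sat(clauses, n_atoms, uniform_sat_sample, n_steps=1000):
    """clauses: list of (weight, literals). Yields one state per step."""
    state = uniform_sat_sample([], n_atoms)        # start from any assignment
    for _ in range(n_steps):
        m = []                                      # clauses the next state must keep
        for w, lits in clauses:
            # each satisfied clause joins M with probability 1 - exp(-w)
            if satisfied(lits, state) and random.random() < 1 - math.exp(-w):
                m.append(lits)
        state = uniform_sat_sample(m, n_atoms)      # near-uniform over M's satisfiers
        yield state
```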

Page 13

Lifted Inference

Consider belief propagation (BP).
Often, in large problems, many nodes are interchangeable: they send and receive the same messages throughout BP.
Basic idea: Group them into supernodes, forming a lifted network.
Smaller network → faster inference.
Akin to resolution in first-order logic.

Page 14

Belief Propagation

Nodes (x), Features (f):

$$\mu_{x \to f}(x) = \prod_{h \in n(x) \setminus \{f\}} \mu_{h \to x}(x)$$

$$\mu_{f \to x}(x) = \sum_{\sim \{x\}} \left( e^{w f(\mathbf{x})} \prod_{y \in n(f) \setminus \{x\}} \mu_{y \to f}(y) \right)$$
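An illustrative sketch of the two message updates above, specialized to pairwise features over binary atoms to keep the code short. The flat data structures, asynchronous update order, and message normalization are simplifying assumptions, not part of the talk.

```python
# Sketch of BP on a factor graph of pairwise features. A feature is (w, i, j, f)
# where i, j are atom indices and f(xi, xj) returns 0 or 1.
import math

def bp(features, n_atoms, n_iters=30):
    m_xf = {(a, k): [1.0, 1.0] for k, (_, i, j, _) in enumerate(features) for a in (i, j)}
    m_fx = {(k, a): [1.0, 1.0] for k, (_, i, j, _) in enumerate(features) for a in (i, j)}
    nbrs = lambda a: [k for k, (_, i, j, _) in enumerate(features) if a in (i, j)]
    for _ in range(n_iters):
        for k, (w, i, j, f) in enumerate(features):
            for a in (i, j):
                y = j if a == i else i             # the feature's other atom
                # mu_{x->f}(x): product of messages from the other features at a
                m_xf[(a, k)] = [math.prod(m_fx[(h, a)][v] for h in nbrs(a) if h != k)
                                for v in (0, 1)]
                # mu_{f->x}(x): sum out the other atom y; normalize for stability
                out = [sum(math.exp(w * (f(v, u) if a == i else f(u, v)))
                           * m_xf[(y, k)][u] for u in (0, 1)) for v in (0, 1)]
                m_fx[(k, a)] = [x / sum(out) for x in out]
    # belief at each atom: normalized product of all incoming feature messages
    beliefs = {}
    for a in range(n_atoms):
        b = [math.prod(m_fx[(k, a)][v] for k in nbrs(a)) for v in (0, 1)]
        z = sum(b)
        beliefs[a] = [x / z for x in b] if z else [0.5, 0.5]
    return beliefs
```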

Page 15

Lifted Belief Propagation

Nodes (x), Features (f): the same two message equations as above, now passed between the grouped nodes and features of the lifted network.

Page 16

Lifted Belief Propagation

Nodes (x), Features (f): the same equations again, except that the messages are raised to powers that are functions of the edge counts between supernodes and superfeatures.
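A sketch of how a lifted network can be constructed by iterated "color passing": atoms that would send and receive identical BP messages end up with the same color and are merged into one supernode. This captures the basic idea on the slides; the full lifted BP algorithm also refines features into superfeatures and tracks the edge counts mentioned above.

```python
# Sketch of supernode construction by signature refinement (color passing).
def lift(features, n_atoms, max_rounds=100):
    """features: list of (weight, [atom indices]). Returns atom -> supernode id."""
    color = {a: 0 for a in range(n_atoms)}                 # initially all atoms alike
    for _ in range(max_rounds):
        # feature signature: its weight plus the multiset of its atoms' colors
        fsig = [(w, tuple(sorted(color[a] for a in atoms))) for w, atoms in features]
        # atom signature: the multiset of signatures of its incident features
        asig = {a: tuple(sorted(s for s, (_, atoms) in zip(fsig, features) if a in atoms))
                for a in range(n_atoms)}
        ids, new_color = {}, {}
        for a, s in asig.items():
            new_color[a] = ids.setdefault(s, len(ids))     # canonical id per signature
        if new_color == color:
            break                                           # colors stable: converged
        color = new_color
    return color                                            # equal color => same supernode
```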

Page 17

Weight Learning

Pseudo-likelihood + L-BFGS is fast and robust, but can give poor inference results.
Voted perceptron: gradient descent + MAP inference (sketched below).
Problem: Multiple modes.
Not alleviated by contrastive divergence; alleviated by MC-SAT.
Start each MC-SAT run at the previous end state.
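The log-likelihood gradient with respect to w_i is n_i(data) − E_w[n_i]; the voted perceptron approximates the expectation with the counts in the MAP state, found by weighted MaxSAT. A minimal sketch, where the learning rate, iteration count, and function names are illustrative assumptions:

```python
# Sketch of the voted-perceptron update for MLN weights.
def voted_perceptron(data_counts, map_inference, count_fn, n_formulas,
                     eta=0.1, n_iters=50):
    """
    data_counts[i]   : n_i in the observed data
    map_inference(w) : MAP state under weights w (e.g., via max_walk_sat above)
    count_fn(state)  : per-formula true-grounding counts n_i(state)
    """
    w = [0.0] * n_formulas
    total = [0.0] * n_formulas
    for _ in range(n_iters):
        map_counts = count_fn(map_inference(w))
        for i in range(n_formulas):
            w[i] += eta * (data_counts[i] - map_counts[i])  # approximate gradient step
        total = [t + wi for t, wi in zip(total, w)]
    return [t / n_iters for t in total]                     # "voted": average weights
```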

Page 18

Weight Learning (contd.)

Problem: Extreme ill-conditioning.
Solvable by quasi-Newton, conjugate gradient, etc., but line searches require exact inference.
Stochastic gradient is not applicable because the data are not i.i.d.
Solution: Scaled conjugate gradient.
Use the Hessian to choose the step size.
Compute the quadratic form inside MC-SAT.
Use the inverse diagonal Hessian as a preconditioner (sketched below).
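A heavily hedged sketch of one scaled conjugate-gradient step under the assumptions named in the comments: `quad_form(d)` is a black box returning d^T H d (in the real algorithm this quadratic form is estimated from samples inside MC-SAT), and the diagonal of H supplies the preconditioner. The conjugacy formula and damping are standard textbook choices, not necessarily the ones used in Alchemy.

```python
# Sketch of a preconditioned, Hessian-scaled conjugate-gradient step.
def scg_step(g, g_prev, d_prev, quad_form, diag_h, damping=1.0):
    """g: current gradient. Returns (step size alpha, search direction d)."""
    p = [gi / max(hi, 1e-8) for gi, hi in zip(g, diag_h)]   # preconditioned gradient
    if d_prev is None:
        d = [-pi for pi in p]                                # first step: steepest descent
    else:
        # Polak-Ribiere-style mixing to keep successive directions conjugate
        beta = sum(pi * (gi - gp) for pi, gi, gp in zip(p, g, g_prev))
        beta /= max(sum(gp * gp for gp in g_prev), 1e-8)
        d = [-pi + beta * di for pi, di in zip(p, d_prev)]
    dHd = quad_form(d) + damping * sum(di * di for di in d)  # damped quadratic form
    alpha = -sum(gi * di for gi, di in zip(g, d)) / max(dHd, 1e-8)
    return alpha, d                                          # update: w <- w + alpha * d
```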

Page 19

Structure Learning

Standard inductive logic programming optimizes the wrong thing, but can be used to overgenerate clauses for L1 pruning.
Our approach: ILP + pseudo-likelihood + structure priors.
For each candidate structure change:
Start from the current weights and relax convergence.
Use subsampling to compute sufficient statistics.
Search methods: beam, shortest-first, etc. (a beam-search sketch follows).
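A sketch of the beam-search variant over candidate structure changes. Here `score()` stands for pseudo-likelihood plus a structure prior and `refinements()` for the clause operators (e.g., adding, removing, or negating a literal); both are assumed black boxes, not the talk's exact procedure.

```python
# Sketch of beam search for MLN structure learning.
import heapq

def beam_search(initial_mln, refinements, score, beam_width=5, n_steps=10):
    beam = [(score(initial_mln), initial_mln)]
    best_score, best_mln = beam[0]
    for _ in range(n_steps):
        # expand every structure on the beam and keep the top-scoring candidates
        candidates = [(score(c), c) for s, mln in beam for c in refinements(mln)]
        if not candidates:
            break
        beam = heapq.nlargest(beam_width, candidates, key=lambda t: t[0])
        if beam[0][0] > best_score:
            best_score, best_mln = beam[0]
    return best_mln
```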

Page 20

Applications to Date

Natural language processing
Information extraction
Entity resolution
Link prediction
Collective classification
Social network analysis
Robot mapping
Activity recognition
Scene analysis
Computational biology
Probabilistic Cyc
Personal assistants
Etc.

Page 21

Unsupervised Semantic Parsing (USP)

Goal: Map text to logical form, e.g., "Microsoft buys Powerset." → BUYS(MICROSOFT, POWERSET)

Challenge: The same fact has many surface forms:
Microsoft buys Powerset
Microsoft acquires semantic search engine Powerset
Powerset is acquired by Microsoft Corporation
The Redmond software giant buys Powerset
Microsoft's purchase of Powerset, …

USP: Recursively cluster expressions composed of similar subexpressions.

Evaluation: Extract knowledge from biomedical abstracts and answer questions. Substantially outperforms the state of the art: three times as many correct answers; accuracy 88%.

Page 22

Research Directions

Compact representations:
Deep architectures
Boolean decision diagrams
Arithmetic circuits
Unified inference procedure
Learning MLNs with many latent variables
Tighter integration of learning and inference
End-to-end NLP system
Complete agent

Page 23

Resources

Open-source software / Web site: Alchemy
Learning and inference algorithms
Tutorials, manuals, etc.
MLNs, datasets, etc.
Publications

Book: Domingos & Lowd, Markov Logic, Morgan & Claypool, 2009.

alchemy.cs.washington.edu
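For flavor, here is what an MLN looks like in Alchemy's input syntax, reconstructed from memory of the classic smoking example in the Alchemy tutorial; the weights are illustrative, and the manual at the site above is the authority on the exact current syntax.

```
// Predicate declarations
Smokes(person)
Cancer(person)
Friends(person, person)

// Weighted first-order formulas (weight, then formula)
1.5  Smokes(x) => Cancer(x)
1.1  Friends(x, y) => (Smokes(x) <=> Smokes(y))
```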