
Upload: olivia-fitzgerald

Post on 28-Dec-2015


Lecture 1 is based on David Heckerman’s tutorial slides (Microsoft Research).

Bayesian Networks

Lecture 1: Basics and Knowledge-Based Construction

Requirements: 50% homework; 50% exam or a project

What I hope you will get out of this course...

What are Bayesian networks?
Why do we use them?
How do we build them by hand?
How do we build them from data?
What are some applications?
What is their relationship to other models?
What are the properties of conditional independence that make these models appropriate?
Usage in genetic linkage analysis

Applications of hand-built Bayes Nets

Answer Wizard 95, Office Assistant 97, 2000
Troubleshooters in Windows 98
Lymph node pathology
Trauma care
NASA mission control

Some Applications of learned Bayes Nets

Clustering users on the web (MSNBC)
Classifying text (spam filtering)

Some factors that support intelligence

Knowledge representation
Reasoning
Learning / adapting

Artificial Intelligence

Artificial Intelligence is better than none!

Artificial Intelligence is better than ours!

Outline for today

Basics
Knowledge-based construction
Probabilistic inference
Applications of hand-built BNs at Microsoft

Bayesian Networks: History

1920s: Wright -- analysis of crop failure
1950s: I.J. Good -- causality
Early 1980s: Howard and Matheson, Pearl

Other names:
  directed acyclic graphical (DAG) models
  belief networks
  causal networks
  probabilistic networks
  influence diagrams
  knowledge maps

Bayesian Network

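[Figure: network over Fuel, Battery, Gauge, Engine Turns Over, and Start, with arcs Fuel -> Gauge, Battery -> Gauge, Battery -> Engine Turns Over, Fuel -> Start, Engine Turns Over -> Start; nodes annotated with p(f), p(b), p(g|f,b), p(t|b), p(s|f,t)]

Directed acyclic graph, annotated with probability distributions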

BN structure: Definition

Missing arcs encode independencies such that

$$p(x_1, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid \mathbf{pa}_i)$$

[Figure: network over Fuel, Battery, Gauge, Engine Turns Over, and Start, annotated with p(f), p(b), p(g|f,b), p(t|b), p(s|f,t)]

$$p(b, f, g, t, s) = p(b)\, p(f)\, p(g \mid f, b)\, p(t \mid b)\, p(s \mid f, t)$$
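A minimal sketch of the factorization above, assuming binary variables (1 = "ok"/"yes") and made-up numbers for every local distribution except p(s|f,t), whose 0.99 entry comes from the local-distribution table later in the lecture. It multiplies the five local distributions to obtain the joint, checks that the joint sums to 1, and compares the number of free parameters with that of an unstructured joint over five binary variables.

```python
from itertools import product

# 1 = "ok"/"yes", 0 = otherwise.
# All numbers are hypothetical, except p(s=1 | f, t) (from the lecture's table).
p_f = 0.9                                   # p(f=1): fuel not empty
p_b = 0.95                                  # p(b=1): battery good
p_t = {1: 0.97, 0: 0.02}                    # p(t=1 | b)
p_g = {(1, 1): 0.96, (1, 0): 0.10,          # p(g=1 | f, b)
       (0, 1): 0.04, (0, 0): 0.01}
p_s = {(1, 1): 0.99, (1, 0): 0.0,           # p(s=1 | f, t)
       (0, 1): 0.0, (0, 0): 0.0}


def bern(p_one, value):
    """Probability of `value` under a Bernoulli with p(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(b, f, g, t, s):
    """p(b, f, g, t, s) = p(b) p(f) p(g|f,b) p(t|b) p(s|f,t)."""
    return (bern(p_b, b) * bern(p_f, f) * bern(p_g[(f, b)], g)
            * bern(p_t[b], t) * bern(p_s[(f, t)], s))


total = sum(joint(*values) for values in product((0, 1), repeat=5))

# Free parameters: one per parent configuration of each binary node.
n_params_factored = 1 + 1 + 4 + 2 + 4
n_params_full = 2 ** 5 - 1

print(total, n_params_factored, n_params_full)
```

The parameter count is where the missing arcs pay off: each node needs one number per configuration of its parents rather than one number per configuration of all other variables.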

Independencies in a Bayes net

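$$p(x_1, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid \mathbf{pa}_i) \qquad (*)$$

$$p(x_1, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1})$$

$$X_i \perp (X_1, \ldots, X_{i-1}) \mid \mathbf{Pa}_i$$

Many other independencies are entailed by (*): they can be read from the graph using d-separation (Pearl)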

Example:

Explaining Away and Induced Dependencies

[Figure: Fuel -> Start <- TurnOver]

$T \perp F$

$\neg\,(T \perp F \mid S)$

"explaining away"

"induced dependencies"

Local distributions

[Figure: TurnOver (yes, no) -> Start (yes, no) <- Fuel (empty, not); T and F are the parents of S]

Table:

T    F    p(S=y|T,F)
n    e    0.0
n    n    0.0
y    e    0.0
y    n    0.99
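The table above is enough to see explaining away numerically once priors are attached to Fuel and TurnOver. The sketch below assumes two made-up priors (they are not from the lecture) and computes p(F = empty) under different evidence by brute-force enumeration: observing that the car does not start raises the probability of an empty tank, and additionally learning that the engine does not turn over brings it back down to the prior.

```python
# Priors are hypothetical; p(S=yes | T, F) is the lecture's table.
p_fuel = {"empty": 0.1, "not empty": 0.9}          # hypothetical prior on F
p_turn = {"no": 0.2, "yes": 0.8}                   # hypothetical prior on T
p_start_yes = {("no", "empty"): 0.0, ("no", "not empty"): 0.0,
               ("yes", "empty"): 0.0, ("yes", "not empty"): 0.99}


def joint(t, f, s):
    """p(t, f, s) = p(t) p(f) p(s | t, f)."""
    ps = p_start_yes[(t, f)]
    return p_turn[t] * p_fuel[f] * (ps if s == "yes" else 1.0 - ps)


def p_fuel_empty(**evidence):
    """p(F = empty | evidence), with evidence given on 't' and/or 's'."""
    num = den = 0.0
    for t in p_turn:
        for f in p_fuel:
            for s in ("yes", "no"):
                world = {"t": t, "f": f, "s": s}
                if any(world[k] != v for k, v in evidence.items()):
                    continue  # inconsistent with the evidence
                p = joint(t, f, s)
                den += p
                if f == "empty":
                    num += p
    return num / den


prior = p_fuel_empty()
given_turn = p_fuel_empty(t="no")               # T and F are marginally independent
after_no_start = p_fuel_empty(s="no")           # dependence induced by observing S
explained_away = p_fuel_empty(s="no", t="no")   # T = no explains the failure

print(prior, given_turn, after_no_start, explained_away)
```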

Local distributions

[Figure: TurnOver (yes, no) -> Start (yes, no) <- Fuel (empty, not); T and F are the parents of S]

Tree:

TurnOver
  no: p(start)=0
  yes: Fuel
    empty: p(start)=0
    not empty: p(start)=0.99
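The table and the tree describe the same local distribution. As a small illustration (the function and variable names are only for this sketch), the code below writes both forms and checks that they agree on every parent configuration; the tree needs three leaves where the table needs four rows, because Fuel is irrelevant once TurnOver is "no".

```python
# Table form: one entry per configuration of the parents (TurnOver, Fuel).
start_table = {
    ("no", "empty"): 0.0,
    ("no", "not empty"): 0.0,
    ("yes", "empty"): 0.0,
    ("yes", "not empty"): 0.99,
}


def start_tree(turn_over, fuel):
    """Tree form: split on TurnOver first, then on Fuel only if needed."""
    if turn_over == "no":
        return 0.0            # leaf: Fuel is never consulted
    if fuel == "empty":
        return 0.0            # leaf
    return 0.99               # leaf


tree_matches_table = all(
    start_tree(t, f) == p for (t, f), p in start_table.items()
)
print(tree_matches_table)
```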

Lots of possibilities for a local distribution...

$p(y \mid x_1, \ldots, x_n)$, where y is the node and x_1, ..., x_n are its parents

y = discrete node: any probabilistic classifier
  Decision tree
  Neural net

y = continuous node: any probabilistic regression model
  Linear regression with Gaussian noise
  Neural net

Naïve Bayes Classifier

[Figure: Class (discrete) with an arc to each of Input 1, Input 2, ..., Input n]
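Read as a Bayesian network, this structure (Class as the only parent of every input) corresponds to the factorization below. The symbols c for the class and x_1, ..., x_n for the inputs are notation chosen for this sketch.

```latex
p(c, x_1, \ldots, x_n) = p(c) \prod_{i=1}^{n} p(x_i \mid c),
\qquad
p(c \mid x_1, \ldots, x_n) \propto p(c) \prod_{i=1}^{n} p(x_i \mid c)
```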

Hidden Markov Model

[Figure: chain H1 -> H2 -> H3 -> H4 -> H5 -> ... of discrete, hidden nodes, each Hi with an arc to its observation Xi]
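Applying the Bayesian-network factorization to this chain structure gives the usual hidden Markov model joint. The sequence length N is notation introduced for this sketch.

```latex
p(h_1, \ldots, h_N, x_1, \ldots, x_N)
  = p(h_1) \prod_{i=2}^{N} p(h_i \mid h_{i-1}) \prod_{i=1}^{N} p(x_i \mid h_i)
```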

Feed-Forward Neural Network

[Figure: inputs X1, X2, X3 -> hidden layer (sigmoid) -> outputs Y1, Y2, Y3 (binary, sigmoid)]

Outline

Basics
Knowledge-based construction
Probabilistic inference
Decision making
Applications of hand-built BNs at Microsoft

Building a Bayes net by hand (ok, now we're starting to be Bayesian)

Define variables
Assess the structure
Assess the local probability distributions

What is a variable?

Collectively exhaustive, mutually exclusive values
  Example: {Error Occurred, No Error}

Clarity Test: Is the variable knowable in principle?
  Is it raining? {Where, when, how many inches?}
  Is it hot? {T ≥ 100F, T < 100F}
  Is user’s personality dominant or submissive? {numerical result of standardized personality test}

Assessing structure (one approach)

Choose an ordering for the variables
For each variable, identify parents Pa_i such that

$$p(x_i \mid x_1, \ldots, x_{i-1}) = p(x_i \mid \mathbf{pa}_i)$$

$$p(x_1, \ldots, x_n) = \prod_{i} p(x_i \mid x_1, \ldots, x_{i-1}) = \prod_{i} p(x_i \mid \mathbf{pa}_i)$$

Example

[Figure: the network over Fuel, Battery, TurnOver, Gauge, and Start, built up one variable at a time]

p(f)
p(b|f) = p(b)
p(t|b,f) = p(t|b)
p(g|f,b,t) = p(g|f,b)
p(s|f,b,t,g) = p(s|f,t)

p(f,b,t,g,s) = p(f) p(b) p(t|b) p(g|f,b) p(s|f,t)

Why is this the wrong way?

Variable order can be critical

[Figure: network over Battery, TurnOver, Start, Fuel, and Gauge]
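The ordering procedure can be tried out mechanically on a small joint distribution. The sketch below is an illustration under stated assumptions, not the lecture's method: it builds a joint from made-up local distributions (all strictly positive, so every conditional is defined; the lecture's table has zeros for p(s|f,t), replaced here by 0.01), and for each variable in a given order it searches for a smallest subset of the preceding variables that screens it off from the rest. The reversed order is an arbitrary "bad" ordering chosen for this example.

```python
from itertools import combinations, product

VARS = ("f", "b", "t", "g", "s")

# Hypothetical, strictly positive local distributions (1 = "ok"/"yes").
P_F, P_B = 0.9, 0.95
P_T = {1: 0.97, 0: 0.02}                                         # p(t=1 | b)
P_G = {(1, 1): 0.96, (1, 0): 0.10, (0, 1): 0.04, (0, 0): 0.01}   # p(g=1 | f, b)
P_S = {(1, 1): 0.99, (1, 0): 0.01, (0, 1): 0.01, (0, 0): 0.01}   # p(s=1 | f, t)


def bern(p_one, value):
    return p_one if value == 1 else 1.0 - p_one


def joint(f, b, t, g, s):
    return (bern(P_F, f) * bern(P_B, b) * bern(P_T[b], t)
            * bern(P_G[(f, b)], g) * bern(P_S[(f, t)], s))


def marginal(assign):
    """Probability of a partial assignment, by summing the joint."""
    total = 0.0
    for values in product((0, 1), repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if all(world[k] == v for k, v in assign.items()):
            total += joint(**world)
    return total


def cond(var, given):
    """p(var = 1 | given)."""
    return marginal({**given, var: 1}) / marginal(given)


def minimal_parents(order):
    """For each variable, a smallest screening subset of its predecessors."""
    parents = {}
    for i, x in enumerate(order):
        preds = order[:i]
        contexts = [dict(zip(preds, v))
                    for v in product((0, 1), repeat=len(preds))]
        for size in range(len(preds) + 1):
            match = next(
                (cand for cand in combinations(preds, size)
                 if all(abs(cond(x, ctx) - cond(x, {c: ctx[c] for c in cand}))
                        < 1e-9 for ctx in contexts)),
                None)
            if match is not None:
                parents[x] = set(match)
                break
    return parents


causal = minimal_parents(("f", "b", "t", "g", "s"))
reverse = minimal_parents(("s", "g", "t", "b", "f"))
arcs_causal = sum(len(p) for p in causal.values())
arcs_reverse = sum(len(p) for p in reverse.values())
print(causal, arcs_causal, arcs_reverse)
```

With the causal order the search recovers the five arcs of the example; with the reversed order the same distribution needs more arcs, which is the sense in which variable order can be critical.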

A better way: Use causal knowledge

[Figure: network over Fuel, Battery, Gauge, TurnOver, and Start, with arcs drawn from causes to effects]

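Conditional Independence Simplifies Probabilistic Inference

$$p(f \mid s, g) = \frac{p(f, s, g)}{p(s, g)} = \frac{\sum_{b,t} p(f, b, t, g, s)}{\sum_{f,b,t} p(f, b, t, g, s)}$$

$$\sum_{f,b,t} p(f, b, t, g, s) = \sum_{f} \sum_{b} \sum_{t} p(f)\, p(b)\, p(t \mid b)\, p(g \mid f, b)\, p(s \mid f, t)$$

[Figure: network over Fuel, Battery, Gauge, TurnOver, and Start]

$$= \sum_{f} p(f) \sum_{b} p(b)\, p(g \mid f, b) \sum_{t} p(t \mid b)\, p(s \mid f, t)$$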
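The rearrangement of sums can be checked numerically. The sketch below assumes binary variables and made-up local distributions (only the 0.99 entry of p(s|f,t) is taken from the lecture) and computes p(f | g, s) twice: once by summing the full joint over b and t, and once with the sums pushed inside the product as in the last line above.

```python
# 1 = "ok"/"yes". All numbers hypothetical, except p(s=1 | f, t) from the lecture.
p_f, p_b = 0.9, 0.95
p_t = {1: 0.97, 0: 0.02}                                         # p(t=1 | b)
p_g = {(1, 1): 0.96, (1, 0): 0.10, (0, 1): 0.04, (0, 0): 0.01}   # p(g=1 | f, b)
p_s = {(1, 1): 0.99, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.0}      # p(s=1 | f, t)


def bern(p_one, value):
    return p_one if value == 1 else 1.0 - p_one


def joint(f, b, t, g, s):
    return (bern(p_f, f) * bern(p_b, b) * bern(p_t[b], t)
            * bern(p_g[(f, b)], g) * bern(p_s[(f, t)], s))


def normalize(unnorm):
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}


def posterior_fuel_brute(g, s):
    """p(f | g, s) by summing the full joint over b and t."""
    return normalize({
        f: sum(joint(f, b, t, g, s) for b in (0, 1) for t in (0, 1))
        for f in (0, 1)
    })


def posterior_fuel_factored(g, s):
    """Same quantity with each sum pushed past the factors it does not touch."""
    unnorm = {}
    for f in (0, 1):
        over_b = 0.0
        for b in (0, 1):
            over_t = sum(bern(p_t[b], t) * bern(p_s[(f, t)], s) for t in (0, 1))
            over_b += bern(p_b, b) * bern(p_g[(f, b)], g) * over_t
        unnorm[f] = bern(p_f, f) * over_b
    return normalize(unnorm)


# Gauge reads "not empty" (g=1) but the car does not start (s=0).
brute = posterior_fuel_brute(g=1, s=0)
factored = posterior_fuel_factored(g=1, s=0)
print(brute, factored)
```

On five binary variables the saving is negligible, but the factored form never touches more than one local sum at a time, which is what makes inference tractable in larger sparse networks.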

Online Troubleshooters

Define Problem

Gather Information

Get Recommendations

Office Assistant 97

Studies with Human Subjects

“Wizard of Oz” experiments at MS Usability Labs

[Figure: Expert Advisor and Inexperienced user; User Actions flow to the advisor, Typed Advice flows back to the user]

Activities with Relevance to User’s Needs

Several classes of evidence

Search: e.g., menu surfing
Introspection: e.g., sudden pause, slowing of command stream
Focus of attention: e.g., selected objects
Undesired effects: e.g., command/undo, dialogue opened and cancelled
Inefficient command sequences
Goal-specific sequences of actions

Summary so far

Bayes nets are useful because...

They encode independence explicitly
  More parsimonious models
  Efficient inference

They encode independence graphically
  Easier explanation
  Easier encoding

They sometimes correspond to causal models
  Easier explanation
  Easier encoding
  Modularity leads to easier maintenance

Teenage Bayes

MICRONEWS 97: Microsoft Researchers Exchange Brainpower with Eighth-grader

Teenager Designs Award-Winning Science Project

.. For her science project, which she called "Dr. Sigmund Microchip," Tovar wanted to create a computer program to diagnose the probability of certain personality types. With only answers from a few questions, the program was able to accurately diagnose the correct personality type 90 percent of the time.

Artificial Intelligence is a promising field: always was, always will be.