1 probability fol fails for a domain due to: laziness: too much to list the complete set of rules,...

37
1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical ignorance: there is no complete theory for the domain Practical ignorance: have not or cannot run all necessary tests

Upload: toby-miles

Post on 18-Jan-2018

215 views

Category:

Documents


0 download

DESCRIPTION

3 Probability Prior probability: probability in the absence of any other information P(Dice = 2) = 1/6 random variable: Dice domain = probability distribution: P(Dice) =

TRANSCRIPT

Page 1: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

1

Probability• FOL fails for a domain due to:

– Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result

– Theoretical ignorance: there is no complete theory for the domain

– Practical ignorance: have not or cannot run all necessary tests

Page 2: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

2

Probability• Probability = a degree of belief• Probability comes from:

– Frequentist: experiments and statistical assessment

– Objectivist: real aspects of the universe– Subjectivist: a way of characterizing an agent’s

beliefs• Decision theory = probability + utility theory

Page 3: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

3

Probability• Prior probability: probability in the

absence of any other information

P(Dice = 2) = 1/6

random variable: Dicedomain = <1, 2, 3, 4, 5, 6>probability distribution: P(Dice) = <1/6, 1/6, 1/6, 1/6, 1/6,

1/6>

Page 4: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

4

Probability• Conditional probability: probability in

the presence of some evidence

P(Dice = 2 | Dice is even) = 1/3P(Dice = 2 | Dice is odd) = 0

• P(A | B) = P(A B)/P(B)P(A B) = P(A | B).P(B)

Page 5: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

5

Probability• Example:

S = stiff neckM = meningitisP(S | M) = 0.5P(M) = 1/50000P(S) = 1/20

P(M | S) = P(S | M).P(M)/P(S) = 1/5000

Page 6: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

6

Probability• Joint probability distributions:

X: <x1, …, xm> Y: <y1, …, yn>

P(X = xi, Y = yj)

Page 7: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

7

Probability• Axioms:

0 P(A) 1P(true) = 1 and P(false) = 0P(A B) = P(A) + P(B) P(A B)

Page 8: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

8

Probability• Derived properties:

P(A) = 1 P(A)

P(U) = P(A1) + P(A2) + ... + P(An)U = A1 A2 ... An collectively exhaustiveAi Aj = false mutually exclusive

Page 9: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

9

Probability• Bayes’ theorem:

P(Hi | E) = P(HiE)/P(E) = P(E | Hi).P(Hi)/P(E | Hi).P(Hi)

Hi’s are collectively exhaustive & mutually exclusive

Page 10: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

10

Probability• Problem: a full joint probability

distribution P(X1, X2, ..., Xn) is sufficient for computing any probability on Xi's, but the number of joint probabilities is exponential.

Page 11: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

11

Probability• Independence:

P(A B) = P(A).P(B) P(A) = P(A | B)

• Conditional independence:P(A B | E) = P(A | E).P(B | E)

P(A | E) = P(A | E B)

Page 12: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

12

Probability• Example:

P(Toothache | Cavity Catch) = P(Toothache | Cavity)

P(Catch | Cavity Toothache) = P(Catch | Cavity)

Page 13: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

13

Probability"In John's and Mary's house, an alarm is installed to sound in case of burglary or earthquake. When the alarm sounds, John and Mary may make a call for help or rescue."

Page 14: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

14

Probability"In John's and Mary's house, an alarm is installed to sound in case of burglary or earthquake. When the alarm sounds, John and Mary may make a call for help or rescue."Q1: If earthquake happens, how likely will John make a call?

Page 15: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

15

Probability"In John's and Mary's house, an alarm is installed to sound in case of burglary or earthquake. When the alarm sounds, John and Mary may make a call for help or rescue."Q1: If earthquake happens, how likely will John make a call?

Q2: If the alarm sounds, how likely is the house burglarized?

Page 16: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

16

Bayesian Networks• Pearl, J. (1982). Reverend Bayes on

Inference Engines: A Distributed Hierarchical Approach, presented at the Second National Conference on Artificial Intelligence (AAAI-82), Pittsburgh, Pennsylvania

• 2000 AAAI Classic Paper Award

Page 17: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

17

Bayesian Networks

Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B).001

P(E).002

B E P(A)T TT FF TF F

.95

.94

.29

.001

A P(J)TF

.90

.05

A P(M)TF

.70

.01

Page 18: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

18

Bayesian Networks• Syntax:

– A set of random variables making up the nodes– A set of directed links connecting pairs of nodes– Each node has a conditional probability table that

quantifies the effects of its parent nodes– The graph has no directed cycles

A

B C

D

A

B C

D

yes no

Page 19: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

19

Bayesian Networks• Semantics:

Ordering the nodes: Xi is a predecessor of Xj i < j

P(X1, X2, …, Xn) = P(Xn | Xn-1, …, X1).P(Xn-1 | Xn-2, …, X1). … .P(X2 | X1).P(X1)= iP(Xi | Xi-1, …, X1) = iP(Xi | Parents(Xi))

P (Xi | Xi-1, …, X1) = P(Xi | Parents(Xi)) Parents(Xi) {Xi-1, …, X1}

Each node is conditionally independent of its predecessors given its parents

Page 20: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

20

Bayesian Networks• Example:

P(J M A B E) = P(J | A).P(M | A).P(A | B E).P(B).P(E)

Page 21: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

21

Bayesian Networks• P(Query | Evidence) = ?

– Diagnostic (from effects to causes): P(B | J)– Causal (from causes to effects): P(J | B)– Intercausal (between causes of a common

effect): P(B | A, E)– Mixed: P(A | J, E), P(B | J, E)

Page 22: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

22

P(B | A) = P(B A)/P(A)= P(B A)

Bayesian Networks

Page 23: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

23

Bayesian Networks• Why Bayesian Networks?

Page 24: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

24

General Conditional Independence

Y1 Yn

U1 Um

Z1j Znj

X

A node (X) is conditionally independent of its non-descendents (Zij's), given its parents (Ui's)

Page 25: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

25

General Conditional Independence

Y1 Yn

U1 Um

Z1j Znj

X

A node (X) is conditionally independent of all other nodes, given its parents (Ui's), children (Yi's), and children's parents (Zij's)

Page 26: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

26

General Conditional Independence

X and Y are conditionally independent given E

Z

Z

Z

X YE

Page 27: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

27

General Conditional Independence

Example:Battery

GasIgnitionRadio

Starts

Moves

Page 28: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

28

General Conditional Independence

Example: Battery

GasIgnitionRadio

Starts

Moves

Gas - (no evidence at all) - RadioGas Radio | StartsGas Radio | Moves

Page 29: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

29

Network Construction• Wrong ordering:

– Produces a more complicated net– Requires more probabilities to be specified– Produces slight relationships that require

difficult and unnatural probability judgments – Fails to represent all the conditional

independence relationships

a lot of unnecessary probabilities

Page 30: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

30

Network Construction• Models:

– Diagnostic: symptoms to causes– Causal: causes to symptoms

Add "root causes" first, then the variables they influence

fewer and more comprehensible probabilities

Page 31: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

31

Dempster-Shafer Theory• Example: disease diagnostics

D = {Allergy, Flu, Cold, Pneumonia}

Page 32: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

32

Dempster-Shafer Theory• Example: disease diagnostics

D = {Allergy, Flu, Cold, Pneumonia}

• Basic probability assignment:m: 2D [0, 1]

m1: {Flu, Cold, Pneu} 0.6 D 0.4

Page 33: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

33

Dempster-Shafer Theory• Belief:

Bel: 2D [0, 1]Bel(S) = XS m(X)

Page 34: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

34

Dempster-Shafer Theory• Belief:

Bel: 2D [0, 1]Bel(S) = XS m(X)

• Plausibility:Pl(p) = 1 Bel(p)p: [Bel(p), Pl(p)]

Page 35: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

35

Dempster-Shafer Theory• Ignorance:

Bel(p) + Bel(p) 1

Page 36: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

36

Dempster-Shafer Theory• Combination rules: m1 and m2

XY=Zm1(X).m2(Y) m(Z) =

1 XY=m1(X).m2(Y)

Page 37: 1 Probability FOL fails for a domain due to: Laziness: too much to list the complete set of rules, too hard to use the enormous rules that result Theoretical

37

Exercises• In Russell’s AIMA: Chapters 13-14.