
Artificial Intelligence
Chapter 19: Reasoning with Uncertain Information

Biointelligence Lab
School of Computer Sci. & Eng.

Seoul National University

Outline

- Review of Probability Theory
- Probabilistic Inference
- Bayes Networks
- Patterns of Inference in Bayes Networks
- Uncertain Evidence
- D-Separation
- Probabilistic Inference in Polytrees

(C) 2000-2002 SNU CSE Biointelligence Lab

19.1 Review of Probability Theory (1/4)

- Random variables
- Joint probability

  Ex. (B (BAT_OK), M (MOVES), L (LIFTABLE), G (GAUGE))

  (B, M, L, G)                 Joint Probability
  (True, True, True, True)     0.5686
  (True, True, True, False)    0.0299
  (True, True, False, True)    0.0135
  (True, True, False, False)   0.0007
  ...                          ...

19.1 Review of Probability Theory (2/4)

- Marginal probability
- Conditional probability
  - Ex. The probability that the battery is charged given that the arm does not move.

19.1 Review of Probability Theory (3/4)

Figure 19.1 A Venn Diagram

19.1 Review of Probability Theory (4/4)

- Chain rule

  p(V1, V2, ..., Vk) = p(Vk | Vk-1, ..., V1) ... p(V2 | V1) p(V1)

- Bayes' rule

  p(Vi | Vj) = p(Vj | Vi) p(Vi) / p(Vj)

- Abbreviation: p(V) for p(V = true), p(¬V) for p(V = false)

19.2 Probabilistic Inference

- The probability that some variable Vi has value vi given the evidence E = e:  p(Vi = vi | E = e)

Ex.

  p(P, Q, R)      0.3
  p(P, Q, ¬R)     0.2
  p(P, ¬Q, R)     0.2
  p(P, ¬Q, ¬R)    0.1
  p(¬P, Q, R)     0.05
  p(¬P, Q, ¬R)    0.1
  p(¬P, ¬Q, R)    0.05
  p(¬P, ¬Q, ¬R)   0.0

  p(¬R) = p(P, Q, ¬R) + p(P, ¬Q, ¬R) + p(¬P, Q, ¬R) + p(¬P, ¬Q, ¬R) = 0.2 + 0.1 + 0.1 + 0.0 = 0.4

  p(Q|¬R) = [p(P, Q, ¬R) + p(¬P, Q, ¬R)] / p(¬R) = (0.2 + 0.1) / 0.4 = 0.75
  p(¬Q|¬R) = [p(P, ¬Q, ¬R) + p(¬P, ¬Q, ¬R)] / p(¬R) = (0.1 + 0.0) / 0.4 = 0.25
  p(Q|¬R) + p(¬Q|¬R) = 1
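The conditional probability above can be computed mechanically from the full joint distribution. A minimal sketch in Python (the `joint` dictionary layout and the `marginal` helper are mine, not from the slides):

```python
# Full joint distribution from the slide; ¬X is encoded as False.
joint = {
    (True,  True,  True):  0.3,
    (True,  True,  False): 0.2,
    (True,  False, True):  0.2,
    (True,  False, False): 0.1,
    (False, True,  True):  0.05,
    (False, True,  False): 0.1,
    (False, False, True):  0.05,
    (False, False, False): 0.0,
}

def marginal(joint, **fixed):
    """Sum the joint entries consistent with the fixed values of P, Q, R."""
    names = ('P', 'Q', 'R')
    return sum(p for vals, p in joint.items()
               if all(vals[names.index(k)] == v for k, v in fixed.items()))

p_not_R = marginal(joint, R=False)                            # 0.4
p_Q_given_not_R = marginal(joint, Q=True, R=False) / p_not_R  # 0.3 / 0.4 = 0.75
print(p_Q_given_not_R)
```

The same helper gives any marginal or conditional over P, Q, R by fixing different variables.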

Statistical Independence

- Conditional independence
- Mutually conditional independence
- Unconditional independence

19.3 Bayes Networks (1/2)

- Directed, acyclic graph (DAG) whose nodes are labeled by random variables.
- Characteristics of Bayesian networks
  - Node Vi is conditionally independent of any subset of nodes that are not descendants of Vi, given its parents.
- Prior probability
- Conditional probability table (CPT)

  p(V1, V2, ..., Vk) = ∏_{i=1}^{k} p(Vi | Pa(Vi))
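The factored form can be checked against the joint-probability table on the earlier 19.1 slide: multiplying CPT entries reproduces, for example, the 0.5686 entry. A sketch, assuming CPT values inferred from that table (the value of p(G|¬B) and the name `joint_p` are my assumptions):

```python
# CPTs for the lift-block network B -> M, L -> M, B -> G.
# p(B), p(L), p(M|B,L), p(G|B) are inferred from the slide's joint table;
# p(G|¬B) = 0.1 is an assumed value not derivable from the shown entries.
p_B = 0.95
p_L = 0.7
p_M = {(True, True): 0.9, (True, False): 0.05,
       (False, True): 0.0, (False, False): 0.0}   # p(M=true | B, L)
p_G = {True: 0.95, False: 0.1}                    # p(G=true | B)

def joint_p(b, m, l, g):
    """p(B, M, L, G) = p(B) p(L) p(M|B,L) p(G|B): the network factorization."""
    pb = p_B if b else 1 - p_B
    pl = p_L if l else 1 - p_L
    pm = p_M[(b, l)] if m else 1 - p_M[(b, l)]
    pg = p_G[b] if g else 1 - p_G[b]
    return pb * pl * pm * pg

print(round(joint_p(True, True, True, True), 4))   # 0.5686, matching the table
```

The product of four small CPTs replaces a 16-entry joint table; that compression is the point of the factorization.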

19.3 Bayes Networks (2/2)

19.4 Patterns of Inference in Bayes Networks (1/3)

- Causal or top-down inference
  - Ex. The probability that the arm moves given that the block is liftable

  p(M|L) = p(M, B|L) + p(M, ¬B|L)
         = p(M|B, L) p(B|L) + p(M|¬B, L) p(¬B|L)
         = p(M|B, L) p(B) + p(M|¬B, L) p(¬B)
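A numeric sketch of this causal computation, using CPT values consistent with the earlier joint table (p(M|¬B, L) = 0 is an assumption for the broken-battery case):

```python
# Causal (top-down) inference: p(M|L) = p(M|B,L) p(B) + p(M|¬B,L) p(¬B).
p_B = 0.95             # prior, inferred from the joint table
p_M_given_B_L = 0.9    # CPT entry, inferred from the joint table
p_M_given_nB_L = 0.0   # assumption: a dead battery means the arm cannot move

p_M_given_L = p_M_given_B_L * p_B + p_M_given_nB_L * (1 - p_B)
print(p_M_given_L)  # 0.855
```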

19.4 Patterns of Inference in Bayes Networks (2/3)

- Diagnostic or bottom-up inference
  - Using an effect (or symptom) to infer a cause
  - Ex. The probability that the block is not liftable given that the arm does not move.

  p(¬M|¬L) = 0.9525  (using causal reasoning)

  p(¬L|¬M) = p(¬M|¬L) p(¬L) / p(¬M) = 0.9525 × 0.3 / p(¬M) = 0.28575 / p(¬M)   (Bayes' rule)
  p(L|¬M) = p(¬M|L) p(L) / p(¬M) = 0.03665 / p(¬M)   (with p(L) = 0.7)
  Since p(¬L|¬M) + p(L|¬M) = 1:
  p(¬L|¬M) = 0.28575 / (0.28575 + 0.03665) = 0.88632
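The normalization step of the diagnostic calculation can be sketched as follows, taking the two unnormalized products quoted on the slide (0.28575 and 0.03665) as given:

```python
# Diagnostic inference via Bayes' rule: p(¬M) cancels when we normalize
# the two unnormalized posteriors p(¬M|¬L)p(¬L) and p(¬M|L)p(L).
num_notL = 0.9525 * 0.3   # p(¬M|¬L) p(¬L) = 0.28575
num_L = 0.03665           # p(¬M|L) p(L), as quoted on the slide

p_notL_given_notM = num_notL / (num_notL + num_L)
print(round(p_notL_given_notM, 5))  # 0.88632
```

Normalizing this way avoids ever computing p(¬M) explicitly.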

19.4 Patterns of Inference in Bayes Networks (3/3)

- Explaining away
  - ¬B explains ¬M, making ¬L less certain

  p(¬L|¬B, ¬M) = p(¬M, ¬B|¬L) p(¬L) / p(¬B, ¬M)              (Bayes' rule)
               = p(¬M|¬B, ¬L) p(¬B|¬L) p(¬L) / p(¬B, ¬M)     (def. of conditional prob.)
               = p(¬M|¬B, ¬L) p(¬B) p(¬L) / p(¬B, ¬M)        (structure of the Bayes network)
               = 0.030 << 0.88632

19.5 Uncertain Evidence

- We must be certain about the truth or falsity of the propositions that evidence nodes represent.
  - Each uncertain evidence node should have a child node about which we can be certain.
  - Ex. Suppose the robot is not certain that its arm did not move.
    - Introduce M': "The arm sensor says that the arm moved."
      - We can be certain that this proposition is either true or false.
    - Use p(¬L|¬B, ¬M') instead of p(¬L|¬B, ¬M).
  - Ex. Suppose we are uncertain about whether or not the battery is charged.
    - Introduce G: "Battery gauge."
    - Use p(¬L|¬G, ¬M') instead of p(¬L|¬B, ¬M').

19.6 D-Separation (1/3)

- e d-separates Vi and Vj if for every undirected path in the Bayes network between Vi and Vj, there is some node Vb on the path having one of the following three properties:
  - Vb is in e, and both arcs on the path lead out of Vb.
  - Vb is in e, and one arc on the path leads in to Vb and one arc leads out.
  - Neither Vb nor any descendant of Vb is in e, and both arcs on the path lead in to Vb.
- Vb blocks the path given e when any one of these conditions holds for a path.
- Two nodes Vi and Vj are conditionally independent given a set of nodes e if e d-separates them.
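The three blocking rules can be sketched as a path-enumeration check. This is a simplified illustration for small networks (the graph encoding and function names are mine), not an efficient algorithm:

```python
def descendants(g, v):
    """All nodes reachable from v along directed arcs."""
    seen, stack = set(), [v]
    while stack:
        for child in g.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def undirected_paths(g, x, y):
    """All simple x-y paths, ignoring arc direction."""
    nbrs = {}
    for parent, children in g.items():
        for child in children:
            nbrs.setdefault(parent, set()).add(child)
            nbrs.setdefault(child, set()).add(parent)
    def walk(node, path):
        if node == y:
            yield path
        else:
            for n in nbrs.get(node, ()):
                if n not in path:
                    yield from walk(n, path + [n])
    yield from walk(x, [x])

def d_separated(g, x, y, e):
    """True iff every undirected x-y path contains a blocking node: a
    non-collider in e (rules 1 and 2), or a collider that is not in e
    and has no descendant in e (rule 3)."""
    for path in undirected_paths(g, x, y):
        blocked = False
        for i in range(1, len(path) - 1):
            a, b, c = path[i - 1], path[i], path[i + 1]
            collider = b in g.get(a, []) and b in g.get(c, [])  # a -> b <- c
            if collider:
                if b not in e and not (descendants(g, b) & set(e)):
                    blocked = True
                    break
            elif b in e:  # chain or fork through an evidence node
                blocked = True
                break
        if not blocked:
            return False
    return True

# The lift-block network: B -> M, L -> M, B -> G
g = {'B': ['M', 'G'], 'L': ['M']}
print(d_separated(g, 'G', 'L', {'B'}))   # I(G, L | B): True
print(d_separated(g, 'B', 'L', set()))   # I(B, L): True
print(d_separated(g, 'G', 'L', {'M'}))   # conditioning on the collider M: False
```

The last call shows the collider effect: observing M opens the G-B-M-L path that was blocked by rule 3.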

19.6 D-Separation (2/3)

Figure 19.3 Conditional Independence via Blocking Nodes

19.6 D-Separation (3/3)

- Ex.
  - I(G, L|B) by rules 1 and 3
    - By rule 1, B blocks the (only) path between G and L, given B.
    - By rule 3, M also blocks this path given B.
  - I(G, L)
    - By rule 3, M blocks the path between G and L.
  - I(B, L)
    - By rule 3, M blocks the path between B and L.
- Even using d-separation, probabilistic inference in Bayes networks is, in general, NP-hard.

19.7 Probabilistic Inference in Polytrees (1/2)

- Polytree
  - A DAG for which there is just one path, along arcs in either direction, between any two nodes in the DAG.

19.7 Probabilistic Inference in Polytrees (2/2)

- A node is above Q
  - The node is connected to Q only through Q's parents.
- A node is below Q
  - The node is connected to Q only through Q's immediate successors.
- Three types of evidence:
  - All evidence nodes are above Q.
  - All evidence nodes are below Q.
  - There are evidence nodes both above and below Q.

Evidence Above (1/2)

- Bottom-up recursive algorithm
- Ex. p(Q|P5, P4)

  p(Q|P5, P4) = Σ_{P6,P7} p(Q, P6, P7|P5, P4)
              = Σ_{P6,P7} p(Q|P6, P7, P5, P4) p(P6, P7|P5, P4)
              = Σ_{P6,P7} p(Q|P6, P7) p(P6, P7|P5, P4)            (structure of the Bayes network)
              = Σ_{P6,P7} p(Q|P6, P7) p(P6|P5, P4) p(P7|P5, P4)   (d-separation)
              = Σ_{P6,P7} p(Q|P6, P7) p(P6|P5) p(P7|P4)           (d-separation)
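Once the component distributions are known, the final summation is a small weighted sum over the parents' values. A sketch with made-up CPT numbers (every value below is an assumption, chosen only to illustrate the formula):

```python
from itertools import product

# Hypothetical CPTs for illustration only (not from the slides):
p_Q_given = {(True, True): 0.9, (True, False): 0.6,
             (False, True): 0.3, (False, False): 0.1}  # p(Q=true | P6, P7)
p_P6_given_P5 = 0.8   # p(P6=true | P5=true), already computed recursively
p_P7_given_P4 = 0.4   # p(P7=true | P4=true), already computed recursively

# p(Q|P5,P4) = sum over P6,P7 of p(Q|P6,P7) p(P6|P5) p(P7|P4)
total = 0.0
for p6, p7 in product([True, False], repeat=2):
    w6 = p_P6_given_P5 if p6 else 1 - p_P6_given_P5
    w7 = p_P7_given_P4 if p7 else 1 - p_P7_given_P4
    total += p_Q_given[(p6, p7)] * w6 * w7
print(total)  # p(Q | P5, P4)
```

The recursion bottoms out at evidence nodes and priors; each node combines its parents' messages exactly as in this loop.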

Evidence Above (2/2)

- Calculating p(P7|P4) and p(P6|P5)

  p(P7|P4) = Σ_{P3} p(P7|P3, P4) p(P3|P4) = Σ_{P3} p(P7|P3, P4) p(P3)
  p(P6|P5) = Σ_{P1,P2} p(P6|P1, P2) p(P1|P5) p(P2)

- Calculating p(P1|P5)
  - Evidence is "below"
  - Here, we use Bayes' rule

  p(P1|P5) = p(P5|P1) p(P1) / p(P5)

Evidence Below (1/2)

- Top-down recursive algorithm

  p(Q|P12, P13, P14, P11) = p(P12, P13, P14, P11|Q) p(Q) / p(P12, P13, P14, P11)
                          = k p(P12, P13, P14, P11|Q) p(Q)
                          = k p(P12, P13|Q) p(P14, P11|Q) p(Q)

  p(P12, P13|Q) = Σ_{P9} p(P12, P13|P9, Q) p(P9|Q)
                = Σ_{P9} p(P12, P13|P9) p(P9|Q)

  p(P9|Q) = Σ_{P8} p(P9|P8, Q) p(P8)
  p(P12, P13|P9) = p(P12|P9) p(P13|P9)

Evidence Below (2/2)

  p(P14, P11|Q) = Σ_{P10} p(P14, P11|P10) p(P10|Q)
                = Σ_{P10} p(P14|P10) p(P11|P10) p(P10|Q)

  p(P11|P10) = Σ_{P15} p(P11|P15, P10) p(P15|P10)
  p(P15|P10) = Σ_{P11} p(P15|P10, P11) p(P11)
  p(P11|P15, P10) = p(P15, P10|P11) p(P11) / p(P15, P10)            (Bayes' rule)
  p(P15, P10) = k⁻¹ = Σ_{P11} p(P15, P10|P11) p(P11)
  p(P15, P10|P11) = p(P15|P10, P11) p(P10|P11) = p(P15|P10, P11) p(P10)

Evidence Above and Below

  p(Q|e) = p(Q|e⁻, e⁺)
         = p(e⁻|Q, e⁺) p(Q|e⁺) / p(e⁻|e⁺)
         = k p(e⁻|Q) p(Q|e⁺)
         = k₂ p(Q|e⁻) p(Q|e⁺) / p(Q)

  Ex. With e⁺ = {P5, P4} and e⁻ = {P12, P13, P14, P11}:  p(Q | {P5, P4}, {P12, P13, P14, P11})

A Numerical Example (1/2)

  p(Q|U) = k p(U|Q) p(Q)

  p(P|Q) = Σ_R p(P|R, Q) p(R)
         = p(P|R, Q) p(R) + p(P|¬R, Q) p(¬R)
         = 0.95 × 0.01 + 0.8 × 0.99 = 0.80

  p(P|¬Q) = Σ_R p(P|R, ¬Q) p(R)
          = p(P|R, ¬Q) p(R) + p(P|¬R, ¬Q) p(¬R)
          = 0.90 × 0.01 + 0.01 × 0.99 = 0.019

A Numerical Example (2/2)

  p(U|Q) = p(U|P) × 0.8 + p(U|¬P) × 0.2 = 0.7 × 0.8 + 0.2 × 0.2 = 0.60
  p(U|¬Q) = p(U|P) × 0.019 + p(U|¬P) × 0.98 = 0.7 × 0.019 + 0.2 × 0.98 = 0.21

  p(Q|U) = k × 0.60 × 0.05 = 0.03k
  p(¬Q|U) = k × 0.21 × 0.95 = 0.20k
  k = 1/(0.03 + 0.20) = 4.35
  ∴ p(Q|U) = 4.35 × 0.03 = 0.13

- Other techniques
  - Bucket elimination
  - Monte Carlo method
  - Clustering
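The whole numerical example can be reproduced in a few lines. The variable names are mine; the probability values are the slide's:

```python
# CPT entries as given on the slides: p(Q)=0.05, p(R)=0.01,
# p(P|R,Q)=0.95, p(P|¬R,Q)=0.8, p(P|R,¬Q)=0.90, p(P|¬R,¬Q)=0.01,
# p(U|P)=0.7, p(U|¬P)=0.2.
p_Q, p_R = 0.05, 0.01

p_P_given_Q  = 0.95 * p_R + 0.8  * (1 - p_R)   # ≈ 0.80
p_P_given_nQ = 0.90 * p_R + 0.01 * (1 - p_R)   # ≈ 0.019

p_U_given_Q  = 0.7 * p_P_given_Q  + 0.2 * (1 - p_P_given_Q)    # ≈ 0.60
p_U_given_nQ = 0.7 * p_P_given_nQ + 0.2 * (1 - p_P_given_nQ)   # ≈ 0.21

# Bayes' rule with normalization: p(Q|U) = k p(U|Q) p(Q)
num_Q  = p_U_given_Q  * p_Q
num_nQ = p_U_given_nQ * (1 - p_Q)
p_Q_given_U = num_Q / (num_Q + num_nQ)
print(round(p_Q_given_U, 2))  # 0.13
```

Observing the symptom U raises the posterior on Q from the 0.05 prior to about 0.13.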

Additional Readings (1/5)

- [Feller 1968]
  - Probability theory
- [Goldszmidt, Morris & Pearl 1990]
  - Non-monotonic inference through probabilistic methods
- [Pearl 1982a; Kim & Pearl 1983]
  - Message-passing algorithm
- [Russell & Norvig 1995, pp. 447ff]
  - Polytree methods

Additional Readings (2/5)

- [Shachter & Kenley 1989]
  - Bayesian networks for continuous random variables
- [Wellman 1990]
  - Qualitative networks
- [Neapolitan 1990]
  - Probabilistic methods in expert systems
- [Henrion 1990]
  - Probabilistic inference in Bayesian networks

Additional Readings (3/5)

- [Jensen 1996]
  - Bayesian networks: the HUGIN system
- [Neal 1991]
  - Relationships between Bayesian networks and neural networks
- [Heckerman 1991; Heckerman & Nathwani 1992]
  - PATHFINDER
- [Pradhan, et al. 1994]
  - CPCS-BN

Additional Readings (4/5)

- [Shortliffe 1976; Buchanan & Shortliffe 1984]
  - MYCIN: uses certainty factors
- [Duda, Hart & Nilsson 1987]
  - PROSPECTOR: uses sufficiency index and necessity index
- [Zadeh 1975; Zadeh 1978; Elkan 1993]
  - Fuzzy logic and possibility theory
- [Dempster 1968; Shafer 1979]
  - Dempster-Shafer combination rules

Additional Readings (5/5)

- [Nilsson 1986]
  - Probabilistic logic
- [Tversky & Kahneman 1982]
  - Humans generally lose consistency when facing uncertainty
- [Shafer & Pearl 1990]
  - Papers on uncertain inference
- Proceedings & journals
  - Uncertainty in Artificial Intelligence (UAI)
  - International Journal of Approximate Reasoning