
Perfect recall:

• Every decision node observes all earlier decision nodes and their parents (along a “temporal” order)

• Sum-max-sum rule (dynamic programming):

• Perfect recall is unrealistic in practice: memory is limited, and many systems are decentralized
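The sum-max-sum rule can be illustrated on a single-decision diagram. The sketch below assumes a Weather, Forecast, Activity, Happiness structure in which the forecast depends on the weather, the activity is chosen after seeing only the forecast, and happiness depends on weather and activity. All probabilities and utilities are hypothetical values chosen for illustration, not numbers from the poster.

```python
# Hypothetical model parameters (assumptions for illustration only).
P_WEATHER = {"sun": 0.7, "rain": 0.3}
P_FORECAST = {  # p(forecast | weather)
    "sun": {"good": 0.8, "bad": 0.2},
    "rain": {"good": 0.1, "bad": 0.9},
}
UTILITY = {  # happiness(weather, activity)
    ("sun", "outdoor"): 10, ("sun", "indoor"): 4,
    ("rain", "outdoor"): 0, ("rain", "indoor"): 6,
}
FORECASTS = ("good", "bad")
ACTIVITIES = ("outdoor", "indoor")


def meu_perfect_recall():
    """Return (MEU, policy) via sum over forecast, max over activity, sum over weather."""
    policy = {}
    total = 0.0
    for f in FORECASTS:  # outer sum: the observed forecast
        def score(a, f=f):
            # inner sum: the unobserved weather
            return sum(
                P_WEATHER[w] * P_FORECAST[w][f] * UTILITY[(w, a)]
                for w in P_WEATHER
            )
        best = max(ACTIVITIES, key=score)  # max: the decision, given the forecast
        policy[f] = best
        total += score(best)
    return total, policy


if __name__ == "__main__":
    meu, policy = meu_perfect_recall()
    print(meu, policy)
```

With several decisions under perfect recall, the same pattern nests: each max sits inside the sums over everything observed before that decision and outside the sums over everything still hidden.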

Variational methods:

• Log-partition function duality:

• Junction graph BP: approximating the marginal polytope and the entropy
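For reference, the duality named in the first bullet is usually written as follows. This is a sketch in assumed notation, not the poster's own symbols: θ collects the log-factors, τ is a vector of marginals, 𝕄 is the marginal polytope, and H is the entropy. The junction graph (Bethe-Kikuchi) approximation replaces 𝕄 with a locally consistent polytope 𝕃 and H with cluster entropies minus separator entropies, where c_k are clusters and s_kl are separators.

```latex
\log Z(\theta)
  \;=\; \max_{\tau \in \mathbb{M}} \big\{ \langle \theta, \tau \rangle + H(x; \tau) \big\}
  \;\approx\; \max_{\tau \in \mathbb{L}} \Big\{ \langle \theta, \tau \rangle
      + \sum_{k} H(x_{c_k}; \tau) - \sum_{(kl)} H(x_{s_{kl}}; \tau) \Big\}
```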

Belief Propagation for Structured Decision Making

Qiang Liu, Alexander Ihler

Department of Computer Science, University of California, Irvine

Abstract

Variational inference methods such as loopy BP have revolutionized inference abilities on graphical models.

Influence diagrams (or decision networks) are extensions of graphical models for representing structured decision-making problems.

Our contribution:

• A general variational framework for solving influence diagrams (IDs)
• A junction graph belief propagation algorithm for IDs with an intuitive interpretation and strong theoretical guarantees
• A convergent double-loop algorithm
• Significant empirical improvement over the baseline algorithm

Variational Framework for structured decision

Influence Diagram

Graphical Models and Variational Methods

Graphical models:

• Factors & exponential family form

• Graphical representations: Bayes nets, Markov random fields …

Inference: answering queries about graphical models

Our Algorithms

Experiments

Junction graph belief propagation for MEU:

• Construct junction graph over the augmented distribution

Main result:

• Intuition: the last term encourages policies to be deterministic
• Perfect recall → convex optimization (easier)
• Imperfect recall → non-convex optimization (harder)
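A sketch of the main result in assumed notation, consistent with the bullets above: the usual log-partition duality with one extra term that penalizes the conditional entropy of each decision given its parents. Here θ is assumed to be the log of the augmented distribution, D the set of decision nodes, pa(i) the parents of node i, and 𝕄 the marginal polytope; these symbols are assumptions for illustration.

```latex
\log \mathrm{MEU}(\theta)
  \;=\; \max_{\tau \in \mathbb{M}} \Big\{ \langle \theta, \tau \rangle + H(x; \tau)
      - \sum_{i \in D} H\big(x_i \mid x_{\mathrm{pa}(i)}; \tau\big) \Big\}
```

Under this reading, the subtracted conditional entropies are the "last term": they are smallest (zero) when each decision is a deterministic function of its parents, which is why the maximizer favors deterministic policies.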

Bethe-Kikuchi approximation: locally consistent polytope

[Figure: loopy junction graph, with cluster labels such as abc and bcd]

Influence diagram:

• Chance nodes (C):

Augmented distribution:

Maximum expected utility (MEU):

Imperfect recall:

• No closed form solution
• Dominant algorithm: single policy updating (SPU), with policy-by-policy optimality

If the maximum is attained, the optimal strategy can be read off from the maximizer. The last term causes policies to be deterministic.

Significance:

• Enables converting arbitrary variational methods to MEU algorithms
• “Integrates” the policy evaluation and policy improvement steps (avoiding expensive inner loops)

[Figure: an influence diagram, its augmented distribution (factor graph), and a junction graph with clusters c1c4d1, c1c2d2, c2c3d3, and c4d2d3]

• For each decision node d, identify a unique cluster (called a decision cluster) that includes d and its parents

[Figure legend: decision cluster of d1; normal cluster]

• Message passing algorithm:

Sum-messages (from normal clusters):

MEU-messages (from decision clusters):

Optimal policies:

• Strong local optimality: provably better than SPU

Convergent algorithm by proximal point method:

• Iteratively optimize a smoothed objective

Diagnostic network (UAI08 inference challenge):

e.g., calculating (log) partition function:

Decentralized Sensor network:

Conditional probability:

Decision rule:

Global utility function:

Local utility function:

• Decision nodes (D):

• Utility nodes (U):

[Figure panel titles: Perfect recall, Imperfect recall; Additive, Multiplicative]

Toy example:

d1   d2   utility
+1   +1   2
-1   -1   1
+1   -1   0
-1   +1   0
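One reading of the toy example is that it shows how policy-by-policy optimality can fall short of global optimality. The sketch below assumes two parentless decision nodes d1 and d2 with the utility table from the toy example, and uses a simple coordinate-wise update as a stand-in for single policy updating (SPU). The function names are hypothetical.

```python
from itertools import product

# Utility table from the toy example: (d1, d2) -> utility.
UTILITY = {(+1, +1): 2, (-1, -1): 1, (+1, -1): 0, (-1, +1): 0}
ACTIONS = (+1, -1)


def single_policy_update(start, max_sweeps=10):
    """Improve one decision at a time while holding the other fixed."""
    d = list(start)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(d)):
            def value(a, i=i):
                trial = d.copy()
                trial[i] = a
                return UTILITY[tuple(trial)]
            best = max(ACTIONS, key=value)
            if value(best) > value(d[i]):  # move only on strict improvement
                d[i] = best
                changed = True
        if not changed:
            break
    return tuple(d), UTILITY[tuple(d)]


def brute_force():
    """Search all joint decisions for the global optimum."""
    best = max(product(ACTIONS, repeat=2), key=UTILITY.__getitem__)
    return best, UTILITY[best]


if __name__ == "__main__":
    print(single_policy_update((-1, -1)))  # stuck at a policy-by-policy optimum
    print(brute_force())                   # global optimum
```

Starting from (-1, -1), changing either decision alone lowers the utility from 1 to 0, so the coordinate-wise update stops there even though (+1, +1) has utility 2.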

[Figure: example influence diagram with nodes Weather, Forecast, Activity, and Happiness]
