December 2007
Factor Graphs and Message Passing Algorithms
— Part 1: Introduction
Hans-Andrea Loeliger
The Two Basic Problems
1. Marginalization: Compute
$$f_k(x_k) \triangleq \sum_{\substack{x_1, \ldots, x_n \\ \text{except } x_k}} f(x_1, \ldots, x_n)$$
2. Maximization: Compute the “max-marginal”
$$f_k(x_k) \triangleq \max_{\substack{x_1, \ldots, x_n \\ \text{except } x_k}} f(x_1, \ldots, x_n)$$
assuming that f is real-valued and nonnegative and has a maximum. Note that, when the maximizer is unique,
$$\operatorname{argmax} f(x_1, \ldots, x_n) = \bigl(\operatorname{argmax} f_1(x_1), \ldots, \operatorname{argmax} f_n(x_n)\bigr),$$
where f_1, . . . , f_n on the right-hand side are the max-marginals.
For large n, both problems are in general intractable (even for x1, . . . , xn ∈ {0, 1}).
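To make the two problems concrete, here is a minimal brute-force sketch in Python. The function `f` and the helper names `marginal` and `max_marginal` are hypothetical and only serve as an illustration; the loops visit all |alphabet|^n configurations, which is exactly the cost that becomes intractable for large n.

```python
from itertools import product

def marginal(f, n, k, alphabet=(0, 1)):
    """Sum f over all variables except x_k (k is a 0-based index)."""
    out = {a: 0.0 for a in alphabet}
    for x in product(alphabet, repeat=n):
        out[x[k]] += f(*x)
    return out

def max_marginal(f, n, k, alphabet=(0, 1)):
    """Maximize f over all variables except x_k (k is a 0-based index)."""
    out = {a: float("-inf") for a in alphabet}
    for x in product(alphabet, repeat=n):
        out[x[k]] = max(out[x[k]], f(*x))
    return out

def f(x1, x2, x3):
    """Hypothetical nonnegative function of three binary variables."""
    return (1 + x1) * (1 + x1 * x2) * (2 - x3 + x2)

m = marginal(f, 3, 0)       # marginal of the first variable
mm = max_marginal(f, 3, 0)  # max-marginal of the first variable
```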
Factorization Helps
For example, if f(x_1, . . . , x_n) = f_1(x_1) f_2(x_2) · · · f_n(x_n), then
$$f_k(x_k) = \Bigl(\sum_{x_1} f_1(x_1)\Bigr) \cdots \Bigl(\sum_{x_{k-1}} f_{k-1}(x_{k-1})\Bigr)\, f_k(x_k)\, \Bigl(\sum_{x_{k+1}} f_{k+1}(x_{k+1})\Bigr) \cdots \Bigl(\sum_{x_n} f_n(x_n)\Bigr)$$
and
$$f_k(x_k) = \Bigl(\max_{x_1} f_1(x_1)\Bigr) \cdots \Bigl(\max_{x_{k-1}} f_{k-1}(x_{k-1})\Bigr)\, f_k(x_k)\, \Bigl(\max_{x_{k+1}} f_{k+1}(x_{k+1})\Bigr) \cdots \Bigl(\max_{x_n} f_n(x_n)\Bigr).$$
(On the left-hand sides, f_k denotes the marginal and the max-marginal, respectively; on the right-hand sides, it denotes the k-th factor.)
Factorization helps also beyond this trivial example. → Factor graphs and message passing algorithms.
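The saving for a fully factored function can be checked numerically. The sketch below assumes hypothetical factor tables over a three-letter alphabet; `marginal_factored` needs one sum per factor, whereas `marginal_brute_force` visits every configuration.

```python
from itertools import product
from math import prod, isclose

# Hypothetical local factors f_1..f_4 over the alphabet {0, 1, 2}, stored as lists.
factors = [
    [0.5, 1.0, 2.0],
    [1.0, 3.0, 0.5],
    [2.0, 0.1, 0.7],
    [0.3, 0.3, 1.5],
]

def marginal_factored(factors, k):
    """Marginal of x_k using one sum per factor: O(n * |alphabet|) work."""
    scale = prod(sum(f) for i, f in enumerate(factors) if i != k)
    return [fk * scale for fk in factors[k]]

def marginal_brute_force(factors, k):
    """Same marginal by summing over all |alphabet|**n configurations."""
    size = len(factors[k])
    out = [0.0] * size
    for x in product(range(size), repeat=len(factors)):
        out[x[k]] += prod(f[xi] for f, xi in zip(factors, x))
    return out

fast = marginal_factored(factors, 1)
slow = marginal_brute_force(factors, 1)
```

Both lists agree up to rounding, which is the distributive law at work.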
Roots
Statistical physics:
- Markov random fields (Ising 1925?)
Signal processing:
- Linear state-space models and Kalman filtering: Kalman 1960 . . .
- Recursive least-squares adaptive filters
- Hidden Markov models and forward-backward algorithm: Baum et al. 1966 . . .
Error correcting codes:
- Low-density parity check codes: Gallager 1962; Tanner 1981; MacKay 1996; Luby et al. 1998 . . .
- Convolutional codes and Viterbi decoding: Forney 1973 . . .
- Turbo codes: Berrou et al. 1993 . . .
Machine learning, statistics:
- Bayesian networks: Pearl 1988; Shachter 1988; Lauritzen and Spiegelhalter 1988; Shafer and Shenoy 1990 . . .
Outline of this talk
1. Factor graphs
2. The sum-product and max-product algorithms
3. On factor graphs with cycles
4. Factor graphs and error correcting codes
Factor Graphs
A factor graph represents the factorization of a function of several variables. We use Forney-style factor graphs (Forney, 2001).
Example:
$$f(x_1, x_2, x_3, x_4, x_5) = f_A(x_1, x_2, x_3) \cdot f_B(x_3, x_4, x_5) \cdot f_C(x_4).$$
[Figure: Forney-style factor graph with nodes fA, fB, fC; half-edges x1 and x2 at fA, edge x3 between fA and fB, edge x4 between fB and fC, half-edge x5 at fB.]
Rules:
• A node for every factor.
• An edge or half-edge for every variable.
• Node g is connected to edge x iff variable x appears in factor g.
(What if some variable appears in more than 2 factors?)
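The rules can be sketched as a small data structure. The dictionary layout and the helper `edges_of` below are assumptions made for illustration, not part of the slides. A variable attached to one node is a half-edge, a variable attached to two nodes is an ordinary edge, and the list `too_many` collects the case raised in the question above, which is empty for this example.

```python
# Each factor (node) lists the variables (edges) that appear in it.
factors = {
    "fA": ["x1", "x2", "x3"],
    "fB": ["x3", "x4", "x5"],
    "fC": ["x4"],
}

def edges_of(factors):
    """Map each variable to the list of factor nodes it is attached to."""
    edges = {}
    for node, variables in factors.items():
        for v in variables:
            edges.setdefault(v, []).append(node)
    return edges

edges = edges_of(factors)
half_edges = sorted(v for v, nodes in edges.items() if len(nodes) == 1)
full_edges = sorted(v for v, nodes in edges.items() if len(nodes) == 2)
too_many = sorted(v for v, nodes in edges.items() if len(nodes) > 2)
```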
A main application of factor graphs is stochastic models. Example:
Markov Chain
$$p_{XYZ}(x, y, z) = p_X(x)\, p_{Y|X}(y|x)\, p_{Z|Y}(z|y).$$
[Figure: chain-shaped factor graph with nodes p_X, p_{Y|X}, p_{Z|Y}, edges X and Y, and half-edge Z.]
Other Notation Systems for Graphical Models
Example: p(u, w, x, y, z) = p(u) p(w) p(x|u, w) p(y|x) p(z|x).
[Figure: the same model over U, W, X, Y, Z drawn in four notations: Forney-style factor graph; original factor graph [FKLW 1997]; Bayesian network; Markov random field.]
Terminology
[Figure: factor graph with nodes fA, fB, fC and variables x1, . . . , x5.]
Local function = factor (such as fA, fB, fC).
Global function f = product of all local functions; usually (but not always!) real and nonnegative.
A configuration is an assignment of values to all variables.
The configuration space is the set of all configurations, which is the domain of the global function.
A configuration ω = (x1, . . . , x5) is valid iff f(ω) ≠ 0.
Invalid Configurations Do Not Affect Marginals
A configuration ω = (x1, . . . , xn) is valid iff f(ω) ≠ 0. An invalid configuration contributes zero to the sum below and, since f is nonnegative, cannot raise the maximum.
Recall:
1. Marginalization: Compute
$$f_k(x_k) \triangleq \sum_{\substack{x_1, \ldots, x_n \\ \text{except } x_k}} f(x_1, \ldots, x_n)$$
2. Maximization: Compute the “max-marginal”
$$f_k(x_k) \triangleq \max_{\substack{x_1, \ldots, x_n \\ \text{except } x_k}} f(x_1, \ldots, x_n)$$
assuming that f is real-valued and nonnegative and has a maximum.
Auxiliary Variables
Example: Let Y1 and Y2 be two independent observations of X:
$$p(x, y_1, y_2) = p(x)\, p(y_1|x)\, p(y_2|x).$$
[Figure: node p_X with edge X into an equality constraint node "=" (factor f_=); edge X′ to node p_{Y1|X} with half-edge Y1; edge X″ to node p_{Y2|X} with half-edge Y2.]
Literally, the factor graph represents an extended model
$$p(x, x', x'', y_1, y_2) = p(x)\, p(y_1|x')\, p(y_2|x'')\, f_=(x, x', x'')$$
where
$$f_=(x, x', x'') \triangleq \delta(x - x')\, \delta(x - x'')$$
enforces X = X′ = X″ for every valid configuration.
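For finite alphabets the Dirac deltas become Kronecker deltas, and one can check numerically that summing the extended model over the auxiliary variables recovers the original model. The probability tables below are hypothetical, chosen only for this sketch.

```python
from itertools import product
from math import isclose

X = (0, 1)  # alphabet of X, X', X''
Y = (0, 1)  # alphabet of Y1, Y2

# Hypothetical tables: p_x[x], p_y1[x][y1], p_y2[x][y2].
p_x = [0.3, 0.7]
p_y1 = [[0.9, 0.1], [0.2, 0.8]]
p_y2 = [[0.6, 0.4], [0.5, 0.5]]

def f_eq(x, x1, x2):
    """Equality constraint: discrete analogue of delta(x - x') delta(x - x'')."""
    return 1.0 if x == x1 == x2 else 0.0

def extended(x, x1, x2, y1, y2):
    """Extended model with auxiliary variables x1 (for X') and x2 (for X'')."""
    return p_x[x] * p_y1[x1][y1] * p_y2[x2][y2] * f_eq(x, x1, x2)

def original(x, y1, y2):
    """Original model p(x) p(y1|x) p(y2|x)."""
    return p_x[x] * p_y1[x][y1] * p_y2[x][y2]

# Summing out the auxiliary variables leaves the original model unchanged.
ok = all(
    isclose(sum(extended(x, a, b, y1, y2) for a, b in product(X, X)),
            original(x, y1, y2))
    for x, y1, y2 in product(X, Y, Y)
)
```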
Branching Points
Equality constraint nodes may be viewed as branching points:
[Figure: an "=" node joining edges X, X′, X″ is equivalent to a single edge X with a branching point.]
The factor
$$f_=(x, x', x'') \triangleq \delta(x - x')\, \delta(x - x'')$$
enforces X = X′ = X″ for every valid configuration.
Modularity, Special Symbols, Arrows
As a refinement of the previous example, let
Y1 = X + Z1 (1)
Y2 = X + Z2 (2)
with Z1 and Z2 independent of each other and of X:
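[Figure: node p_X with edge X into an "=" node that feeds two "+" nodes; each "+" node has an incoming noise edge Z1 or Z2 from node p_{Z1} or p_{Z2} and an outgoing half-edge Y1 or Y2. The labels p_{Y1|X} and p_{Y2|X} mark the two sub-models.]
The “+”-nodes represent the factors δ(x + z1 − y1) and δ(x + z2 − y2), which enforce (1) and (2) for every valid configuration.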
Known Variables vs. Unknown Variables
Known variables (observations, known parameters, . . . ) may be plugged into the corresponding factors.
Example: Y2 = y2 observed.
[Figure: factor graph with nodes p_X(·), p_{Y1|X}(·|·), and p_{Y2|X}(·|·); plugging in the known value y2 turns the last node into the factor p_{Y2|X}(y2|·).]
Known variables will be denoted by small letters; unknown variables will usually be denoted by capital letters.
From A Priori to A Posteriori Probability
Example (cont’d): Let Y1 = y1 and Y2 = y2 be two independent observations of X. For fixed y1 and y2, we have
$$p(x|y_1, y_2) = \frac{p(x, y_1, y_2)}{p(y_1, y_2)} \propto p(x, y_1, y_2).$$
[Figure: node p_X with edge X into an "=" node, and nodes p_{Y1|X} and p_{Y2|X} with the known values y1 and y2 plugged in.]
The factorization is unchanged (except for a scale factor).
Example:
Hidden Markov Model
$$p(x_0, x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n) = p(x_0) \prod_{k=1}^{n} p(x_k|x_{k-1})\, p(y_k|x_{k-1})$$
[Figure: chain-shaped factor graph: node p(x0) with edge X0 into an "=" node that branches to node p(y1|x0) with half-edge Y1 and continues to node p(x1|x0); then edge X1 into an "=" node with p(y2|x1) and Y2, node p(x2|x1), edge X2, . . .]
Example:
Hidden Markov Model with Parameter(s)
$$p(x_0, x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n \mid \theta) = p(x_0) \prod_{k=1}^{n} p(x_k|x_{k-1}, \theta)\, p(y_k|x_{k-1})$$
[Figure: the hidden Markov model factor graph with an additional edge Θ that is passed through "=" nodes to the nodes p(x1|x0, θ), p(x2|x1, θ), . . .]
A non-stochastic example:
Least-Squares Problems
Minimizing $\sum_{k=1}^{n} x_k^2$ subject to (linear or nonlinear) constraints is equivalent to maximizing
$$e^{-\sum_{k=1}^{n} x_k^2} = \prod_{k=1}^{n} e^{-x_k^2}$$
subject to the given constraints.
[Figure: a "constraints" node with edges X1, . . . , Xn, each terminated by a node e^{-x_k^2}.]
Here, the factor graph represents a nonnegative real-valued function that we wish to maximize.
Outline
2. The sum-product and max-product algorithms
Recall:
The Two Basic Problems
1. Marginalization: Compute
$$f_k(x_k) \triangleq \sum_{\substack{x_1, \ldots, x_n \\ \text{except } x_k}} f(x_1, \ldots, x_n)$$
2. Maximization: Compute the “max-marginal”
$$f_k(x_k) \triangleq \max_{\substack{x_1, \ldots, x_n \\ \text{except } x_k}} f(x_1, \ldots, x_n)$$
assuming that f is real-valued and nonnegative and has a maximum.
Message Passing Algorithms
operate by passing messages along the edges of a factor graph:
[Figure: a factor graph with messages passed in both directions along every edge.]
Towards the sum-product algorithm:
Computing Marginals—A Generic Example
Assume we wish to compute
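$$f_3(x_3) = \sum_{\substack{x_1, \ldots, x_7 \\ \text{except } x_3}} f(x_1, \ldots, x_7)$$
and assume that f can be factored as follows:
[Figure: cycle-free factor graph with nodes f1, . . . , f7 and edges X1, . . . , X7: f3 connects X1, X2, X3; f5 connects X3, X4, X5; f6 connects X5, X6, X7; the nodes f1, f2, f4, f7 terminate the edges X1, X2, X4, X7.]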
Example cont’d:
Closing Boxes by the Distributive Law
[Figure: the factor graph of the generic example with nodes f1, . . . , f7 and edges X1, . . . , X7.]
$$f_3(x_3) = \Bigl( \sum_{x_1, x_2} f_1(x_1)\, f_2(x_2)\, f_3(x_1, x_2, x_3) \Bigr) \cdot \Bigl( \sum_{x_4, x_5} f_4(x_4)\, f_5(x_3, x_4, x_5) \Bigl( \sum_{x_6, x_7} f_6(x_5, x_6, x_7)\, f_7(x_7) \Bigr) \Bigr)$$
Example cont’d: Message Passing View
[Figure: the factor graph of the generic example, with messages in both directions on edge X3 and a backward message on edge X5.]
$$f_3(x_3) = \Bigl( \underbrace{\sum_{x_1, x_2} f_1(x_1)\, f_2(x_2)\, f_3(x_1, x_2, x_3)}_{\overrightarrow{\mu}_{X_3}(x_3)} \Bigr) \cdot \Bigl( \underbrace{\sum_{x_4, x_5} f_4(x_4)\, f_5(x_3, x_4, x_5) \Bigl( \underbrace{\sum_{x_6, x_7} f_6(x_5, x_6, x_7)\, f_7(x_7)}_{\overleftarrow{\mu}_{X_5}(x_5)} \Bigr)}_{\overleftarrow{\mu}_{X_3}(x_3)} \Bigr)$$
Example cont’d: Messages Everywhere
[Figure: the factor graph of the generic example, with a message drawn on every edge X1, . . . , X7.]
With $\overrightarrow{\mu}_{X_1}(x_1) \triangleq f_1(x_1)$, $\overrightarrow{\mu}_{X_2}(x_2) \triangleq f_2(x_2)$, etc., we have
$$\overrightarrow{\mu}_{X_3}(x_3) = \sum_{x_1, x_2} \overrightarrow{\mu}_{X_1}(x_1)\, \overrightarrow{\mu}_{X_2}(x_2)\, f_3(x_1, x_2, x_3)$$
$$\overleftarrow{\mu}_{X_5}(x_5) = \sum_{x_6, x_7} \overrightarrow{\mu}_{X_7}(x_7)\, f_6(x_5, x_6, x_7)$$
$$\overleftarrow{\mu}_{X_3}(x_3) = \sum_{x_4, x_5} \overrightarrow{\mu}_{X_4}(x_4)\, \overleftarrow{\mu}_{X_5}(x_5)\, f_5(x_3, x_4, x_5)$$
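The three message equations can be checked against brute-force marginalization. The sketch below assumes binary variables and hypothetical local functions f1, . . . , f7 with the connectivity of the generic example; the names `fwd_x3`, `bwd_x5`, and `bwd_x3` are illustrative only.

```python
from itertools import product
from math import isclose

A = (0, 1)  # common alphabet of all variables, an assumption made for brevity

# Hypothetical nonnegative local functions.
def f1(x1):
    return 1.0 + x1

def f2(x2):
    return 2.0 - x2

def f3(x1, x2, x3):
    return 1.0 + x1 * x2 + 2 * x3

def f4(x4):
    return 0.5 + x4

def f5(x3, x4, x5):
    return 1.0 + x3 * x4 + x5

def f6(x5, x6, x7):
    return 1.0 + x5 + x6 * x7

def f7(x7):
    return 3.0 - 2 * x7

def f(x1, x2, x3, x4, x5, x6, x7):
    """Global function: product of all local functions."""
    return (f1(x1) * f2(x2) * f3(x1, x2, x3) * f4(x4)
            * f5(x3, x4, x5) * f6(x5, x6, x7) * f7(x7))

# Messages, one dictionary entry per value of the edge variable.
fwd_x3 = {x3: sum(f1(x1) * f2(x2) * f3(x1, x2, x3)
                  for x1, x2 in product(A, A)) for x3 in A}
bwd_x5 = {x5: sum(f7(x7) * f6(x5, x6, x7)
                  for x6, x7 in product(A, A)) for x5 in A}
bwd_x3 = {x3: sum(f4(x4) * bwd_x5[x5] * f5(x3, x4, x5)
                  for x4, x5 in product(A, A)) for x3 in A}

marginal = {x3: fwd_x3[x3] * bwd_x3[x3] for x3 in A}

# Brute-force check: sum over all 2**6 configurations of the other variables.
brute = {x3: sum(f(x1, x2, x3, x4, x5, x6, x7)
                 for x1, x2, x4, x5, x6, x7 in product(A, repeat=6))
         for x3 in A}
```

The message computation touches at most four terms per sum, while the brute-force check sums 64 terms for each value of x3.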
The Sum-Product Algorithm (Belief Propagation)
[Figure: a node g with incoming edges Y1, . . . , Yn and outgoing edge X; the incoming messages on Y1, . . . , Yn determine the outgoing message on X.]
Sum-product message computation rule:
$$\overrightarrow{\mu}_X(x) = \sum_{y_1, \ldots, y_n} g(x, y_1, \ldots, y_n)\, \overrightarrow{\mu}_{Y_1}(y_1) \cdots \overrightarrow{\mu}_{Y_n}(y_n)$$
Sum-product theorem:
If the factor graph for some global function f has no cycles, then
$$f_X(x) = \overrightarrow{\mu}_X(x)\, \overleftarrow{\mu}_X(x),$$
where f_X denotes the marginal of f with respect to X.
Arrows and Notation for Messages
[Figure: an edge X drawn with an arrow, carrying one message in the direction of the arrow and one in the opposite direction.]
For edges drawn with arrows:
$\overrightarrow{\mu}_X$ denotes the message in the direction of the arrow.
$\overleftarrow{\mu}_X$ denotes the message in the opposite direction.
Edges may be drawn with arrows just for the sake of this notation.
Sum-Product Algorithm: a Simple Example
[Figure: factor graph of Y1 = X + Z1 and Y2 = X + Z2: node p_X with edge X into an "=" node, edges X′ and X″ into two "+" nodes, noise edges Z1 and Z2 from nodes p_{Z1} and p_{Z2}, and known outputs y1 and y2.]
$$\overrightarrow{\mu}_X(x) = p_X(x), \qquad \overrightarrow{\mu}_{Z_1}(z_1) = p_{Z_1}(z_1), \qquad \overrightarrow{\mu}_{Z_2}(z_2) = p_{Z_2}(z_2)$$
Sum-Product Example cont’d
[Figure: factor graph of Y1 = X + Z1 and Y2 = X + Z2 with known outputs y1 and y2.]
$$\overleftarrow{\mu}_{X'}(x') = \int_{z_1} \overrightarrow{\mu}_{Z_1}(z_1)\, \delta(x' + z_1 - y_1)\, dz_1 = p_{Z_1}(y_1 - x')$$
$$\overleftarrow{\mu}_X(x) = \int_{x'} \int_{x''} \overleftarrow{\mu}_{X'}(x')\, \overleftarrow{\mu}_{X''}(x'')\, \delta(x - x')\, \delta(x - x'')\, dx'\, dx'' = p_{Z_1}(y_1 - x)\, p_{Z_2}(y_2 - x)$$
Sum-Product Example cont’d
[Figure: factor graph of Y1 = X + Z1 and Y2 = X + Z2 with known outputs y1 and y2.]
Marginal of the global function at X:
$$\overrightarrow{\mu}_X(x)\, \overleftarrow{\mu}_X(x) = p_X(x)\, \underbrace{p_{Z_1}(y_1 - x)\, p_{Z_2}(y_2 - x)}_{p(y_1, y_2 | x)} \propto p(x | y_1, y_2).$$
Messages for Finite-Alphabet Variables
may be represented by a list of function values.
Assume, for example, that X takes values in {+1, −1}:
[Figure: factor graph of Y1 = X + Z1 and Y2 = X + Z2 with known outputs y1 and y2.]
$$\overrightarrow{\mu}_X = \bigl(\overrightarrow{\mu}_X(+1),\, \overrightarrow{\mu}_X(-1)\bigr) = \bigl(p_X(+1),\, p_X(-1)\bigr)$$
$$\overleftarrow{\mu}_{X'} = \bigl(\overleftarrow{\mu}_{X'}(+1),\, \overleftarrow{\mu}_{X'}(-1)\bigr) = \bigl(p_{Z_1}(y_1 - 1),\, p_{Z_1}(y_1 + 1)\bigr)$$
etc.
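As a small worked instance of such message lists, the sketch below assumes zero-mean, unit-variance Gaussian noise for Z1 and Z2, a uniform prior, and made-up observations; none of these choices come from the slides. Every message is a two-element list ordered as (+1, −1).

```python
from math import exp, pi, sqrt, isclose

def gaussian_pdf(z, sigma=1.0):
    """Zero-mean Gaussian density; the noise model is an assumption of this sketch."""
    return exp(-z * z / (2 * sigma * sigma)) / (sigma * sqrt(2 * pi))

values = (+1, -1)      # alphabet of X, in the order used for all message lists
p_x = (0.5, 0.5)       # hypothetical prior (p_X(+1), p_X(-1))
y1, y2 = 0.8, -0.1     # hypothetical observations

mu_fwd_x = list(p_x)                                # message out of node p_X
mu_bwd_x1 = [gaussian_pdf(y1 - x) for x in values]  # (p_Z1(y1 - 1), p_Z1(y1 + 1))
mu_bwd_x2 = [gaussian_pdf(y2 - x) for x in values]  # (p_Z2(y2 - 1), p_Z2(y2 + 1))

# Message out of the "=" node towards X: pointwise product of the incoming messages.
mu_bwd_x = [a * b for a, b in zip(mu_bwd_x1, mu_bwd_x2)]

# Marginal at X, then normalization to obtain p(x | y1, y2).
unnormalized = [f * b for f, b in zip(mu_fwd_x, mu_bwd_x)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]
```

With these assumptions the log-ratio of the two posterior values is 2(y1 + y2) = 1.4, so the posterior favors X = +1.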
Applying the sum-product algorithm to
Hidden Markov Models
yields recursive algorithms for many things.
Recall the definition of a hidden Markov model (HMM):
$$p(x_0, x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n) = p(x_0) \prod_{k=1}^{n} p(x_k|x_{k-1})\, p(y_k|x_{k-1})$$
[Figure: HMM factor graph with nodes p(x0), p(y1|x0), p(x1|x0), p(y2|x1), p(x2|x1), p(y3|x2), . . ., edges X0, X1, X2, . . . through "=" nodes, and known outputs y1, y2, y3, . . .]
Assume that Y1 = y1, . . . , Yn = yn are observed (known).
Sum-product algorithm applied to HMM:
Estimation of Current State
$$\begin{aligned} p(x_n | y_1, \ldots, y_n) &= \frac{p(x_n, y_1, \ldots, y_n)}{p(y_1, \ldots, y_n)} \propto p(x_n, y_1, \ldots, y_n) \\ &= \sum_{x_0} \cdots \sum_{x_{n-1}} p(x_0, x_1, \ldots, x_n, y_1, y_2, \ldots, y_n) \\ &= \overrightarrow{\mu}_{X_n}(x_n). \end{aligned}$$
For n = 2:
[Figure: HMM factor graph with known outputs y1, y2 and forward messages up to edge X2; the backward messages from the unobserved part (Y3, . . .) are equal to 1.]
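The forward message can be computed recursively, one time step per update. The sketch below uses a hypothetical two-state HMM with binary outputs and follows the factorization of the slides, in which y_k depends on x_{k−1}; the brute-force sum is included only as a check.

```python
from itertools import product
from math import isclose

# Hypothetical two-state HMM with binary outputs.
p0 = [0.6, 0.4]                    # p(x0)
trans = [[0.7, 0.3], [0.2, 0.8]]   # trans[i][j] = p(x_k = j | x_{k-1} = i)
emit = [[0.9, 0.1], [0.3, 0.7]]    # emit[i][y] = p(y_k = y | x_{k-1} = i)
ys = [0, 1, 1]                     # observed y_1, ..., y_n

def forward(p0, trans, emit, ys):
    """Return the unnormalized forward message on edge X_n."""
    mu = list(p0)
    states = range(len(p0))
    for y in ys:
        # Multiply in the likelihood of y_k (message out of the "=" node) ...
        weighted = [mu[i] * emit[i][y] for i in states]
        # ... then sum out x_{k-1} through the transition factor.
        mu = [sum(weighted[i] * trans[i][j] for i in states) for j in states]
    return mu

def joint(xs, ys):
    """p(x_0, ..., x_n, y_1, ..., y_n) for the HMM factorization."""
    p = p0[xs[0]]
    for k, y in enumerate(ys, start=1):
        p *= trans[xs[k - 1]][xs[k]] * emit[xs[k - 1]][y]
    return p

mu_n = forward(p0, trans, emit, ys)
brute = [sum(joint(xs + (xn,), ys) for xs in product(range(2), repeat=len(ys)))
         for xn in range(2)]
posterior = [m / sum(mu_n) for m in mu_n]  # p(x_n | y_1, ..., y_n)
```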
Backward Message in Chain Rule Model
[Figure: node p_X, edge X with a backward message, node p_{Y|X}, half-edge Y.]
If Y = y is known (observed):
$$\overleftarrow{\mu}_X(x) = p_{Y|X}(y|x),$$
the likelihood function.
If Y is unknown:
$$\overleftarrow{\mu}_X(x) = \sum_{y} p_{Y|X}(y|x) = 1.$$
Sum-product algorithm applied to HMM:
Prediction of Next Output Symbol
$$\begin{aligned} p(y_{n+1} | y_1, \ldots, y_n) &= \frac{p(y_1, \ldots, y_{n+1})}{p(y_1, \ldots, y_n)} \propto p(y_1, \ldots, y_{n+1}) \\ &= \sum_{x_0, x_1, \ldots, x_n} p(x_0, x_1, \ldots, x_n, y_1, y_2, \ldots, y_n, y_{n+1}) \\ &= \overrightarrow{\mu}_{Y_{n+1}}(y_{n+1}). \end{aligned}$$
For n = 2:
[Figure: HMM factor graph with known outputs y1, y2, forward messages up to edge X2 and down the branch to the unknown output Y3; the backward messages from the unobserved part are equal to 1.]
Sum-product algorithm applied to HMM:
Estimation of Time-k State
$$\begin{aligned} p(x_k | y_1, y_2, \ldots, y_n) &= \frac{p(x_k, y_1, y_2, \ldots, y_n)}{p(y_1, y_2, \ldots, y_n)} \propto p(x_k, y_1, y_2, \ldots, y_n) \\ &= \sum_{\substack{x_0, \ldots, x_n \\ \text{except } x_k}} p(x_0, x_1, \ldots, x_n, y_1, y_2, \ldots, y_n) \\ &= \overrightarrow{\mu}_{X_k}(x_k)\, \overleftarrow{\mu}_{X_k}(x_k) \end{aligned}$$
For k = 1:
[Figure: HMM factor graph with known outputs y1, y2, y3, . . ., forward messages up to edge X1 and backward messages down to edge X1.]
Sum-product algorithm applied to HMM:
All States Simultaneously
p(xk|y1, . . . , yn) for all k:
[Figure: HMM factor graph with known outputs y1, y2, y3, . . . and forward and backward messages on every edge X0, X1, X2, . . .]
In this application, the sum-product algorithm coincides with the Baum-Welch / BCJR forward-backward algorithm.
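One forward sweep and one backward sweep suffice to obtain all state posteriors. The sketch below is a minimal illustration for a hypothetical two-state HMM, again with y_k depending on x_{k−1} as in the slides; `fwd[k]` and `bwd[k]` stand for the two messages on edge X_k.

```python
from math import isclose

# Hypothetical two-state HMM with binary outputs.
p0 = [0.6, 0.4]                    # p(x0)
trans = [[0.7, 0.3], [0.2, 0.8]]   # trans[i][j] = p(x_k = j | x_{k-1} = i)
emit = [[0.9, 0.1], [0.3, 0.7]]    # emit[i][y] = p(y_k = y | x_{k-1} = i)
ys = [0, 1, 1]                     # observed y_1, ..., y_n
S = range(len(p0))
n = len(ys)

# Forward sweep: fwd[k] is the message on edge X_k in the direction of time.
fwd = [list(p0)]
for y in ys:
    prev = fwd[-1]
    fwd.append([sum(prev[i] * emit[i][y] * trans[i][j] for i in S) for j in S])

# Backward sweep: bwd[k] is the message on edge X_k against the direction of time.
bwd = [[1.0 for _ in S] for _ in range(n + 1)]  # bwd[n] = 1: nothing observed after X_n
for k in range(n - 1, -1, -1):
    y = ys[k]                                   # this is y_{k+1}, which depends on x_k
    bwd[k] = [emit[i][y] * sum(trans[i][j] * bwd[k + 1][j] for j in S) for i in S]

def normalize(v):
    """Scale a message so that its entries sum to one."""
    total = sum(v)
    return [a / total for a in v]

# p(x_k | y_1, ..., y_n) for every k.
posteriors = [normalize([f * b for f, b in zip(fwd[k], bwd[k])]) for k in range(n + 1)]

# The unnormalized product sums to p(y_1, ..., y_n) on every edge.
likelihoods = [sum(f * b for f, b in zip(fwd[k], bwd[k])) for k in range(n + 1)]
```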
Scaling of Messages
In all the examples so far:
• The final result (such as $\overrightarrow{\mu}_{X_k}(x_k)\, \overleftarrow{\mu}_{X_k}(x_k)$) equals the desired quantity (such as $p(x_k | y_1, \ldots, y_n)$) only up to a scale factor.
• The missing scale factor γ may be recovered at the end from the condition
$$\sum_{x_k} \gamma\, \overrightarrow{\mu}_{X_k}(x_k)\, \overleftarrow{\mu}_{X_k}(x_k) = 1.$$
• It follows that messages may be scaled freely along the way.
• Such message scaling is often mandatory to avoid numerical problems.
Sum-product algorithm applied to HMM:
Probability of the Observation
$$p(y_1, \ldots, y_n) = \sum_{x_0} \cdots \sum_{x_n} p(x_0, x_1, \ldots, x_n, y_1, y_2, \ldots, y_n) = \sum_{x_n} \overrightarrow{\mu}_{X_n}(x_n).$$
This is a number. Scale factors cannot be neglected in this case.
For n = 2:
[Figure: HMM factor graph with known outputs y1, y2 and forward messages up to edge X2; the backward messages from the unobserved part (Y3, . . .) are equal to 1.]
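When messages are rescaled along the way, the removed scale factors have to be remembered in order to recover this number. One common way, sketched below for a hypothetical two-state HMM, is to accumulate their logarithms during the forward sweep; the unscaled version is kept only for comparison.

```python
from math import log, exp, isclose

# Hypothetical two-state HMM with binary outputs.
p0 = [0.6, 0.4]                    # p(x0)
trans = [[0.7, 0.3], [0.2, 0.8]]   # trans[i][j] = p(x_k = j | x_{k-1} = i)
emit = [[0.9, 0.1], [0.3, 0.7]]    # emit[i][y] = p(y_k = y | x_{k-1} = i)
ys = [0, 1, 1, 0, 1]               # observed y_1, ..., y_n
S = range(len(p0))

def log_likelihood(p0, trans, emit, ys):
    """log p(y_1..y_n) from a forward sweep with messages rescaled to sum to one."""
    mu, log_scale = list(p0), 0.0
    for y in ys:
        mu = [sum(mu[i] * emit[i][y] * trans[i][j] for i in S) for j in S]
        c = sum(mu)              # scale factor removed at this step
        mu = [m / c for m in mu]
        log_scale += log(c)      # remembered, since it cannot be neglected here
    return log_scale             # the final scaled message sums to one

def likelihood_unscaled(p0, trans, emit, ys):
    """p(y_1..y_n) as the sum of the unscaled forward message on edge X_n."""
    mu = list(p0)
    for y in ys:
        mu = [sum(mu[i] * emit[i][y] * trans[i][j] for i in S) for j in S]
    return sum(mu)

ll = log_likelihood(p0, trans, emit, ys)
direct = likelihood_unscaled(p0, trans, emit, ys)
```

For long observation sequences the unscaled message underflows, while the accumulated logarithm stays well within floating-point range.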
Towards the max-product algorithm:
Computing Max-Marginals—A Generic Example
Assume we wish to compute
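$$f_3(x_3) = \max_{\substack{x_1, \ldots, x_7 \\ \text{except } x_3}} f(x_1, \ldots, x_7)$$
and assume that f can be factored as follows:
[Figure: cycle-free factor graph with nodes f1, . . . , f7 and edges X1, . . . , X7: f3 connects X1, X2, X3; f5 connects X3, X4, X5; f6 connects X5, X6, X7; the nodes f1, f2, f4, f7 terminate the edges X1, X2, X4, X7.]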
Example:
Closing Boxes by the Distributive Law
[Figure: the factor graph of the generic example with nodes f1, . . . , f7 and edges X1, . . . , X7.]
$$f_3(x_3) = \Bigl( \max_{x_1, x_2} f_1(x_1)\, f_2(x_2)\, f_3(x_1, x_2, x_3) \Bigr) \cdot \Bigl( \max_{x_4, x_5} f_4(x_4)\, f_5(x_3, x_4, x_5) \Bigl( \max_{x_6, x_7} f_6(x_5, x_6, x_7)\, f_7(x_7) \Bigr) \Bigr)$$
Example cont’d: Message Passing View
[Figure: the factor graph of the generic example, with messages in both directions on edge X3 and a backward message on edge X5.]
$$f_3(x_3) = \Bigl( \underbrace{\max_{x_1, x_2} f_1(x_1)\, f_2(x_2)\, f_3(x_1, x_2, x_3)}_{\overrightarrow{\mu}_{X_3}(x_3)} \Bigr) \cdot \Bigl( \underbrace{\max_{x_4, x_5} f_4(x_4)\, f_5(x_3, x_4, x_5) \Bigl( \underbrace{\max_{x_6, x_7} f_6(x_5, x_6, x_7)\, f_7(x_7)}_{\overleftarrow{\mu}_{X_5}(x_5)} \Bigr)}_{\overleftarrow{\mu}_{X_3}(x_3)} \Bigr)$$
Example cont’d: Messages Everywhere
[Figure: the factor graph of the generic example, with a message drawn on every edge X1, . . . , X7.]
With $\overrightarrow{\mu}_{X_1}(x_1) \triangleq f_1(x_1)$, $\overrightarrow{\mu}_{X_2}(x_2) \triangleq f_2(x_2)$, etc., we have
$$\overrightarrow{\mu}_{X_3}(x_3) = \max_{x_1, x_2} \overrightarrow{\mu}_{X_1}(x_1)\, \overrightarrow{\mu}_{X_2}(x_2)\, f_3(x_1, x_2, x_3)$$
$$\overleftarrow{\mu}_{X_5}(x_5) = \max_{x_6, x_7} \overrightarrow{\mu}_{X_7}(x_7)\, f_6(x_5, x_6, x_7)$$
$$\overleftarrow{\mu}_{X_3}(x_3) = \max_{x_4, x_5} \overrightarrow{\mu}_{X_4}(x_4)\, \overleftarrow{\mu}_{X_5}(x_5)\, f_5(x_3, x_4, x_5)$$
The Max-Product Algorithm
[Figure: a node g with incoming edges Y1, . . . , Yn and outgoing edge X; the incoming messages on Y1, . . . , Yn determine the outgoing message on X.]
Max-product message computation rule:
$$\overrightarrow{\mu}_X(x) = \max_{y_1, \ldots, y_n} g(x, y_1, \ldots, y_n)\, \overrightarrow{\mu}_{Y_1}(y_1) \cdots \overrightarrow{\mu}_{Y_n}(y_n)$$
Max-product theorem:
If the factor graph for some global function f has no cycles, then
$$f_X(x) = \overrightarrow{\mu}_X(x)\, \overleftarrow{\mu}_X(x),$$
where f_X denotes the max-marginal of f with respect to X.
Max-product algorithm applied to HMM:
MAP Estimate of the State Trajectory
The estimate
$$(x_0, \ldots, x_n)_{\text{MAP}} = \operatorname*{argmax}_{x_0, \ldots, x_n} p(x_0, \ldots, x_n | y_1, \ldots, y_n) = \operatorname*{argmax}_{x_0, \ldots, x_n} p(x_0, \ldots, x_n, y_1, \ldots, y_n)$$
may be obtained by computing
$$p_k(x_k) \triangleq \max_{\substack{x_0, \ldots, x_n \\ \text{except } x_k}} p(x_0, \ldots, x_n, y_1, \ldots, y_n) = \overrightarrow{\mu}_{X_k}(x_k)\, \overleftarrow{\mu}_{X_k}(x_k)$$
for all k by forward-backward max-product sweeps.
In this example, the max-product algorithm is a time-symmetric version of the Viterbi algorithm with soft output.
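The forward-backward max-product sweeps can be sketched as follows for a hypothetical two-state HMM, with y_k depending on x_{k−1} as in the slides. Reading the MAP trajectory off the max-marginals coordinate by coordinate assumes that the maximizer is unique, which holds for the numbers chosen here; the brute-force search is included only as a check.

```python
from itertools import product

# Hypothetical two-state HMM with binary outputs.
p0 = [0.6, 0.4]                    # p(x0)
trans = [[0.7, 0.3], [0.2, 0.8]]   # trans[i][j] = p(x_k = j | x_{k-1} = i)
emit = [[0.9, 0.1], [0.3, 0.7]]    # emit[i][y] = p(y_k = y | x_{k-1} = i)
ys = [0, 1, 1]                     # observed y_1, ..., y_n
S = range(len(p0))
n = len(ys)

def joint(xs):
    """p(x_0, ..., x_n, y_1, ..., y_n) for the observed ys."""
    p = p0[xs[0]]
    for k in range(1, n + 1):
        p *= trans[xs[k - 1]][xs[k]] * emit[xs[k - 1]][ys[k - 1]]
    return p

# Forward and backward max-product sweeps (sums replaced by maximizations).
fwd = [list(p0)]
for y in ys:
    prev = fwd[-1]
    fwd.append([max(prev[i] * emit[i][y] * trans[i][j] for i in S) for j in S])
bwd = [[1.0 for _ in S] for _ in range(n + 1)]
for k in range(n - 1, -1, -1):
    bwd[k] = [emit[i][ys[k]] * max(trans[i][j] * bwd[k + 1][j] for j in S) for i in S]

# Max-marginals on every edge X_k, and the per-coordinate maximizers.
max_marginals = [[f * b for f, b in zip(fwd[k], bwd[k])] for k in range(n + 1)]
x_map = tuple(max(S, key=lambda i: mm[i]) for mm in max_marginals)

# Brute-force check over all 2**(n+1) state trajectories.
x_brute = max(product(S, repeat=n + 1), key=joint)
```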
Max-product algorithm applied to HMM:
MAP Estimate of the State Trajectory cont’d
Computing
$$p_k(x_k) \triangleq \max_{\substack{x_0, \ldots, x_n \\ \text{except } x_k}} p(x_0, \ldots, x_n, y_1, \ldots, y_n) = \overrightarrow{\mu}_{X_k}(x_k)\, \overleftarrow{\mu}_{X_k}(x_k)$$
simultaneously for all k:
[Figure: HMM factor graph with known outputs y1, y2, y3, . . . and forward and backward messages on every edge X0, X1, X2, . . .]
Marginals and Output Edges
Marginals such as $\overrightarrow{\mu}_X(x)\, \overleftarrow{\mu}_X(x)$ may be viewed as messages out of an “output half-edge” (without incoming message):
[Figure: an edge X carrying messages in both directions is replaced by edges X′ and X″ joined by an "=" node with an additional output half-edge X.]
$$\overrightarrow{\mu}_X(x) = \int_{x'} \int_{x''} \overrightarrow{\mu}_{X'}(x')\, \overleftarrow{\mu}_{X''}(x'')\, \delta(x - x')\, \delta(x - x'')\, dx'\, dx'' = \overrightarrow{\mu}_{X'}(x)\, \overleftarrow{\mu}_{X''}(x)$$
⇒ Marginals are computed like messages out of “=”-nodes.
Outline
3. On factor graphs with cycles
What About Factor Graphs with Cycles?
[Figure: a factor graph with cycles, with messages passed in both directions along every edge.]
• Generally iterative algorithms.
• For example, alternating maximization
$$x_{\text{new}} = \operatorname*{argmax}_{x} f(x, y) \quad \text{and} \quad y_{\text{new}} = \operatorname*{argmax}_{y} f(x, y)$$
using the max-product algorithm in each iteration.
• Iterative sum-product message passing gives excellent results for maximization(!) in some applications (e.g., the decoding of error correcting codes).
• Many other useful algorithms can be formulated in message passing form (e.g., gradient ascent, Gibbs sampling, expectation maximization, variational methods, . . . ).
• Rich and vast research area . . .
Outline
4. Factor graphs and error correcting codes
Factor Graph of an Error Correcting Code
≈ Tanner graph of the code (Tanner 1981)
A factor graph of a code C ⊂ F^n represents (a factorization of) the membership indicator function of the code:
$$I_C : F^n \to \{0, 1\} : x \mapsto \begin{cases} 1, & \text{if } x \in C \\ 0, & \text{else} \end{cases}$$
![Page 53: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/53.jpg)
53
Factor Graph from Parity Check Matrix
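Example: (7, 4, 3) binary Hamming code. ($F \triangleq \mathrm{GF}(2)$.)
$$C = \{x \in F^n : Hx^T = 0\}$$
with
$$H = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 & 1 \end{pmatrix}$$
The membership indicator function
$$I_C : F^n \to \{0, 1\} : x \mapsto \begin{cases} 1, & \text{if } x \in C \\ 0, & \text{else} \end{cases}$$
of this code may be written as
$$I_C(x_1, \ldots, x_n) = \delta(x_1 \oplus x_2 \oplus x_3 \oplus x_5) \cdot \delta(x_2 \oplus x_3 \oplus x_4 \oplus x_6) \cdot \delta(x_3 \oplus x_4 \oplus x_5 \oplus x_7)$$
where ⊕ denotes addition modulo 2. Each factor corresponds to one row of the parity check matrix.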
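The factorization above can be checked by brute force. The following is a minimal sketch, assuming only the matrix H from the slide; the function names (`delta`, `indicator`) are illustrative and not part of the original material.

```python
from itertools import product

# Parity check matrix of the (7, 4, 3) Hamming code, copied from the slide.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 1],
]

def delta(bit):
    """Kronecker delta at zero: 1 if bit == 0, else 0."""
    return 1 if bit == 0 else 0

def indicator(x):
    """I_C(x) as a product with one factor per row of H."""
    result = 1
    for row in H:
        parity = sum(h * xi for h, xi in zip(row, x)) % 2
        result *= delta(parity)
    return result

# Enumerate all 2^7 binary words and keep those with I_C(x) = 1.
codewords = [x for x in product((0, 1), repeat=7) if indicator(x) == 1]
# Minimum Hamming weight over the nonzero codewords.
min_weight = min(sum(x) for x in codewords if any(x))
```

The enumeration should recover the parameters named on the slide: 2^4 codewords and minimum distance 3.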
![Page 54: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/54.jpg)
54
Factor Graph from Parity Check Matrix (cont’d)
Example: (7, 4, 3) binary Hamming code.
$$C = \{x \in F^n : Hx^T = 0\}$$
with
$$H = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 & 1 \end{pmatrix}$$
[Figure: factor graph of the code, with equality nodes (=), three parity check nodes (⊕), and variables X1, ..., X7.]
![Page 55: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/55.jpg)
55
Factor Graph for Joint Code / Channel Model
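[Figure: a "code" block connected through X1, X2, ..., Xn to a "channel model" block with Y1, Y2, ..., Yn.]
Example: memoryless channel:
[Figure: the channel model splits into separate factors, one connecting each Xk to its Yk, for k = 1, ..., n.]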
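For the memoryless example, the channel model splits into one factor per symbol. As a sketch, writing $p(y_k \mid x_k)$ for the per-symbol channel law (notation assumed here, not taken from the slide) and assuming equiprobable codewords, the joint function is, up to a scale factor,

$$f(x_1, \ldots, x_n, y_1, \ldots, y_n) = I_C(x_1, \ldots, x_n) \prod_{k=1}^{n} p(y_k \mid x_k).$$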
![Page 56: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/56.jpg)
56
Factor Graph from Generator Matrix
Example: (7, 4, 3) binary Hamming code is the image of
$$F^k \to F^n : u \mapsto uG$$
with
$$G = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{pmatrix}.$$
[Figure: factor graph with equality nodes (=) for U1, ..., U4, parity check nodes (⊕), and code symbols X1, ..., X7.]
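As a quick consistency check, one can enumerate the image of u ↦ uG and confirm that every word satisfies the parity checks of the earlier slide. This is a minimal sketch; `encode` and `syndrome` are illustrative names, and G and H are copied from the slides.

```python
from itertools import product

# Generator matrix and parity check matrix of the (7, 4, 3) Hamming code.
G = [
    [1, 0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1],
]
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 1],
]

def encode(u):
    """Return x = uG over GF(2); zip(*G) iterates over the columns of G."""
    return tuple(sum(ui * gi for ui, gi in zip(u, col)) % 2 for col in zip(*G))

def syndrome(x):
    """Return H x^T over GF(2)."""
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

# The code is the set of all images of the 2^4 information words.
code = {encode(u) for u in product((0, 1), repeat=4)}
all_checks_ok = all(syndrome(x) == (0, 0, 0) for x in code)
```

If `all_checks_ok` holds and the set has 16 elements, the generator and parity check descriptions define the same code.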
![Page 57: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/57.jpg)
57
Factor Graph of Dual Code
is obtained by interchanging parity check nodes and equality check nodes (Kschischang, Forney).
Works only for Forney-style factor graphs where all code symbols are external (half-edge) variables.
Example: dual of (7, 4, 3) binary Hamming code
[Figure: the factor graph obtained from the parity check matrix with node types interchanged: parity check nodes (⊕) at X1, ..., X4, three equality nodes (=), and variables X1, ..., X7.]
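The graph construction itself is not reproduced here, but the algebraic side can be checked: the dual code is spanned by the rows of H (a standard fact about linear codes). The sketch below assumes the matrices G and H from the earlier slides; `span` and `dot` are illustrative helper names.

```python
from itertools import product

G = [
    [1, 0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 1],
]
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 1],
]

def span(rows):
    """All GF(2) linear combinations of the given rows."""
    n = len(rows[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        word = tuple(
            sum(c * r[j] for c, r in zip(coeffs, rows)) % 2 for j in range(n)
        )
        words.add(word)
    return words

def dot(a, b):
    """Inner product over GF(2)."""
    return sum(ai * bi for ai, bi in zip(a, b)) % 2

code = span(G)   # the Hamming code
dual = span(H)   # its dual, spanned by the parity checks
orthogonal = all(dot(x, y) == 0 for x in code for y in dual)
dual_min_weight = min(sum(y) for y in dual if any(y))
```

The dual has 2^3 words, each orthogonal to every Hamming codeword, and its nonzero words all have weight 4.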
![Page 58: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/58.jpg)
58
Factor Graph Corresponding to Trellis
Example: (7, 4, 3) binary Hamming code
[Figure: trellis of the code with branches labeled 0 and 1 (and 00, 01, 10, 11 in the middle section); symbol labels X1 X4 X3 X2 X5 X6 X7 and state labels S1 S2 S3 S4 S6.]
![Page 59: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/59.jpg)
59
Factor Graph of Low-Density Parity-Check Codes
[Figure: equality nodes (=) for X1, X2, ..., Xn joined by "random" connections to parity check nodes (⊕).]
Standard decoder: iterative sum-product message passing. Convergence is not guaranteed!
Much recent / ongoing research on improved decoding: Yedidia, Freeman, Weiss; Feldman; Wainwright; Chertkov, ...
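To make "iterative sum-product message passing" concrete, here is a minimal sketch of a decoder with a flooding schedule and log-likelihood-ratio messages. It is one common way to organize the iterations, not necessarily the variant the lecture has in mind. The small Hamming matrix H stands in for a real low-density matrix, and the function name, the clipping constant, and the iteration limit are all assumptions made for illustration.

```python
import math

# Stand-in parity check matrix (the Hamming code from the earlier slides).
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 1],
]

def sum_product_decode(H, channel_llr, max_iterations=20):
    """Flooding-schedule sum-product decoding with LLR messages.

    LLR = log(mu(0)/mu(1)), so a positive value favors bit 0.
    Returns (hard_decisions, converged).
    """
    m, n = len(H), len(H[0])
    edges = [(i, j) for i in range(m) for j in range(n) if H[i][j]]
    # Variable-to-check messages start out as the channel LLRs.
    v2c = {(i, j): channel_llr[j] for (i, j) in edges}
    bits = [0 if l >= 0 else 1 for l in channel_llr]
    for _ in range(max_iterations):
        # Check node update: tanh rule over all other edges of the check.
        c2v = {}
        for (i, j) in edges:
            prod = 1.0
            for (i2, j2) in edges:
                if i2 == i and j2 != j:
                    prod *= math.tanh(v2c[(i2, j2)] / 2.0)
            # Clip so that atanh stays finite (illustrative constant).
            prod = max(-0.999999, min(0.999999, prod))
            c2v[(i, j)] = 2.0 * math.atanh(prod)
        # Variable node update: LLRs add at equality nodes.
        total = list(channel_llr)
        for (i, j) in edges:
            total[j] += c2v[(i, j)]
        for (i, j) in edges:
            v2c[(i, j)] = total[j] - c2v[(i, j)]  # exclude the edge's own input
        bits = [0 if t >= 0 else 1 for t in total]
        # Stop as soon as all parity checks are satisfied.
        if all(sum(H[i][j] * bits[j] for j in range(n)) % 2 == 0 for i in range(m)):
            return bits, True
    return bits, False

# All-zero codeword over a binary symmetric channel with crossover 0.05
# (an assumed channel), with the first bit flipped.
a = math.log(0.95 / 0.05)
received_llr = [-a] + [a] * 6
decoded, converged = sum_product_decode(H, received_llr)
```

The `converged` flag reflects the slide's caveat: on graphs with cycles the loop may hit the iteration limit without reaching a valid codeword.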
![Page 60: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/60.jpg)
60
Single-Number Parameterizations of Soft-Bit Messages
Difference: $\Delta \triangleq \dfrac{\mu(0) - \mu(1)}{\mu(0) + \mu(1)}$ = mean of {+1, −1}-representation
Ratio: $\Lambda \triangleq \mu(0)/\mu(1)$
Logarithm of ratio: $L \triangleq \log\big(\mu(0)/\mu(1)\big)$
![Page 61: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/61.jpg)
61
Conversions among Parameterizations
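Each row gives the quantity on the left in terms of the parameterization named in the column header.

| | $(\mu(0), \mu(1))$ | $\Delta = m$ | $\Lambda$ | $L$ |
|---|---|---|---|---|
| $(\mu(0), \mu(1))$ | | $\left(\frac{1+\Delta}{2}, \frac{1-\Delta}{2}\right)$ | $\left(\frac{\Lambda}{\Lambda+1}, \frac{1}{\Lambda+1}\right)$ | $\left(\frac{e^L}{e^L+1}, \frac{1}{e^L+1}\right)$ |
| $\Delta = m$ | $\frac{\mu(0)-\mu(1)}{\mu(0)+\mu(1)}$ | | $\frac{\Lambda-1}{\Lambda+1}$ | $\tanh(L/2)$ |
| $\Lambda$ | $\frac{\mu(0)}{\mu(1)}$ | $\frac{1+\Delta}{1-\Delta}$ | | $e^L$ |
| $L$ | $\ln\frac{\mu(0)}{\mu(1)}$ | $2\tanh^{-1}(\Delta)$ | $\ln\Lambda$ | |
| $\sigma^2$ | $\frac{4\mu(0)\mu(1)}{(\mu(0)+\mu(1))^2}$ | $1 - m^2$ | $\frac{4\Lambda}{(\Lambda+1)^2}$ | $\frac{4}{e^L + e^{-L} + 2}$ |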
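The conversions can be exercised numerically. The following is a minimal sketch with illustrative function names; it assumes µ(0) > 0 and µ(1) > 0 so that the ratio and its logarithm are defined.

```python
import math

def from_mu(mu0, mu1):
    """Return (Delta, Lambda, L) for an unnormalized binary message."""
    delta = (mu0 - mu1) / (mu0 + mu1)
    ratio = mu0 / mu1
    llr = math.log(mu0 / mu1)
    return delta, ratio, llr

def mu_from_delta(delta):
    """Normalized (mu(0), mu(1)) from the difference Delta."""
    return (1 + delta) / 2, (1 - delta) / 2

def mu_from_ratio(ratio):
    """Normalized (mu(0), mu(1)) from the ratio Lambda."""
    return ratio / (ratio + 1), 1 / (ratio + 1)

def mu_from_llr(llr):
    """Normalized (mu(0), mu(1)) from the log ratio L."""
    e = math.exp(llr)
    return e / (e + 1), 1 / (e + 1)

def variance_from_llr(llr):
    """sigma^2 = 1 - m^2 expressed through L."""
    return 4 / (math.exp(llr) + math.exp(-llr) + 2)

# Example message mu = (0.8, 0.2).
delta, ratio, llr = from_mu(0.8, 0.2)
```

For this example message, Δ = 0.6 and Λ = 4, and the table entries tanh(L/2) and 1 − m² can be compared against the direct formulas.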
![Page 62: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/62.jpg)
62
Sum-Product Rules for Binary Parity Check Codes
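Equality constraint node (=), factor δ[x − y] δ[x − z], with incoming messages on X and Y and outgoing message on Z:
$$\begin{pmatrix} \mu_Z(0) \\ \mu_Z(1) \end{pmatrix} = \begin{pmatrix} \mu_X(0)\,\mu_Y(0) \\ \mu_X(1)\,\mu_Y(1) \end{pmatrix}$$
$$\Delta_Z = \frac{\Delta_X + \Delta_Y}{1 + \Delta_X \Delta_Y}$$
$$\Lambda_Z = \Lambda_X \cdot \Lambda_Y$$
$$L_Z = L_X + L_Y$$
Parity check node (⊕), factor δ[x ⊕ y ⊕ z], with incoming messages on X and Y and outgoing message on Z:
$$\begin{pmatrix} \mu_Z(0) \\ \mu_Z(1) \end{pmatrix} = \begin{pmatrix} \mu_X(0)\,\mu_Y(0) + \mu_X(1)\,\mu_Y(1) \\ \mu_X(0)\,\mu_Y(1) + \mu_X(1)\,\mu_Y(0) \end{pmatrix}$$
$$\Delta_Z = \Delta_X \cdot \Delta_Y$$
$$\Lambda_Z = \frac{1 + \Lambda_X \Lambda_Y}{\Lambda_X + \Lambda_Y}$$
$$\tanh(L_Z/2) = \tanh(L_X/2) \cdot \tanh(L_Y/2)$$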
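A short numerical check of these rules, as a sketch: messages are represented as pairs (µ(0), µ(1)), and all function names are illustrative.

```python
import math

def equality_node(mu_x, mu_y):
    """Sum-product output of an equality node: componentwise product."""
    return (mu_x[0] * mu_y[0], mu_x[1] * mu_y[1])

def parity_node(mu_x, mu_y):
    """Sum-product output of a parity check node: z = x XOR y."""
    return (mu_x[0] * mu_y[0] + mu_x[1] * mu_y[1],
            mu_x[0] * mu_y[1] + mu_x[1] * mu_y[0])

def delta(mu):
    """Difference parameterization."""
    return (mu[0] - mu[1]) / (mu[0] + mu[1])

def llr(mu):
    """Log ratio parameterization."""
    return math.log(mu[0] / mu[1])

def parity_node_llr(lx, ly):
    """Parity check rule stated directly on log ratios (tanh rule)."""
    return 2.0 * math.atanh(math.tanh(lx / 2.0) * math.tanh(ly / 2.0))

mu_x, mu_y = (0.8, 0.2), (0.3, 0.7)
mu_eq = equality_node(mu_x, mu_y)
mu_par = parity_node(mu_x, mu_y)
```

With these inputs, the equality node should give L_Z = L_X + L_Y and the parity node should give Δ_Z = Δ_X · Δ_Y, matching the single-number forms of the rules.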
![Page 63: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/63.jpg)
63
Max-Product Rules for Binary Parity Check Codes
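Equality constraint node (=), factor δ[x − y] δ[x − z], with incoming messages on X and Y and outgoing message on Z:
$$\begin{pmatrix} \mu_Z(0) \\ \mu_Z(1) \end{pmatrix} = \begin{pmatrix} \mu_X(0)\,\mu_Y(0) \\ \mu_X(1)\,\mu_Y(1) \end{pmatrix}$$
$$L_Z = L_X + L_Y$$
Parity check node (⊕), factor δ[x ⊕ y ⊕ z], with incoming messages on X and Y and outgoing message on Z:
$$\begin{pmatrix} \mu_Z(0) \\ \mu_Z(1) \end{pmatrix} = \begin{pmatrix} \max\{\mu_X(0)\,\mu_Y(0),\ \mu_X(1)\,\mu_Y(1)\} \\ \max\{\mu_X(0)\,\mu_Y(1),\ \mu_X(1)\,\mu_Y(0)\} \end{pmatrix}$$
$$|L_Z| = \min\{|L_X|, |L_Y|\}$$
$$\operatorname{sgn}(L_Z) = \operatorname{sgn}(L_X) \cdot \operatorname{sgn}(L_Y)$$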
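The parity check rule in log-ratio form is often called the min-sum rule. As a sketch (function names are illustrative), it can be compared with the max-product rule stated on the message pairs:

```python
import math

def max_product_parity(mu_x, mu_y):
    """Max-product output of a parity check node: z = x XOR y."""
    return (max(mu_x[0] * mu_y[0], mu_x[1] * mu_y[1]),
            max(mu_x[0] * mu_y[1], mu_x[1] * mu_y[0]))

def min_sum_parity(lx, ly):
    """Same rule on log ratios: product of signs, minimum of magnitudes."""
    sign = math.copysign(1.0, lx) * math.copysign(1.0, ly)
    return sign * min(abs(lx), abs(ly))

def llr(mu):
    """Log ratio log(mu(0)/mu(1))."""
    return math.log(mu[0] / mu[1])

mu_x, mu_y = (0.8, 0.2), (0.3, 0.7)
mu_z = max_product_parity(mu_x, mu_y)
l_z = min_sum_parity(llr(mu_x), llr(mu_y))
```

Here the less reliable input (Y) determines the magnitude of the output, and the sign is negative because the two inputs disagree.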
![Page 64: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/64.jpg)
64
Decomposition of Multi-Bit Checks
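[Figure: a multi-input equality node shown as a chain of three-edge equality nodes (=), and a multi-bit parity check shown as a chain of three-edge parity check nodes (⊕).]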
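For the sum-product algorithm, chaining three-edge parity checks gives the same outgoing message as treating the multi-bit check as a single node. The sketch below compares the chained tanh rule with a direct summation over all input configurations; function names are illustrative, and the inputs are assumed to be log ratios of independent incoming messages.

```python
import math
from itertools import product

def parity_pair(lx, ly):
    """Three-edge parity check: tanh rule on log ratios."""
    return 2.0 * math.atanh(math.tanh(lx / 2.0) * math.tanh(ly / 2.0))

def parity_chain(llrs):
    """Fold a multi-input check into a chain of three-edge checks."""
    out = llrs[0]
    for l in llrs[1:]:
        out = parity_pair(out, l)
    return out

def parity_direct(llrs):
    """Direct sum-product rule: sum over all input configurations."""
    mu = [0.0, 0.0]
    for bits in product((0, 1), repeat=len(llrs)):
        weight = 1.0
        for b, l in zip(bits, llrs):
            p0 = 1.0 / (1.0 + math.exp(-l))  # normalized mu(0) = e^L / (e^L + 1)
            weight *= p0 if b == 0 else 1.0 - p0
        # The output bit is the modulo-2 sum of the inputs.
        mu[sum(bits) % 2] += weight
    return math.log(mu[0] / mu[1])

inputs = [1.2, -0.7, 2.5]
chained = parity_chain(inputs)
direct = parity_direct(inputs)
```

The chain needs a number of two-input operations that is linear in the number of inputs, while the direct sum grows exponentially with it, which is the practical reason for the decomposition.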
![Page 65: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/65.jpg)
65
Encoder of a Turbo Code
[Figure: block diagram of a turbo encoder with outputs X1,k, X2,k, and X3,k; two shift-register encoders, the second one fed through an interleaver.]
![Page 66: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/66.jpg)
66
Factor Graph of a Turbo Code
[Figure: equality nodes (=) for X1,k−1, X1,k, X1,k+1; a chain (· · ·) carrying X2,k−1, X2,k, X2,k+1; "random" connections; a second chain (· · ·) carrying X3,k−1, X3,k, X3,k+1.]
![Page 67: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/67.jpg)
67
Turbo Decoder: Conventional View
[Figure: block diagram with Decoder A (input Lin2,k) and Decoder B (input Lin3,k), coupled through an interleaver and an inverse interleaver; signals Lin1,k, LoutA,k, LoutB,k, and output Lout,k.]
![Page 68: Factor Graphs and Message Passing Algorithms — Part 1 ...crm.sns.it/media/course/1524/Loeliger_A.pdf2,x 3,x 4,x 5) = f A(x 1,x 2,x 3)·f B(x 3,x 4,x 5)·f C(x 4). x 1 f A x 2 x 3](https://reader033.vdocuments.net/reader033/viewer/2022050519/5fa2f29ce971b429cb0909f3/html5/thumbnails/68.jpg)
68
Messages in a Turbo Decoder
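[Figure: factor graph of the turbo code annotated with messages: Lin1,k, LoutA,k, LoutB,k, and LoutA,k + LoutB,k at the equality nodes (=); Lin2,k and Lin3,k at the two chains (· · ·); "random" connections in between.]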