To reach or not to reach? Efficient algorithms for total-payoff games
Thomas Brihaye (1), Gilles Geeraerts (2), Axel Haddad (1) (me), Benjamin Monmege (2)
(1) Université de Mons, (2) Université de Bruxelles
European Project FP7-CASSTING
Teaser
• Variant of usual quantitative games
• Add a reachability objective
• We want to compute the value
• Game extension of shortest path problem
• Solve an open problem for total-payoff games
Two-player quantitative games on graphs
Eve plays against Adam. The arena is:
• a finite graph,
• where the vertices belong either to Eve or Adam,
• and each edge has a weight.
During a play:
• A token is moved along the edges by the player who owns the current vertex.
• The play is infinite.
Payoff function
Defines a value of a play.
Total Payoff: the limit of the sums of the weights.
Mean Payoff: the limit of the averages of the weights.
(In fact we take the limit inferior, since the limit may not exist.)
Eve wants to minimize it, Adam wants to maximize it.
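The two payoffs can be written out as follows. This is a sketch in notation that is not on the slides: a play is written \pi = v_0 v_1 v_2 \ldots, w is the edge-weight function, and TP and MP are names chosen here for the two payoff functions.

```latex
\[
  \mathrm{TP}(\pi) \;=\; \liminf_{n \to \infty} \sum_{i=0}^{n-1} w(v_i, v_{i+1}),
  \qquad
  \mathrm{MP}(\pi) \;=\; \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} w(v_i, v_{i+1}).
\]
```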
Example
[Figure: arena with three vertices; edge labels −1, 1, −2, 2 and −1]
Weights: -1 1 -1 2 -2 2 -2 2 . . .
Sums: -1 0 -1 1 -1 1 -1 1 . . .
Average: -1 0 -0.333 0.25 -0.2 0.166 -0.143 0.125 . . .
Total Payoff: −1 Mean Payoff: 0
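The numbers on this slide can be reproduced with a few lines of Python. This is only a sketch: it assumes that the ". . ." continues the pattern 2, -2 forever, so that the play is the prefix -1, 1, -1 followed by an endless loop, and it approximates each limit inferior by the minimum over a late window of the sequence. The function names are illustrative.

```python
from itertools import accumulate, chain, cycle, islice


def play_weights(prefix, loop, n):
    """First n weights of the ultimately periodic play: prefix, then loop forever."""
    return list(islice(chain(prefix, cycle(loop)), n))


def partial_sums(weights):
    """Sum of the first k weights, for k = 1, 2, ..."""
    return list(accumulate(weights))


def running_averages(weights):
    """Average of the first k weights, for k = 1, 2, ..."""
    return [s / (k + 1) for k, s in enumerate(accumulate(weights))]


# Assumption: the play continues with 2, -2, 2, -2, ... after the prefix.
weights = play_weights([-1, 1, -1], [2, -2], 1000)
sums = partial_sums(weights)
averages = running_averages(weights)

# Approximate the limit inferior by the minimum over the last 100 steps.
total_payoff = min(sums[-100:])
mean_payoff_estimate = min(averages[-100:])

print(sums[:8])
print(total_payoff, mean_payoff_estimate)
```

The partial sums keep oscillating between -1 and 1, which is why the total payoff is defined with a limit inferior, while the averages shrink towards 0.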
Known results
There exist optimal positional strategies for both players [Ehrenfeucht, Mycielski 79] [Gimbert, Zielonka 04].
(Positional strategy = strategy that depends only on the current node)
Deciding whether the value of a vertex is ⩽ K is in NP ∩ coNP (no known algorithm in P).
For Mean Payoff one can compute the values in pseudo-polynomial time [Zwick, Paterson 95].
Reachability quantitative games
Add some target vertices.
Eve wants to reach a target while minimizing the payoff.
(Eve gets +∞ if she does not reach a target)
Adam wants to avoid the target or maximize the payoff.
[Figure: two-vertex game; edge labels 0, −1 and 0]
Val = −∞ but no optimal strategy!
[Figure: three-vertex game; edge labels −1, 0, −W, 0 and 0]
Optimal strategy for Eve: go ← W times and then go ↓
Optimal strategy for Adam: go ↓
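The first example can be spelled out as a short calculation. This is a sketch under an assumed reading of the two-vertex figure: Eve owns a vertex v with a self-loop of weight −1 and an edge of weight 0 to the target t. The strategy names \sigma_n and \sigma_\infty are introduced here for illustration.

```latex
\[
  \text{Payoff}(\sigma_n) = \underbrace{(-1) + \dots + (-1)}_{n \text{ loops}} + 0 = -n,
  \qquad
  \text{Payoff}(\sigma_\infty) = +\infty \ \text{(the target is never reached)},
\]
\[
  \mathrm{Val}(v) = \inf_{n \ge 0} \, (-n) = -\infty,
  \quad \text{and no single strategy of Eve attains this value.}
\]
```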
What is known
Best strategies are of the form:
• play a positional strategy for a long time
• and then reach the target
[Filiot, Gentilini, Raskin 12].
Deciding whether the value of a vertex is ⩽ K is in NP ∩ coNP.
Total Payoff, non-negative weights: in this case the game is positionally determined, and the values and optimal strategies can be computed in P (modified Dijkstra algorithm) [Khachiyan et al. 08].
Contributions
Reachability mean-payoff games are equivalent to mean-payoff games.
⇒ One can compute the values in pseudo-polynomial time.
A value iteration algorithm for reachability total-payoff games:
⇒ it computes the values in pseudo-polynomial time.
A value iteration algorithm for total-payoff games (also pseudo-polynomial).
Algorithm for reachability total-payoff
Compute Val⩽i, the value mapping when the game stops after i steps.
(Val⩽i+1 = do one move, and get the values of Val⩽i)
[Figure: three-vertex game; edge labels −1, 0, −W and 0; caption: optimal positional strategy for Adam]
Successive value mappings, one row per iteration:
+∞ +∞ 0
+∞ 0 0
−1 0 0
−1 −1 0
...
−W −W 0
−W −W 0
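The iteration can be sketched in Python as below. The arena is an assumed reconstruction of the three-vertex figure, chosen because it reproduces the rows of the table: Adam owns v1 (edges to v2 with weight −1 and to the target t with weight −W), Eve owns v2 (edges to v1 and to t, both with weight 0). The vertex names, the value W = 5, and the max_steps cap are all illustrative. The sketch does not detect values equal to −∞; it simply stops at the cap in that case.

```python
import math

INF = math.inf


def value_iteration(owner, edges, targets, max_steps=10_000):
    """Iterate Val<=i until it stabilises (or max_steps is reached).

    owner:   vertex -> "Eve" (minimiser) or "Adam" (maximiser)
    edges:   non-target vertex -> list of (weight, successor)
    targets: set of target vertices
    """
    # Val<=0: targets are worth 0, every other vertex +infinity.
    val = {v: (0 if v in targets else INF) for v in owner}
    history = [dict(val)]
    for _ in range(max_steps):
        new = {}
        for v in owner:
            if v in targets:
                new[v] = 0
                continue
            # Do one move, then collect the value of the previous iterate.
            options = [w + val[u] for w, u in edges[v]]
            new[v] = min(options) if owner[v] == "Eve" else max(options)
        history.append(new)
        if new == val:
            break
        val = new
    return val, history


W = 5  # illustrative value for the largest absolute weight
owner = {"v1": "Adam", "v2": "Eve", "t": "Eve"}
edges = {
    "v1": [(-1, "v2"), (-W, "t")],
    "v2": [(0, "v1"), (0, "t")],
}
val, history = value_iteration(owner, edges, targets={"t"})

for row in history:
    print(row["v1"], row["v2"], row["t"])
```

On this arena the number of rounds before stabilisation grows with W rather than with the number of vertices, which is the sense in which the algorithm is pseudo-polynomial.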
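Algorithm for total-payoff
[Figure: total-payoff game on vertices A, B, C; edge labels −W, 1, W and 0]
Construct a RTP game.
[Figure: RTP game over A, B, C with three additional vertices; edge labels −W, 1, W, 0 and 0, 0]
Compute the values: A: −W, B: 1, C: 0
Update the game.
[Figure: the same RTP game; the labels 0, 0 now read 1, 0]
A: −W+1, B: 2, C: 0
... until it converges ...
A: 0, B: W, C: 0
[Figure: the same RTP game; the updated labels now read W, 0]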
Conclusion
• Reachability mean-payoff games are equivalent to mean-payoff games (pseudo-polynomial algorithm)
• Value iteration algorithm for reachability total-payoff games (pseudo-polynomial algorithm)
• Value iteration algorithm for total-payoff games (pseudo-polynomial algorithm)
• More: Acceleration
• More: Finding good strategies for Eve and Adam in RTP games and in TP games.
• Thanks! … Questions?