mcts ai

MctsAi Team FightingICE March 27, 2016 http://www.ice.ci.ritsumei.ac.jp/~ftgaic/

Upload: ftgaic

Post on 21-Jan-2017


TRANSCRIPT

Page 1: Mcts ai

MctsAi

Team FightingICE

March 27, 2016

http://www.ice.ci.ritsumei.ac.jp/~ftgaic/

Page 2: Mcts ai

Outline of MctsAi

A sample fighting game AI implementing UCB applied to trees (UCT) [1] for the FightingICE platform

A typical Monte-Carlo Tree Search (MCTS) algorithm [2]

[1] L. Kocsis and C. Szepesvári, “Bandit based Monte-Carlo Planning”

[2] R. Coulom, “Efficient selectivity and backup operators in Monte-Carlo tree search”

Page 3: Mcts ai

UCT

Repeat Selection → Expansion → Playout → Backpropagation until the predefined maximum time length or the maximum number of playouts is reached

Use the UCB1 value in Selection

Finally, select the action associated with the child node of the root node that has the maximum number of visits

[Figure: the four phases of one iteration: selection, expansion, playout, backpropagation]

Page 4: Mcts ai

Upper Confidence Bound (UCB1) [3]

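$$\mathrm{UCB1}_i = \bar{X}_i + C \sqrt{\frac{2 \ln N_i^p}{N_i}}$$

$\bar{X}_i$: Average evaluation value of node $i$

$C$: Balancing parameter (empirically set to 3 in the sample AI)

$N_i$: Number of visits to node $i$

$N_i^p$: Number of visits to the parent of node $i$

UCB1 favors nodes with a high average evaluation value and nodes that have been visited less often

[3] P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-time analysis of the multiarmed bandit problem”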
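As a quick check of how the score behaves, here is a minimal Python sketch of UCB1. It assumes the common form "average value plus C times sqrt(2 ln(parent visits) / visits)", which reproduces the values shown in the example trees later in the deck. The function name `ucb1` and its parameter names are illustrative and do not come from the sample AI.

```python
import math


def ucb1(avg_value: float, visits: int, parent_visits: int, c: float = 3.0) -> float:
    """UCB1 score of a visited node: average value plus an exploration bonus."""
    if visits <= 0 or parent_visits <= 0:
        raise ValueError("ucb1 is only defined for visited nodes")
    return avg_value + c * math.sqrt(2.0 * math.log(parent_visits) / visits)


# (average evaluation value, visits) of the three children in the deck's
# example tree, whose parent (the root) has 17 visits.
for avg, n in [(0.3, 3), (2.5, 10), (0.5, 4)]:
    print(f"avg={avg}, visits={n}: UCB1={ucb1(avg, n, 17):.2f}")
# prints 4.42, 4.76 and 4.07
```

The node with only 3 visits scores 4.42 despite its low average of 0.3, which shows how much weight C = 3 gives to exploration.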

Page 5: Mcts ai

MctsAi Procedure

1. Expand all adjacent child nodes at once from the root node

2. Repeat an iteration of Selection, Expansion, Playout, and Backpropagation as many times as possible for 16.5 ms (also set empirically)

3. Select an action to perform
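Step 2 of the procedure amounts to a loop with a wall-clock deadline. The following Python sketch shows one way to write such a loop. The helper name `run_for_budget` and the `iterate` callback are hypothetical; in a real agent, `iterate` would perform one Selection, Expansion, Playout, and Backpropagation pass.

```python
import time
from typing import Callable


def run_for_budget(iterate: Callable[[], None], budget_seconds: float = 0.0165) -> int:
    """Call iterate() repeatedly until the time budget is spent.

    Returns the number of completed iterations. The deadline is checked only
    between iterations, so a single slow iteration can overrun the budget.
    """
    deadline = time.perf_counter() + budget_seconds
    iterations = 0
    while time.perf_counter() < deadline:
        iterate()
        iterations += 1
    return iterations


# Usage with a placeholder iteration that does nothing.
count = run_for_budget(lambda: None)
print(f"completed {count} iterations in about 16.5 ms")
```

Because the loop cannot interrupt an iteration in progress, a budget somewhat below the frame time leaves headroom for the last iteration to finish.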

Page 6: Mcts ai

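1 Expansion of all adjacent child nodes from the root node

Assign a very large random value to non-visited nodes as their initial UCB1 value

[Figure: the root node (0 visits) and its three newly expanded child nodes. Each node is labeled with its UCB1 value, average evaluation value, and number of visits. All three children have an average evaluation value of NaN and 0 visits; their initial UCB1 values are 10002, 10010, and 9999.]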
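The initial value for a non-visited node can be sketched in Python as below. The base of 9999 and the jitter range of 50 are assumptions, chosen only so that the results fall in the same range as the values in the deck's figures (9999 to 10030); the function name `initial_ucb1` is also hypothetical.

```python
import random


def initial_ucb1(rng: random.Random, base: int = 9999, jitter: int = 50) -> int:
    """Large random placeholder score for a node that has never been visited."""
    # base and jitter are illustrative assumptions, not values stated in the deck.
    return base + rng.randrange(jitter)


rng = random.Random(0)  # seeded only to make the demo repeatable
print([initial_ucb1(rng) for _ in range(3)])
```

Since UCB1 values of visited nodes in the examples stay below 10, any value this large makes every unvisited sibling win Selection before a visited one is tried again, and the random part varies the order in which siblings are first visited.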

Page 7: Mcts ai

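2.1 Selection

Select the child node with the highest UCB1 value at each level, all the way down to a leaf node

[Figure: two example trees. Each node is labeled with its UCB1 value, average evaluation value, and number of visits.]

Example 1: the root has 0 visits and three unvisited children (average NaN, 0 visits) with UCB1 values 10002, 10010, and 9999.

Example 2: the root has 17 visits. Its children have (average evaluation value, number of visits, UCB1 value) of (0.3, 3, 4.42), (2.5, 10, 4.76), and (0.5, 4, 4.07). The child with 10 visits has three unvisited children of its own (average NaN, 0 visits) with UCB1 values 10030, 10028, and 10020.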
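The descent can be sketched in Python as follows, using the Example 2 values. The `Node` class, its field names, and the node names are hypothetical, and the UCB1 formula is assumed to be "average plus C times sqrt(2 ln(parent visits) / visits)" with C = 3, which matches the numbers on the slide.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    visits: int = 0
    total: float = 0.0        # sum of playout evaluation values
    initial_ucb: float = 0.0  # large random score used while visits == 0
    children: List["Node"] = field(default_factory=list)


def ucb1(node: Node, parent_visits: int, c: float = 3.0) -> float:
    """Score used to compare siblings during Selection."""
    if node.visits == 0:
        return node.initial_ucb
    average = node.total / node.visits
    return average + c * math.sqrt(2.0 * math.log(parent_visits) / node.visits)


def select_path(root: Node) -> List[Node]:
    """Follow the highest-UCB1 child from the root down to a leaf."""
    path = [root]
    while path[-1].children:
        parent = path[-1]
        path.append(max(parent.children, key=lambda ch: ucb1(ch, parent.visits)))
    return path


# Example 2: root with 17 visits; the 10-visit child has three unvisited children.
grandchildren = [Node("b1", initial_ucb=10030), Node("b2", initial_ucb=10028),
                 Node("b3", initial_ucb=10020)]
root = Node("root", visits=17, children=[
    Node("a", visits=3, total=0.9),
    Node("b", visits=10, total=25.0, children=grandchildren),
    Node("c", visits=4, total=2.0),
])
print([n.name for n in select_path(root)])  # ['root', 'b', 'b1']
```

Under these assumptions the path goes through the child with UCB1 4.76 and ends at the unvisited leaf scored 10030.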

Page 8: Mcts ai

2.2 Expansion

If a leaf node having 10 visits at the depth level of 1 is reached, then expand all of its child nodes at once

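[Figure: the example tree before and after expansion. The root has 17 visits; its children have (average evaluation value, number of visits, UCB1 value) of (0.3, 3, 4.42), (2.5, 10, 4.76), and (0.5, 4, 4.07). After expansion, the child with 10 visits has three new children with an average evaluation value of NaN, 0 visits, and UCB1 values 10030, 10028, and 10020.]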
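The expansion rule can be sketched in Python as below. It reads the slide literally: a node is expanded only if it is a leaf, sits at depth 1, and has at least 10 visits. Treating "10 visits" as a minimum is an interpretation, and the `Node` class, the function name `maybe_expand`, the action names, and the 9999-plus-jitter initial score are all hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Node:
    action: str
    depth: int
    visits: int = 0
    initial_ucb: int = 0
    children: List["Node"] = field(default_factory=list)


def maybe_expand(node: Node, actions: Sequence[str], rng: random.Random,
                 visit_threshold: int = 10, expandable_depth: int = 1) -> bool:
    """Create one child per action, all at once, if the node qualifies."""
    if node.children:                       # not a leaf
        return False
    if node.depth != expandable_depth:      # only depth-1 nodes are expanded here
        return False
    if node.visits < visit_threshold:       # not visited often enough yet
        return False
    node.children = [
        Node(action=a, depth=node.depth + 1, initial_ucb=9999 + rng.randrange(50))
        for a in actions
    ]
    return True


rng = random.Random(0)
leaf = Node(action="a1", depth=1, visits=10)
print(maybe_expand(leaf, ["a0", "a1", "a2"], rng), len(leaf.children))  # True 3
```

A visit threshold delays tree growth until a node has shown some promise, so playouts are not spread over children of rarely chosen actions.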

Page 9: Mcts ai

2.3 Playout

Perform a random simulation for 60 frames ahead

[Figure: the same two example trees as in 2.1 Selection, labeled Example 1 and Example 2.]
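A playout of this kind can be sketched in Python as follows. Everything game-specific here is a toy assumption rather than the FightingICE simulator: the state is just a pair of hit points, both sides pick a uniformly random action every frame, and the evaluation value is taken to be the change in my hit points minus the change in the opponent's. The deck does not state how the evaluation value is computed.

```python
import random
from typing import Callable, Sequence, Tuple

State = Tuple[int, int]  # (my_hp, opponent_hp); a toy stand-in for a game state


def playout(state: State, actions: Sequence[str],
            step: Callable[[State, str, str], State],
            rng: random.Random, frames: int = 60) -> float:
    """Simulate `frames` frames with random actions and score the outcome."""
    start = state
    for _ in range(frames):
        state = step(state, rng.choice(actions), rng.choice(actions))
    my_change = state[0] - start[0]
    opp_change = state[1] - start[1]
    return float(my_change - opp_change)  # assumed evaluation: HP-difference gain


def toy_step(state: State, my_action: str, opp_action: str) -> State:
    """Toy rule: a 'hit' removes one hit point from the other side."""
    my_hp, opp_hp = state
    if my_action == "hit":
        opp_hp -= 1
    if opp_action == "hit":
        my_hp -= 1
    return (my_hp, opp_hp)


value = playout((100, 100), ["hit", "wait"], toy_step, random.Random(0))
print(value)
```

A fixed 60-frame horizon keeps each playout cheap, which matters when the whole search has to fit in a budget of a few milliseconds.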

Page 10: Mcts ai

2.4 Backpropagation

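Backpropagate a newly obtained evaluation value and modify the UCB1 value and number of visits of all related nodes

[Figure: the example tree before and after backpropagation. Each node is labeled with its UCB1 value, average evaluation value, and number of visits.]

Before: the root has 17 visits. Its children have (average evaluation value, number of visits, UCB1 value) of (0.3, 3, 4.42), (2.5, 10, 4.76), and (0.5, 4, 4.07). The child with 10 visits has three unvisited children (average NaN, 0 visits) with UCB1 values 10030, 10028, and 10020.

After: the root has 18 visits. Its children have (0.3, 3, 4.46), (2.27, 11, 4.44), and (0.5, 4, 4.10). Under the child that now has 11 visits, the node used for the playout has (0, 1, 6.57); the other two remain unvisited with UCB1 values 10028 and 10020.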
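The update can be sketched in Python as below, replaying the numbers from the example: a playout value of 0 is propagated from a new leaf through its parent (10 visits, average 2.5) to the root (17 visits). The `Node` class and function names are hypothetical, the playout value of 0 is inferred from the leaf's average after the update, and the UCB1 form is assumed to be "average plus 3 times sqrt(2 ln(parent visits) / visits)".

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    visits: int = 0
    total: float = 0.0  # sum of evaluation values seen through this node
    parent: Optional["Node"] = None


def backpropagate(leaf: Node, value: float) -> None:
    """Add one visit and the new value to the leaf and all of its ancestors."""
    node: Optional[Node] = leaf
    while node is not None:
        node.visits += 1
        node.total += value
        node = node.parent


def ucb1(node: Node, c: float = 3.0) -> float:
    """UCB1 of a visited, non-root node, recomputed from current counts."""
    average = node.total / node.visits
    return average + c * math.sqrt(2.0 * math.log(node.parent.visits) / node.visits)


root = Node(visits=17)                            # root average is not shown in the deck
mid = Node(visits=10, total=25.0, parent=root)    # average 2.5
leaf = Node(parent=mid)                           # newly expanded, unvisited

backpropagate(leaf, 0.0)
print(root.visits, mid.visits, leaf.visits)       # 18 11 1
print(f"{mid.total / mid.visits:.2f}")            # 2.27
print(f"{ucb1(leaf):.2f}")                        # 6.57
```

Siblings that were not on the path keep their averages and visit counts, but their UCB1 values still shift slightly because the parent's visit count grew, as with 4.42 becoming 4.46 in the figure.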

Page 11: Mcts ai

3 Selection of an action

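Finally, select the action associated with the child node having the highest number of visits

[Figure: the final tree. Each node is labeled with its UCB1 value, average evaluation value, and number of visits. The root has 56 visits. Its children have (average evaluation value, number of visits, UCB1 value) of (2.53, 28, 4.14), (1.95, 22, 3.76), and (0.33, 6, 3.81). The child with 28 visits has children (0.5, 2, 5.98), (2.2, 5, 5.66), and (4.09, 11, 6.43). The child with 22 visits has children (0.33, 3, 4.64), (0.33, 3, 4.64), and (2.66, 6, 5.71).]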
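The final choice reduces to an argmax over the visit counts of the root's children. Here is a minimal Python sketch using the visit counts from the final tree (28, 22, and 6); the action names and the function name `most_visited_action` are placeholders.

```python
from typing import List, Tuple


def most_visited_action(children: List[Tuple[str, int]]) -> str:
    """Return the action of the root child with the most visits.

    `children` holds (action, visits) pairs; ties go to the earliest entry.
    """
    if not children:
        raise ValueError("the root has no children to choose from")
    return max(children, key=lambda item: item[1])[0]


# Visit counts of the root's three children in the final example tree.
children = [("action_a", 28), ("action_b", 22), ("action_c", 6)]
print(most_visited_action(children))  # action_a
```

Choosing by visit count rather than by UCB1 value ignores the exploration bonus, which would otherwise favor rarely tried actions at decision time.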

Page 12: Mcts ai

That’s all folks!