Reinforcement Learning in StarCraft 2
REINFORCEMENT LEARNING IN STARCRAFT 2
Björn Boyd Isacsson
AI IN GAMES
When is it used
Improvement needed
Players are too good
Difficult and time-consuming creation
General only
STARCRAFT 2
Why?
What is StarCraft 2?
AI in StarCraft
Default
Human-made
Galaxy Script
REINFORCEMENT LEARNING
Why?
Games
Galaxy limitations
What is Reinforcement Learning?
REINFORCEMENT LEARNING
THE BASICS
States and Actions
Reward-based
Iterations
Exploitation (optimisation) vs Exploration
Put it all together!
Problems?
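The basics above can be put together as a small tabular Q-learning sketch. It is not taken from the talk: the toy five-state corridor, the parameter names (`lr`, `discount`, `epsilon`) and their values are assumptions chosen only to show states, actions, rewards, iterations and the exploration trade-off in one place.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties randomly so an untrained table does not always pick action 0.
    return rng.choice([a for a, value in enumerate(q_row) if value == best])

def q_learning(n_states=5, episodes=200, lr=0.5, discount=0.9, epsilon=0.2, seed=0):
    """Learn Q[s][a] on a hypothetical corridor: action 0 = left, action 1 = right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):              # one iteration = one run to the goal
        state = 0
        while state != goal:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state = min(state + 1, goal) if action == 1 else max(state - 1, 0)
            reward = 1.0 if next_state == goal else 0.0
            # Move the table entry toward reward + discounted best future value.
            target = reward + discount * max(q[next_state])
            q[state][action] += lr * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
greedy_policy = [max((0, 1), key=lambda a: row[a]) for row in q_table[:-1]]
```

After training, `greedy_policy` lists the preferred action for each non-goal state. The table only works here because the corridor has five states, which is the problem the next slide addresses.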
REINFORCEMENT LEARNING
FIXING THE PROBLEMS
Continuous gameplay
Sampling
Lots of actions
Limiting the actions
“Infinite” states
Table → Function
Q[s][a] → Q(s, a) = α + β * x1 + γ * x2 + …
Find values for α, β, γ…
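A minimal sketch of the table-to-function step, assuming the coefficients are found with a simple gradient update toward a target value. The slide does not say which fitting method was used, so `update_weights` and `step_size` are illustrative names, and `weights[0]` plays the role of the constant term α.

```python
def q_value(weights, features):
    """Q(s, a) = w0 + w1*x1 + w2*x2 + ... where features = [x1, x2, ...]."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def update_weights(weights, features, target, step_size=0.1):
    """Take one gradient step that moves Q(s, a) toward the target value."""
    error = target - q_value(weights, features)
    updated = [weights[0] + step_size * error]          # constant term
    updated += [w + step_size * error * x               # one weight per feature
                for w, x in zip(weights[1:], features)]
    return updated

# Usage: repeatedly nudge the function toward an observed target of 3.0.
weights = [0.0, 0.0, 0.0]
for _ in range(100):
    weights = update_weights(weights, [1.0, 2.0], target=3.0)
```

In Q-learning the target would typically be the received reward plus the discounted best Q value of the next state. The fixed target here only keeps the example short.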
THE AI IMPLEMENTATION
Sample time
0.1 seconds
Actions
Move left
Move right
Attack closest target
Function
Distance
Reinforcement Learning AI Health
Enemy Health
Action chosen
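One way the pieces on this slide could fit together is sketched below. The slides list the inputs but not how "action chosen" enters the function, so keeping one weight vector per action is an assumption, as are all names and example weight values. With a single shared linear function and a plain additive action term, the best action would never depend on the state, which is why the sketch avoids that encoding.

```python
ACTIONS = ("move_left", "move_right", "attack_closest")
SAMPLE_TIME_SECONDS = 0.1   # the AI picks a new action once per sample

def state_features(distance, own_health, enemy_health):
    """Inputs named on the slide, plus a constant 1.0 for the bias term."""
    return [1.0, distance, own_health, enemy_health]

def q_values(weights, features):
    """Evaluate one linear function per action (assumed encoding)."""
    return {action: sum(w * x for w, x in zip(weights[action], features))
            for action in ACTIONS}

def choose_action(weights, features):
    """Greedy choice: the action with the highest estimated value."""
    values = q_values(weights, features)
    return max(ACTIONS, key=lambda action: values[action])

# Hypothetical weights: attacking looks better the closer the enemy is.
example_weights = {
    "move_left": [0.0, 0.0, 0.0, 0.0],
    "move_right": [0.0, 0.0, 0.0, 0.0],
    "attack_closest": [1.0, -0.5, 0.0, 0.0],
}
near = choose_action(example_weights, state_features(1.0, 40.0, 40.0))
far = choose_action(example_weights, state_features(5.0, 40.0, 40.0))
```

During training, the greedy choice would be mixed with random exploration rather than used on its own.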
MY OPTIMAL
ONE UNIT – 10 ITERATIONS
ONE UNIT – 100 ITERATIONS
MANY UNITS
Implementing many units
One AI
Multiple AIs
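The two options can be contrasted in a short sketch. It assumes "One AI" means every unit shares a single learner and "Multiple AIs" means each unit keeps its own, which is one reading of the slide rather than a confirmed description. `Learner` and `make_learners` are hypothetical names.

```python
class Learner:
    """A linear Q estimate with a simple gradient update (illustrative only)."""

    def __init__(self, n_features):
        self.weights = [0.0] * n_features

    def q(self, features):
        return sum(w * x for w, x in zip(self.weights, features))

    def learn(self, features, target, step_size=0.1):
        error = target - self.q(features)
        self.weights = [w + step_size * error * x
                        for w, x in zip(self.weights, features)]

def make_learners(unit_ids, n_features, shared):
    """shared=True: one learner for all units; shared=False: one per unit."""
    if shared:
        single = Learner(n_features)
        return {unit: single for unit in unit_ids}
    return {unit: Learner(n_features) for unit in unit_ids}

one_ai = make_learners(["unit_a", "unit_b"], n_features=2, shared=True)
many_ais = make_learners(["unit_a", "unit_b"], n_features=2, shared=False)

# Only unit_a gains experience in both setups.
one_ai["unit_a"].learn([1.0, 0.0], target=1.0)
many_ais["unit_a"].learn([1.0, 0.0], target=1.0)
```

With a shared learner, experience from one unit immediately changes the estimates every other unit uses. Separate learners keep experience isolated and need one set of weights per unit.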
MANY UNITS – 100 ITERATIONS
MANY UNITS – 1000 ITERATIONS
LIMITATIONS
Reinforcement Learning
Requires instant reward
Many iterations = Lots of time
No guaranteed optimal
Galaxy Script
Memory!
CONCLUSION
There is potential
Improvements could expand the possibilities
Doesn’t fit every scenario