![Page 1: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/1.jpg)
Blackjack &
The game of tag
Presented by Leonid Leontiev
![Page 2: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/2.jpg)
Game of tag
Competition, Coevolution and the Game of Tag
Craig W. Reynolds
Electronic Arts
1450 Fashion Island Boulevard
San Mateo, CA 94404 USA
telephone: 415-513-7442, fax: 415-571-1893
![Page 3: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/3.jpg)
3
Game of tag introduction
• Tag is a children’s game based on symmetrical pursuit and evasion
• Tag is played by two or more, one of whom is designated as “it”
• The “it” player chases the others, who all try to escape
![Page 4: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/4.jpg)
4
Background
• Tag is intended as a simple model of behavior based on control of locomotion direction, or steering
• Test case to learn about evolving controllers for related, but more complex tasks
• A player’s fitness is determined by how well it performs when placed in competition with several opponents chosen randomly from the coevolving population of players
![Page 5: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/5.jpg)
5
Goals
• Study the use of competitive fitness in the evolution of agent behavior
• Automatically discover a controller through evolution based solely on competition between controllers
• Analyze this approach, which stands in contrast to evolving controllers by pitting them against a static, predetermined expert strategy
![Page 6: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/6.jpg)
6
History
• 1992 John Koza “Genetic Programming: on the Programming of Computers by Means of Natural Selection”
• 1993 Pete Angeline’s work on coevolution of players for the game of Tic Tac Toe, using competitive fitness
• 1994 R. E. Smith’s work on coevolution of strategies for the game of Othello
• 1994 Sims, K. “Evolving 3D Morphology and Behavior by Competition”
![Page 7: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/7.jpg)
7
Types of competitive architecture

| competitive architecture | matches per generation of n | opponents per individual | reference |
| --- | --- | --- | --- |
| new versus all | (n² − n)/2 | n − 1 | [Koza 1992] |
| new versus several | nk | k | this paper |
| single elimination tournament tree | n − 1 | log₂ n | [Angeline 1993] |
| new versus previous best | n | 1 | [Sims 1994] |
| new versus new | n/2 | 1 | [Smith 1994] |

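
To make the cost comparison concrete, the short Python sketch below evaluates the match counts from the table for a given population size. The function name and the default `k = 6` (the number of opponents used later in these slides) are choices made for this illustration, not part of the original paper.

```python
import math


def matches_per_generation(n, k=6):
    """Matches needed to score n individuals under each architecture.

    n is the population size; k is the number of opponents per new
    individual in the "new versus several" scheme (6 is assumed here).
    """
    return {
        "new versus all": (n * n - n) // 2,
        "new versus several": n * k,
        "single elimination tournament tree": n - 1,
        "new versus previous best": n,
        "new versus new": n // 2,
    }


def tournament_opponents(n):
    """Opponents faced by the tournament winner: log2 of the population."""
    return math.log2(n)


if __name__ == "__main__":
    for name, count in matches_per_generation(1000).items():
        print(f"{name}: {count} matches")
```

For a population of 1000, playing everyone against everyone costs 499,500 matches, while the "new versus several" scheme with 6 opponents costs 6,000.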
![Page 8: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/8.jpg)
8
Experimental Design
• Genetic Programming is used to evolve control programs for simulated vehicles
• No static, predetermined control program
• The vehicles are abstract autonomous agents, moving at constant speed on a two dimensional surface
• Job of control program is to inspect the environment and to compute a steering angle
![Page 9: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/9.jpg)
9
Experimental Design
For each player, at each simulation step:
– Its control program is run to determine a steering angle
– The vehicle's heading is altered by this angle
– The vehicle is moved a fixed distance along its new heading
– Tags are detected and handled
• The step length is typically 125% longer for “it”
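
The per-step update above can be sketched in a few lines of Python. This is only an illustration: the `Vehicle` class, the use of radians for the steering angle, and the function names are assumptions made here; the one-vehicle-length tag distance comes from a later slide.

```python
import math
from dataclasses import dataclass


@dataclass
class Vehicle:
    x: float
    y: float
    heading: float  # radians (assumed unit for this sketch)


def step(vehicle, steering_angle, step_length):
    """Alter the heading by the steering angle, then move a fixed distance."""
    heading = vehicle.heading + steering_angle
    return Vehicle(
        x=vehicle.x + step_length * math.cos(heading),
        y=vehicle.y + step_length * math.sin(heading),
        heading=heading,
    )


def is_tag(a, b, tag_distance=1.0):
    """A tag happens when the vehicles are within one vehicle length."""
    return math.hypot(a.x - b.x, a.y - b.y) <= tag_distance


if __name__ == "__main__":
    pursuer = step(Vehicle(0.0, 0.0, 0.0), math.pi / 2, 1.0)
    print(pursuer, is_tag(pursuer, Vehicle(0.0, 1.5, 0.0)))
```

There is no velocity or momentum state in `Vehicle`, which matches the slides' note that force, mass and acceleration are not simulated.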
![Page 10: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/10.jpg)
10
Experimental Design
• No simulation of force, mass, acceleration or momentum
• Always two players in a tag game
• The playing field is featureless
• Fitness is defined to be the portion of time (simulation steps) spent not being it
![Page 11: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/11.jpg)
11
Experimental Design
• The entire state of the world consists of:
– a flag indicating who is it
– the relative position of the opponent's vehicle
![Page 12: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/12.jpg)
12
Experimental Design
• A series of 4 games is played
• The two players alternate starting as it for each game of the series
• Before each game:
– The players are given random initial headings
– The players are randomly positioned within a starting box measuring about 3.5 vehicle-body-lengths on a side
• A player tags the opponent by getting to within one vehicle length
![Page 13: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/13.jpg)
13
Experimental Design
• Each game consisted of 25 simulation steps
• A player's score for a game is the number-of-non-it-steps divided by 25
![Page 14: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/14.jpg)
14
Experimental Design
• To determine a player's fitness, it is pitted against 6 randomly chosen players from the existing population
• Scores from these 24 games are averaged together to obtain the final fitness value
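
As a small worked example of the scoring rule, the sketch below computes a per-game score and averages 24 game scores (6 opponents times 4 games) into a fitness value. The function names and the list-of-booleans input format are assumptions made for this illustration.

```python
STEPS_PER_GAME = 25    # simulation steps per game
GAMES_PER_SERIES = 4   # games played against each opponent
OPPONENTS = 6          # randomly chosen opponents per fitness test


def game_score(is_it_per_step):
    """Fraction of steps in one game that the player spent NOT being it."""
    non_it_steps = sum(1 for is_it in is_it_per_step if not is_it)
    return non_it_steps / len(is_it_per_step)


def fitness(game_scores):
    """Average of the per-game scores over the whole fitness test."""
    return sum(game_scores) / len(game_scores)


if __name__ == "__main__":
    # A player that is it for the first 5 of 25 steps scores 20/25.
    one_game = [True] * 5 + [False] * (STEPS_PER_GAME - 5)
    print(game_score(one_game))
    print(fitness([game_score(one_game)] * (OPPONENTS * GAMES_PER_SERIES)))
```

Because a tag swaps roles, the two players' scores in a game sum to 1, so two evenly matched players average a fitness of about 0.5.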
![Page 15: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/15.jpg)
15
Genetic Programming
• Steady State Genetic Programming (SSGP)
– choosing two parent programs from the population
– creating a new offspring program from parents by applying crossover operator and mutation
– testing the fitness of the new program
– choosing a program to remove from the population to make room
– adding the new program into the population
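
The steady-state loop listed above might look roughly like the Python sketch below. The slides do not say how parents or the removed program are chosen, so the two-way tournaments here are an assumption, and `evaluate`, `crossover` and `mutate` are hypothetical callbacks supplied by the caller.

```python
import random


def pick(fitnesses, rng, want_best):
    """Two-way tournament (assumed): index of the better or worse of two."""
    i, j = rng.sample(range(len(fitnesses)), 2)
    better, worse = (i, j) if fitnesses[i] >= fitnesses[j] else (j, i)
    return better if want_best else worse


def steady_state_step(population, fitnesses, evaluate, crossover, mutate, rng):
    """Process one new individual; returns the slot that was overwritten."""
    # 1. choose two parent programs from the population
    mother = population[pick(fitnesses, rng, want_best=True)]
    father = population[pick(fitnesses, rng, want_best=True)]
    # 2. create an offspring by crossover and mutation
    child = mutate(crossover(mother, father, rng), rng)
    # 3. test the fitness of the new program against the current population
    child_fitness = evaluate(child, population, rng)
    # 4./5. remove one program to make room and add the new one in its place
    slot = pick(fitnesses, rng, want_best=False)
    population[slot] = child
    fitnesses[slot] = child_fitness
    return slot


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [float(i) for i in range(10)]   # toy "programs": plain numbers
    fit = list(pop)                       # toy fitness: the number itself
    steady_state_step(pop, fit,
                      evaluate=lambda c, p, r: c,
                      crossover=lambda a, b, r: (a + b) / 2,
                      mutate=lambda c, r: c + r.uniform(-0.1, 0.1),
                      rng=rng)
    print(pop)
```

The population size never changes, and there is no generation boundary: each call processes exactly one individual.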
![Page 16: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/16.jpg)
16
Problems
• A mediocre-but-lucky program may receive undeservedly high fitness and go on to dominate the population
• Competitive fitness values are measured relative to the population at a certain point in time
• Because steady state genetic computation proceeds individual by individual, there is no demarcation of generations.
![Page 17: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/17.jpg)
17
Set of functions

| function | usage | description |
| --- | --- | --- |
| `+` | `(+ a b)` | a plus b |
| `-` | `(- a b)` | a minus b |
| `*` | `(* a b)` | a times b |
| `%` | `(% a b)` | if b=0 then 1 else a divided by b |
| `min` | `(min a b)` | if a<b then a else b |
| `max` | `(max a b)` | if a>b then a else b |
| `abs` | `(abs a)` | absolute value of a |
| `iflte` | `(iflte a b c d)` | if a <= b then c else d |
| `if-it` | `(if-it a b)` | if this player is it then a else b |
| `local-x` | `(local-x)` | returns x-coordinate of the opponent player |
| `local-y` | `(local-y)` | returns y-coordinate of the opponent player |

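
To show how a program built from this function set produces a steering value, here is a minimal interpreter sketch in Python. Representing programs as nested tuples, and the names `evaluate` and `protected_div`, are assumptions of this illustration; the semantics of each function follow the table.

```python
def protected_div(a, b):
    """The % function: returns 1 when the divisor is 0."""
    return 1.0 if b == 0 else a / b


BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "%": protected_div,
    "min": min,
    "max": max,
}


def evaluate(expr, is_it, local_x, local_y):
    """Evaluate a program given as nested tuples, e.g. ("max", 0, ("local-y",))."""
    if isinstance(expr, (int, float)):
        return float(expr)                      # numeric constant
    op, *args = expr

    def ev(sub):
        return evaluate(sub, is_it, local_x, local_y)

    if op == "local-x":
        return float(local_x)
    if op == "local-y":
        return float(local_y)
    if op == "abs":
        return abs(ev(args[0]))
    if op == "if-it":
        return ev(args[0]) if is_it else ev(args[1])
    if op == "iflte":
        a, b, c, d = args
        return ev(c) if ev(a) <= ev(b) else ev(d)
    a, b = args
    return float(BINARY[op](ev(a), ev(b)))


if __name__ == "__main__":
    # Hypothetical program: pursue by steering with local-x, evade with
    # (max 0 (local-y)), an evader branch of the kind reported for run A.
    program = ("if-it", ("local-x",), ("max", 0, ("local-y",)))
    print(evaluate(program, is_it=False, local_x=1.0, local_y=-2.0))
    print(evaluate(program, is_it=True, local_x=1.0, local_y=-2.0))
```

The protected division keeps every program well defined for any opponent position, which matters because evolved programs divide by sensor values freely.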
![Page 18: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/18.jpg)
18
Size limitation
• Program size is measured in terms of the total number of functions and terminals
• When a program's size exceeds the size limit, the hoist genetic operator [Kinnear 1994] is used to find a smaller (but hopefully still fit) subexpression
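
A rough Python sketch of the size measure and a hoist-style reduction follows. Programs are nested tuples of the form `(function, arg, ...)`, which is a representation chosen for this example. Picking uniformly among the subexpressions that fit the limit is also an assumption; the slides only say that a smaller subexpression is found.

```python
import random


def size(expr):
    """Total number of functions and terminals in the program."""
    if not isinstance(expr, tuple):
        return 1                                   # numeric constant
    return 1 + sum(size(arg) for arg in expr[1:])  # the node plus its arguments


def subexpressions(expr):
    """Yield the expression itself and every nested subexpression."""
    yield expr
    if isinstance(expr, tuple):
        for arg in expr[1:]:
            yield from subexpressions(arg)


def hoist(expr, limit, rng):
    """Return expr if it fits, else a random subexpression within the limit."""
    if size(expr) <= limit:
        return expr
    candidates = [sub for sub in subexpressions(expr) if size(sub) <= limit]
    return rng.choice(candidates)  # never empty when limit >= 1: leaves have size 1


if __name__ == "__main__":
    program = ("+", ("*", ("local-x",), 2.0), ("abs", ("local-y",)))
    print(size(program))
    print(hoist(program, limit=3, rng=random.Random(0)))
```

Any hoisted subexpression is itself a valid program, because every function in the set returns a number.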
![Page 19: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/19.jpg)
19
Results
• These experiments were run on Macintosh Quadra 950 workstations. In this implementation a fitness test consisting of 24 tag games takes 7 to 12 seconds to run, depending on program size.
![Page 20: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/20.jpg)
20
Run A
• A population of 5000 individuals.
• Both players moved at the same speed
• Most popular strategies at the early stage:
– Evasion: the vehicle simply travels in a straight line
– Pursuit: strategies appear to have been looping (constant steering angle) and “stumblers” that seemed to move erratically, but managed to creep slowly towards their target
![Page 21: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/21.jpg)
21
Run A cont.
• Later an improved evasion strategy appeared: if the pursuer is behind you, go straight ahead, otherwise turn randomly (if-it <pursuer-branch> (max 0 (local-y)))
• The pursuers got to be very good at picking off the easy targets, the inefficient evaders
![Page 22: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/22.jpg)
22
Run A cont.
• By the end of run A, the pursuit strategy used a competent but inefficient “three phase” technique
![Page 23: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/23.jpg)
23
![Page 24: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/24.jpg)
24
Run C
• A population of 1000 individuals
• Mutation was added in an attempt to prevent the loss of diversity observed in earlier runs
• Many games consisted of a chase featuring near-optimal pursuit and evasion
![Page 25: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/25.jpg)
25
Fitness of the optimal player placed in competition with the evolving population
![Page 26: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/26.jpg)
26
Histogram of fitness distribution after 215 generations
![Page 27: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/27.jpg)
27
Run C cont.
• After 215415 individuals were processed (215 generations), there were 4 individuals with the same best fitness value
• One of these was compared to the optimal player in a series of 100 games
• It scored 49.3%
![Page 28: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/28.jpg)
28
![Page 29: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/29.jpg)
29
![Page 30: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/30.jpg)
30
Run G
• Did not segregate the pursuer and evader code
• The change seemed to make the problem harder to solve
• Used a larger limit on program size (100)
![Page 31: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/31.jpg)
31
Fitness of the optimal player placed in competition with the evolving population
![Page 32: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/32.jpg)
32
Run G cont.
• Individual 113520 was the best of population
• The program size is 98
• Many strange behavioral traits
– Pursuit behavior has a reasonable two phase strategy for opponents up to 5 units ahead but is very inept for opponents further away
– The evasion behavior is strongly asymmetrical
![Page 33: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/33.jpg)
33
![Page 34: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/34.jpg)
34
Individual 113520 code
(% (% (if-it (abs (local-x)) (iflte (iflte (local-x) 0.57168305 (local-x)
(+ (iflte (local-y) (iflte (local-y) (if-it (local-x) (abs (local-x))) (iflte 0.40530929 0.26004231 (abs (local-x)) (local-y)) (if-it 0.40530929 0.57168305)) (min (abs (local-x)) (+ (local-x) (local-x))) (local-x)) (local-x))) 0.57168305 (local-x) (+ (iflte (local-y) (iflte (local-y) (if-it (local-x) (local-x)) (iflte 0.40530929 (local-x) (abs (iflte (local-x) 0.37254661 0.32281655 (local-x))) (local-x)) (if-it 0.40530929 (abs (local-x)))) (min 0.1637349 (iflte (local-x) (local-y) (abs (iflte (abs (local-x)) (max (if-it (local-y) (abs 0.53183758)) (local-x)) 0.32281655 (local-x))) 0.53183758)) (local-x)) (local-x)))) (+ (local-x) (local-x)))
(iflte (- (abs 0.53183758) (if-it (% 0.57168305 (local-y)) (- 0.1637349 (local-y)))) 0.40530929 (abs 0.53183758) 0.83426005))
![Page 35: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/35.jpg)
35
Conclusions
• Using the game of tag to test relative fitness, artificial evolution was able to discover skillful, near-optimal tag players
• Good results were obtained despite the random selection of opponents and based only on relative performance fitness
• The population’s average performance was within 10% of the optimal player, and the best of population individual performed within a few percentage points of optimal (in run C)
![Page 36: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/36.jpg)
36
Conclusions
• The quality of evolved players approached, but did not reach, that of the optimal player
• Possible reasons:
– Fundamental limitation of competitive fitness
– Flaw in the experimental design
– Limitations of genetic population size and length of runs
![Page 37: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/37.jpg)
Blackjack
Evolving Strategies in Blackjack
David B. Fogel
Natural Selection, Inc.
3333 N. Torrey Pines Ct., Suite 200
La Jolla, CA 92037 USA
![Page 38: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/38.jpg)
38
Background on blackjack
• Blackjack is also known as 21
• One or more players compete against the dealer, or “house”
• The rules vary by casino, and even by country
• The variations are small, but they affect the potential profitability of player strategies
![Page 39: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/39.jpg)
39
Blackjack rules
• The dealer and each player receive two cards. The dealer turns the first of his cards face up and the other remains face down.
• The object is to come as close to 21 as possible without going over (“busting”).
• Each card is counted at its face value
• Face cards count as 10
• Aces count as 1 or 11
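
The counting rule can be written as a short Python function. This is a sketch: the card encoding (rank strings such as `"A"`, `"K"`, `"7"`) and the function name are choices made for the example. Each ace starts at 11 and is demoted to 1 only when the hand would otherwise exceed 21.

```python
def hand_value(cards):
    """Best blackjack total for a list of rank strings like ["A", "K", "7"]."""
    total = 0
    soft_aces = 0                      # aces currently counted as 11
    for rank in cards:
        if rank == "A":
            total += 11
            soft_aces += 1
        elif rank in ("J", "Q", "K"):
            total += 10                # face cards count as 10
        else:
            total += int(rank)         # number cards count at face value
    while total > 21 and soft_aces:
        total -= 10                    # count one ace as 1 instead of 11
        soft_aces -= 1
    return total


def is_bust(cards):
    """A hand is bust when even its best total exceeds 21."""
    return hand_value(cards) > 21


if __name__ == "__main__":
    print(hand_value(["A", "K"]), hand_value(["A", "6", "K"]), is_bust(["K", "Q", "5"]))
```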
![Page 40: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/40.jpg)
40
Blackjack rules
![Page 41: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/41.jpg)
41
Blackjack rules
• If the first two cards dealt to the player yield 21, this is called “blackjack”
• If the dealer’s up card is an ace, the player may purchase “insurance” for half the amount of the player’s wager. If the dealer has blackjack, the insurance bet pays 2:1
![Page 42: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/42.jpg)
42
Blackjack rules
• If the player has two cards of equal denomination on the deal, he may split the cards into two new hands.
• Also, on the initial deal, when the player has two cards, he has the option of “doubling down”
• If the player goes over 21, he busts and immediately loses his wager.
• If the player stands at a value less than or equal to 21, the play proceeds to the dealer.
![Page 43: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/43.jpg)
43
History
• The intelligent player can win consistently at blackjack by “counting cards,” using the history of which cards have been played
• 1956 – “The Optimum Strategy in Blackjack”, Dr. Roger Baldwin
• 1962 – “Beat the Dealer”, Dr. Edward Thorp
![Page 44: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/44.jpg)
44
History
• Thorp analyzed the player advantage, using his basic strategy, when the (single) deck contained all 16 tens, and when a number of the tens were removed:

| Tens in the deck | Player advantage |
| --- | --- |
| 16 | +0.13% |
| 12 | −1.85% |
| 8 | −3.13% |
| 4 | −2.14% |
| 0 | +1.62% |

• No linear relationship between the number of tens and the player’s advantage or disadvantage.
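
A quick arithmetic check of the non-linearity, using only Thorp's figures quoted on this slide: if the relationship were linear, each removal of four tens would change the advantage by the same amount. The variable and function names below are illustrative.

```python
# Player advantage (%) by number of tens left in a single deck (from the slide).
ADVANTAGE_BY_TENS = {16: 0.13, 12: -1.85, 8: -3.13, 4: -2.14, 0: 1.62}


def successive_changes(advantage_by_tens):
    """Change in advantage for each further removal of four tens."""
    tens = sorted(advantage_by_tens, reverse=True)          # 16, 12, 8, 4, 0
    values = [advantage_by_tens[t] for t in tens]
    return [round(after - before, 2) for before, after in zip(values, values[1:])]


if __name__ == "__main__":
    print(successive_changes(ADVANTAGE_BY_TENS))
```

The changes are -1.98, -1.28, +0.99 and +3.76 percentage points. They are not constant, and they even change sign, so the advantage is neither linear nor monotonic in the number of tens.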
![Page 45: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/45.jpg)
45
Basic strategy
• The player makes the same play in the same setting without respect to which cards have been played in prior hands
• If the player mimicked the dealer’s rules, the player faced a disadvantage between 5.56% and 6.78% with 95% confidence
![Page 46: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/46.jpg)
46
Basic Strategies

Player advantage (%) for published basic strategies, by number of decks:

| Strategy | Decks | Average | Min | Max |
| --- | --- | --- | --- | --- |
| Thorp | 1 | -0.02 | -0.14 | 0.09 |
| Thorp | 4 | -0.41 | -0.51 | -0.33 |
| Revere | 1 | -0.02 | -0.14 | 0.10 |
| Revere | 4 | -0.44 | -0.55 | -0.32 |
| Archer | 1 | -0.03 | -0.14 | 0.09 |
| Archer | 4 | -0.43 | -0.55 | -0.32 |
| Gollehon | 1 | 0.06 | -0.06 | 0.17 |
| Gollehon | 4 | -0.43 | -0.55 | -0.32 |
| Patterson | 1 | -0.02 | -0.14 | 0.10 |
| Patterson | 4 | -0.71 | -0.82 | -0.59 |

![Page 47: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/47.jpg)
47
Counting strategy
• Computer simulation has shown that the player can have an advantage over the house by altering his strategy based on the distribution of cards played in prior hands
• Player advantage after removing all of the cards of a given rank
• The most significant single card is the 5
![Page 48: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/48.jpg)
48
Counting strategy

| Type of missing card | Player advantage (%) |
| --- | --- |
| Aces | -2.42 |
| Twos | +1.75 |
| Threes | +2.14 |
| Fours | +2.64 |
| Fives | +3.58 |
| Sixes | +2.40 |
| Sevens | +2.05 |
| Eights | +0.43 |
| Nines | -0.41 |
| Tens | +1.62 |

![Page 49: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/49.jpg)
49
Evolving Basic Strategies
• Starting with Gollehon's basic strategy and three random variants of the strategy
• 3 million simulated hands on a single deck, reshuffling after 2/3 of the deck had been played
• Strategies were represented as entries in matrices describing decisions
![Page 50: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/50.jpg)
50
Strategy representation example
![Page 51: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/51.jpg)
51
Evolving Basic Strategies
• Simple mutation was used to create an offspring from each parent, altering multiple entries in the strategy
• 10 generations on a one-deck game
• 10 more generations on a four-deck game
• Each generation of evolution required just less than three days on the Macintosh SE
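
One way to picture the matrix representation and the mutation operator is the Python sketch below. The slides do not give the matrix layout, so everything concrete here is an assumption for illustration: a single table indexed by player total and dealer up card, a two-action set (real strategies also cover doubling, splitting and soft hands), and a mutation that flips a fixed number of entries.

```python
import random

ACTIONS = ("hit", "stand")        # assumed, simplified action set
PLAYER_TOTALS = range(4, 22)      # assumed rows: hard totals 4..21
DEALER_UP_CARDS = range(2, 12)    # assumed columns: 2..10, with 11 for an ace


def random_strategy(rng):
    """A strategy maps (player total, dealer up card) to a decision."""
    return {(total, up): rng.choice(ACTIONS)
            for total in PLAYER_TOTALS for up in DEALER_UP_CARDS}


def mutate(parent, n_entries, rng):
    """Create one offspring by altering n_entries cells of the parent."""
    child = dict(parent)                       # the parent is left untouched
    for cell in rng.sample(sorted(child), n_entries):
        alternatives = [a for a in ACTIONS if a != child[cell]]
        child[cell] = rng.choice(alternatives)
    return child


if __name__ == "__main__":
    rng = random.Random(0)
    parent = random_strategy(rng)
    child = mutate(parent, n_entries=5, rng=rng)
    print(sum(parent[c] != child[c] for c in parent), "entries changed")
```

In the experiment described here, each offspring would then be scored by simulating a large number of hands and comparing its average advantage with that of its parent.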
![Page 52: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/52.jpg)
52
Basic Strategies

Player advantage (%) for published basic strategies and the evolved strategy (EA), by number of decks:

| Strategy | Decks | Average | Min | Max |
| --- | --- | --- | --- | --- |
| Thorp | 1 | -0.02 | -0.14 | 0.09 |
| Thorp | 4 | -0.41 | -0.51 | -0.33 |
| Revere | 1 | -0.02 | -0.14 | 0.10 |
| Revere | 4 | -0.44 | -0.55 | -0.32 |
| Archer | 1 | -0.03 | -0.14 | 0.09 |
| Archer | 4 | -0.43 | -0.55 | -0.32 |
| Gollehon | 1 | 0.06 | -0.06 | 0.17 |
| Gollehon | 4 | -0.43 | -0.55 | -0.32 |
| Patterson | 1 | -0.02 | -0.14 | 0.10 |
| Patterson | 4 | -0.71 | -0.82 | -0.59 |
| EA | 1 | 0.22 | 0.10 | 0.33 |
| EA | 4 | -0.25 | -0.36 | -0.13 |

![Page 53: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/53.jpg)
53
Evolving Counting Strategies
• The best-evolved basic strategy and two random variants of this strategy were further evolved
• 400,000 simulated hands, increased over time to speed the process
• 50 generations were executed using the plus-minus counting framework
![Page 54: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/54.jpg)
54
Counting Strategies
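
Player advantage for counting strategies, by number of decks:

| Strategy | Decks | Average | Min | Max |
| --- | --- | --- | --- | --- |
| EA | 1 | 1.82% | 1.65% | 1.99% |
| EA | 2 | 0.93% | 0.77% | 1.08% |
| EA | 4 | 0.31% | 0.16% | 0.45% |
| EA | 8 | -0.12% | -0.25% | 0.02% |
| Patterson and Olsen | 1 | 0.56% | 0.32% | 0.80% |
| Patterson and Olsen | 4 | -0.21% | -0.45% | 0.03% |
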
![Page 55: Blackjack & The game of tag](https://reader035.vdocuments.net/reader035/viewer/2022062805/56814d31550346895dba5f81/html5/thumbnails/55.jpg)
55
Conclusions
• The standard blackjack strategies rely on a separate analysis of the correct play in each of a set of various possible situations
• Each of these situations is viewed independently
• It is reasonable to view the challenge of finding optimum strategies as a nonlinear problem that should be addressed by lifelike simulation of sequences of hands played until a deck or decks is/are reshuffled.