
  • University of Alberta
    Library Release Form

    Name of Author: Denis Richard Papp
    Title of Thesis: Dealing with Imperfect Information in Poker
    Degree: Master of Science
    Year this Degree Granted: 1998

    Permission is hereby granted to the University of Alberta Library to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only.

    The author reserves all other publication and other rights in association with the copyright in the thesis, and except as hereinbefore provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatever without the author's prior written permission.

    . . . . . . . . . . . . . . . . . .
    Denis Richard Papp
    2036 - 104A st
    Edmonton, Alberta
    Canada, T6J-5K4

    Date: . . . . . . . . .

  • University of Alberta

    Dealing with Imperfect Information in Poker

    by Denis Richard Papp

    A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science.

    Department of Computing Science
    Edmonton, Alberta
    Fall 1998

  • University of Alberta
    Faculty of Graduate Studies and Research

    The undersigned certify that they have read, and recommend to the Faculty of Graduate Studies and Research for acceptance, a thesis entitled Dealing with Imperfect Information in Poker submitted by Denis Richard Papp in partial fulfillment of the requirements for the degree of Master of Science.

    . . . . . . . . . . . . . . . . . .
    Jonathan Schaeffer

    . . . . . . . . . . . . . . . . . .
    Duane Szafron

    . . . . . . . . . . . . . . . . . .
    Oliver Schulte

    Date: . . . . . . . . .

  • Abstract

    Poker provides an excellent testbed for studying decision-making under conditions of uncertainty. There are many benefits to be gained from designing and experimenting with poker programs. It is a game of imperfect knowledge, where multiple competing agents must understand estimation, prediction, risk management, deception, counter-deception, and agent modeling. New evaluation techniques for estimating the strength and potential of a poker hand are presented. This thesis describes the implementation of a program that successfully handles all aspects of the game, and uses adaptive opponent modeling to improve performance.

  • Acknowledgements

    The author would like to acknowledge the following:

    A big thanks to Jonathan Schaeffer, for being an excellent supervisor.
    Darse Billings, for all his time, ideas and insightful input.
    Duane Szafron, for contributing to the poker research group.
    The National Sciences and Engineering Research Council, for providing financial support.

    As well as the following people who helped make things easier (with or without their knowledge):

    The authors of the fast poker hand evaluation library: Clifford T. Matthews, Roy T. Hashimoto, Keith Miyake and Mat Hostetter. Without this library the implementation would certainly have been slower. ftp://ftp.csua.berkeley.edu/pub/rec.gambling/poker/poker.tar.gz.
    Todd Mummert, the author of the dealer programs that make the IRC poker server possible. http://www.cs.cmu.edu/People/mummert/public/ircbot.html.
    Michael Maurer, the author of the PERL player program that was the basis of Loki's interface to the IRC dealer program. http://nova.stanford.edu/maurer/r.g/.
    All the other original authors who helped pioneer interest in IRC poker-playing programs and provided us with some additional data: Greg Wohletz, Stephen How, Greg Reynolds and others. An extra thanks to Greg Reynolds for the very handy Windows poker client (GPkr), which made watching Loki play a lot more user-friendly. http://webusers.anet-stl.com/gregr/.

  • Contents

    1 Introduction
    2 Poker
      2.1 Playing a Game
        2.1.1 Ante
        2.1.2 The Deal
        2.1.3 Betting
        2.1.4 Showdown
      2.2 Texas Hold'em
    3 How Humans Play Poker
      3.1 Hand Strength and Potential
      3.2 Opponent Modeling
      3.3 Position
      3.4 Odds
        3.4.1 Pot Odds
        3.4.2 Implied Odds and Reverse Implied Odds
        3.4.3 Effective Odds
      3.5 Playing Style
      3.6 Deception and Unpredictability
      3.7 Summary
    4 How Computers Play Poker
      4.1 Expert Systems
      4.2 Game-Theoretic Optimal Strategies
      4.3 Simulation and Enumeration
      4.4 Findler's Work
      4.5 The Gala System
      4.6 Hobbyists
      4.7 Architecture
      4.8 Summary
    5 Hand Evaluation
      5.1 Pre-Flop Evaluation
      5.2 Hand Strength
        5.2.1 Multi-Player Considerations
      5.3 Hand Potential
        5.3.1 Multi-Player Considerations
      5.4 Summary
    6 Betting Strategy
      6.1 Pre-Flop Betting Strategy
      6.2 Basic Post-Flop Betting Strategy
      6.3 Effective Hand Strength
      6.4 Semi-Bluffing
      6.5 Calling With Pot Odds
      6.6 Calling With Showdown Odds
      6.7 Other Strategies
      6.8 Summary
    7 Opponent Modeling
      7.1 Representation
        7.1.1 Weight Array
        7.1.2 Action Frequencies
      7.2 Learning
        7.2.1 Re-Weighting System
        7.2.2 Pre-Flop Re-Weighting
        7.2.3 Post-Flop Re-Weighting
        7.2.4 Modeling Abstraction
      7.3 Using the Model
        7.3.1 The Field Array
      7.4 Summary
    8 Experiments
      8.1 Self-Play Simulations
      8.2 Other Experiments
      8.3 Betting Strategy Experiments
      8.4 Opponent Modeling Experiments
        8.4.1 Generic Opponent Modeling (GOM)
        8.4.2 Specific Opponent Modeling (SOM)
      8.5 Summary
    9 Conclusions and Future Work
    Bibliography
    A Pre-Flop Income Rates
    B Expert-Defined Values
    C Glossary

  • List of Figures

    4.1 Example of an Expert Knowledge-Based System
    4.2 Branching Factor for Structured Betting Texas Hold'em With a Maximum of 4 Bets/Round
    4.3 Loki's Architecture
    5.1 HandStrength Calculation
    5.2 HandPotential2 Calculation
    6.1 Pre-Flop Betting Strategy
    6.2 Simple Betting Strategy
    6.3 Post-Flop Betting Strategy
    7.1 Re-weighting Function Code
    7.2 Re-weighting Function
    7.3 Post-Flop Re-weighting Function
    7.4 Re-weighting Function With <
    7.5 Central Opponent Modeling Function
    8.1 Betting Strategy Experiment
    8.2 Showdown Odds Experiment
    8.3 Generic Opponent Modeling Experiment
    8.4 Specific Opponent Modeling Experiment

  • List of Tables

    5.1 Unweighted potential of A♦-Q♣/3♥-4♣-J♥
    8.1 Seating assignments for tournament play (reproduced from [2])
    A.1 IR