double dummy hand evaluatoraldous/research/... · it is a classic and the most popular method of...

11
1 Double Dummy Hand Evaluator Zhenyang Zhang Advisor: David Aldous 1. Introduction The goal of this study is to analyze methods that already exist and come up with a new way of thinking about trick-taking values of different hands. Rather than making any conclusions on hand evaluation, I would like to be more cautious, and consider my model a “baby” version of a double dummy hand evaluator. I hope that this will bring up more thoughts about hand evaluation modeling in the future. I am using double dummy data from Matt Ginsberg’s double dummy library. One may argue that double dummy tricks are not “realistic” since sometimes results are not achievable in the real world plays. For example, double dummy tricks will always run a suit missing Q, while in reality players may guess the Q on the wrong side and lose one trick as a result. However, there are still a lot of advantages of using it. First, there are a lot of hands in Ginsberg’s library, 717102 hands in total. And in some sense double dummy tricks are more “pure”. We do not have to take into account if players made a mistake or used some at-table information (like hand gestures, facial expressions, etc.) during the play so that they won more or fewer tricks than expected. Also, expert players have inclination to bid games, which means that minor suit contracts will be less played and the result will be somehow biased, while double dummy results will not be affected. 2. Observations on existing methods Before we actually step into our model of evaluating hands, we wish to examine a few existing methods of evaluating hands. The four methods are high card points, distributional points, controls and losers. They are among the most popular methods of evaluating a hand that have been used by professional players. But through the below analysis we can see that these are not ideal ways of telling how good your hand is. -High Card Points It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge. Ace is counted as 4 points, King 3, Queen 2 and Jack 1. In total there are 40 points and a hand of more than 10 points is considered good.

Upload: others

Post on 06-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

1

DoubleDummyHandEvaluator

ZhenyangZhangAdvisor:DavidAldous

1.IntroductionThegoalofthisstudyistoanalyzemethodsthatalreadyexistandcomeupwithanewwayofthinkingabouttrick-takingvaluesofdifferenthands.Ratherthanmakinganyconclusionsonhandevaluation,Iwouldliketobemorecautious,andconsidermymodela“baby”versionofadoubledummyhandevaluator.Ihopethatthiswillbringupmorethoughtsabouthandevaluationmodelinginthefuture.IamusingdoubledummydatafromMattGinsberg’sdoubledummylibrary.Onemayarguethatdoubledummytricksarenot“realistic”sincesometimesresultsarenotachievableintherealworldplays.Forexample,doubledummytrickswillalwaysrunasuitmissingQ,whileinrealityplayersmayguesstheQonthewrongsideandloseonetrickasaresult.However,therearestillalotofadvantagesofusingit.First,therearealotofhandsinGinsberg’slibrary,717102handsintotal.Andinsomesensedoubledummytricksaremore“pure”.Wedonothavetotakeintoaccountifplayersmadeamistakeorusedsomeat-tableinformation(likehandgestures,facialexpressions,etc.)duringtheplaysothattheywonmoreorfewertricksthanexpected.Also,expertplayershaveinclinationtobidgames,whichmeansthatminorsuitcontractswillbelessplayedandtheresultwillbesomehowbiased,whiledoubledummyresultswillnotbeaffected.2.ObservationsonexistingmethodsBeforeweactuallystepintoourmodelofevaluatinghands,wewishtoexamineafewexistingmethodsofevaluatinghands.Thefourmethodsarehighcardpoints,distributionalpoints,controlsandlosers.Theyareamongthemostpopularmethodsofevaluatingahandthathavebeenusedbyprofessionalplayers.Butthroughthebelowanalysiswecanseethatthesearenotidealwaysoftellinghowgoodyourhandis.-HighCardPointsItisaclassicandthemostpopularmethodofevaluatingone’shandthateveryonewilllearnatthebeginningofplayingbridge.Aceiscountedas4points,King3,Queen2andJack1.Intotalthereare40pointsandahandofmorethan10pointsisconsideredgood.

Page 2: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

2

Hereisahistogramofhighcardpointsof1000randomsampleddeals.Wecanseethatthedistributionisroughlynormalandthusarobustmethod.However,boththeplotandcorrelationbetweenHCPandtricksimplythatthisisnotsoprecise.Andthisisexactlywhatmanybridgeexpertshaveargued:HCPdoesnottakehandshapeintoaccount;QueensandJacksareoverestimatedsincetheyarenotsopowerfulintherealplaysincetheyarelikelytobecoveredbyAcesandKings.

Histogram for hcp

hcp

Frequency

0 5 10 15 20 25

0200

400

600

Page 3: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

3

-DistributionpointsArevisedversionofHCPisdistributionpoint:thatistoadddistributionalvaluestooriginalhighcardpointsofahand.Adoubletonisworth1pointmore,singleton3andvoid5.SinceitbasicallyinheritsfromHCP,wecanseethatthedistributionisstillnormal-like.Thecorrelationofdistributionpointandnumberoftrickstowinisaround0.29,whichislargerthanbefore.ThereforeitisbetterthanHCPinsomesensewhilestillnotgoodenough.

0 5 10 15 20

24

68

1012

hcp vs tricks

hcp

tricks

Page 4: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

4

Histogram for distp

distp

Frequency

0 5 10 15 20 25 30

0100

200

300

400

500

600

700

Page 5: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

5

ControlsSincetherearecontroversialideasonvaluesofQueenandJacks,expertsreevaluatetheirhandsusingcontrolswhentheyarethinkingaboutslamcontracts.AceiscountedastwocontrolsandKingasone.Butfromthehistogramwecanseethatmosthandshave0-2controlsandareconsideredabadhand.BecauseitismorelikeababyversionofHCPnotconsideringQueensandJacks,thecorrelationbetweencontrolsandtrickswonarearound0.08whichimpliesitisnotagoodindicatorof

0 5 10 15 20 25

24

68

1012

distp vs tricks

distp

tricks

Page 6: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

6

trickwinningpower.

Histogram for control

control

Frequency

0 2 4 6 8 10

0200

400

600

800

Page 7: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

7

-LosersThisisthelastmethodwearegoingtolookat.Losersaresomehowmorecomplicatedtocount.Foreachsuit,justlookatthethreelargestcardsinthesuitarecounthowmanyofAceKingandQueenismissing.Forexample,AQhasoneloser,KQ432hasoneloser,andQ82hastwolosers.Themorelosersonehave,theworsehishandis.Thecorrelationisaround-0.2,adecentone,butstillnotgoodenough.

0 2 4 6 8 10

24

68

1012

controls vs tricks

control

tricks

Page 8: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

8

Histogram for losers

loser

Frequency

2 4 6 8 10 12

0200

400

600

800

Page 9: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

9

3.HandEvaluatorModelAftershowingthatexistingmethodsarenotidealweaponstouse,wenowwanttocomeupwithsomethingmore.Firstwewanttodecidewhichfactorswearelookingatinthemodel.Generallywehavefouraspectsthatweareconcerningaboutahandwhenwearebidding:notrumpcontractsandtrumpcontracts,anddefendinganddeclaringineachcase.Intheprogram,wearegoingtouseabbreviation,“NTOffense”,“NTDefense”,“TrumpOffense”and“TrumpDefense”torepresentthesefouraspects.Weassumethatthevalueofahandisthelinearsumofthevaluesofeachsuit,andthuswehavethefollowingequation,wheretisthenumberoftricksyouexpecttogetinacertainsuit.

2 4 6 8 10 12

24

68

1012

losers vs tricks

losers

tricks

Page 10: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

10

V (h) = t(s)s∈h∑

Weallknowthatalongersuitismorelikelytowinmoretrickssincewehavepotentialchancetorunthesuitandwintherestofthesmallcardsinthesuit.Thuswewishtonormalizeandreducetheeffectofthissuit.Wewishtotaketheexpectedwinningtricksofthesuitinthispattern(thatis,suitsthathavethesamehighcards),minustheexpectedwinningtricksamongallthesuitsofthesamelength.Theequationisstatedbelow:

t(s) = ave(s)− ave(length(s))

Basedonthesetwoequations,weareabletoanalyzetheexpectedtrickswecanwingiven a hand.Wewrote the code in python andwith an input, the hand analysisfunctionwouldprintoutexpectedtrickstowininfouroccasionsrespectively.

Forexample,Ijustrandomlypickedahandfrommyhandhistoryandwishtoknowhowgooditis.ThusIinputthehandinterminalandruntheprogram.

Wecanseefromtheoutputthatweareexpectedtowin7tricksinNTcontracts,and7.8tricksintrumpcontractsifwearedeclarer.Andweareexpectedtowin8.8tricksinNTcontractsand9tricksintrumpcontractsifwearedefending.Thisissuggestingthatourhandisbetterthanaverage,soweshoulddefinitelyopenthebidding.However,itisabetterhandindefense,andhasalotofpotentialindefendingopponents’contract.Sowearealsohappytodefendiftheopponentsstarttointerfereinthebidding.

Page 11: Double Dummy Hand Evaluatoraldous/Research/... · It is a classic and the most popular method of evaluating one’s hand that everyone will learn at the beginning of playing bridge

11

4.StrengthandWeaknessoftheModel:Thismodelusesthewholesetofdataoutofover700000hands,andthusthenumericalanalysisisindeedreliable.Fromtheaboveanalysisweareabletomakedecisionsinbiddingmoreconfidently.However,wehavenotconsideredpartner’shandinthemodel,whichmeansthatweeliminatehalfoftheinformation.Furthermore,duringthebidding,asweknowmoreandmoreaboutpartner’shand,weshouldbeabletodynamicallyconditionontheknownfacttoreevaluateourhand’svalue,butunfortunatelywehavenotfoundgoodwaystogetthatinvolved.Finally,inthemodel,weassumethelinearityofhonorsindifferentsuitssothatwecansumthemtogethertogetthefinalresult.However,intherealworld,atrumpsuitwillplayalargerrolethanothersuit,butwecannotdecidethetrumpsuitjustlookingatourhandsincepartner’shandisunclear.Alsowejustignorethecomplexcorrelationbetweenhonorsinasuitandthusitmakesthemodelinaccurate.Inconclusion,Iwillrecommendthismodelmoreasareferenceforreconsideringyourhandvaluethananaccuratetrickwinningpowerreportofasinglehand.5.ReferenceMattGinsberg’spaperonLawoftotaltricks:http://bocosan.tripod.com/ginsberg/total.HTMLExplanationsofbinaryfileofdoubledummylibrary:http://bocosan.tripod.com/ginsberg/library_notes.HTML