
Post on 19-Aug-2020


Bachelor of Ecole Polytechnique
Computational Mathematics, year 2, semester 1

Lecturer: Lucas Gerin (send mail) (mailto:[email protected])

Combinatorics 1: Combinatorics in real life

Table of contents

Champions League 2018-19

Coincidences in Probability

The birthday paradox

The lottery coincidence

Champions League 2018-19

    # execute this part to modify the css style
    from IPython.core.display import HTML
    def css_styling():
        styles = open("./style/custom2.css").read()
        return HTML(styles)
    css_styling()

    ## loading python libraries

    # necessary to display plots inline:
    %matplotlib inline

    # load the libraries
    import matplotlib.pyplot as plt # 2D plotting library
    import numpy as np              # package for scientific computing

    from math import *              # package for mathematics (pi, arctan, sqrt, factorial ...)

The round of 16 in the UEFA Champions League (football) consists of $8$ games involving the 16 teams which qualified as $1$st and $2$nd of each of the eight groups in the group stage. The groups are numbered $0, 1, 2, \dots, 7$.

Here are the 16 teams involved in the 2018-19 round of 16, together with their groups and countries:

| # Group | Team which ended 1st | Country | Team which ended 2nd | Country |
|---|---|---|---|---|
| 0 | Dortmund | Germany | Atletico Madrid | Spain |
| 1 | Barcelona | Spain | Tottenham | England |
| 2 | Paris | France | Liverpool | England |
| 3 | Porto | Portugal | Schalke | Germany |
| 4 | Bayern | Germany | Ajax | Netherlands |
| 5 | Manchester City | England | Lyon | France |
| 6 | Real Madrid | Spain | Roma | Italy |
| 7 | Juventus | Italy | Manchester United | England |

The mechanism for the round of $16$ is as follows:

1. Every team which ended 1st plays against one team which ended 2nd.
2. Two teams of the same group cannot play against each other.
3. Two teams of the same country cannot play against each other.

The draw of the round of $16$ is picked uniformly among all the configurations which satisfy the above constraints. The aim of this lab session is to recover the odds that can be found online:

[Image: table of the odds of each possible game]

(source: Twitter @2010MisterChip, Dec.2018)

(https://twitter.com/2010MisterChip/status/1072973885014970368/photo/1)

The table must be interpreted in the following way: if we pick uniformly at random an admissible configuration for the round of $16$, then with probability $0.1359$ Roma will play against Porto (observe that with probability $0$ Porto will play against Schalke, since they were both in Group #3).

1. Preliminaries: Configurations without constraints

A configuration is a one-to-one correspondence between the two sets

$$\{\text{Teams which ended 1st}\} \to \{\text{Teams which ended 2nd}\}.$$

Later we will encode configurations by permutations of the set $\{0, 1, \dots, 7\}$. Therefore we need to be able to generate all the permutations of a given set.
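To get a feeling for the size of the problem, here is a minimal sketch that enumerates the unconstrained configurations with the standard-library function `itertools.permutations` (this is only a cross-check, not the recursive function asked for in the exercise; the name `all_configurations` is chosen here for illustration). Without any constraint there are $8! = 40320$ configurations.

```python
from itertools import permutations
from math import factorial

# Every unconstrained configuration is a permutation of the group labels 0..7.
all_configurations = list(permutations(range(8)))

print(len(all_configurations))   # 40320
print(factorial(8))              # 40320
print(all_configurations[0])     # (0, 1, 2, 3, 4, 5, 6, 7)
```

Note that `itertools.permutations` yields tuples, whereas the exercise expects lists.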

Do it yourself. Write a function Permutations(List) which returns the list of all the permutations of the elements of List. For instance, one should have

    Permutations([7,1,4])

    [[7, 1, 4], [7, 4, 1], [1, 7, 4], [1, 4, 7], [4, 7, 1], [4, 1, 7]]

(You must use recursive programming.)

    def Permutations(List):
        if len(List)==1:
            return [list(List)] # a single element has a single permutation
        else:
            Output=[] # Temporary output
            for i in List:
                SubList=list(List) # We copy 'List'
                SubList.remove(i)  # We remove 'i' from 'SubList'
                for SubPermutation in Permutations(SubList):
                    Output.append([i]+SubPermutation) # We add lists of the form [i,SubPermutation]
            return Output

    # Test: all permutations of [7,1,4]
    print(Permutations([7,1,4]))

    [[7, 1, 4], [7, 4, 1], [1, 7, 4], [1, 4, 7], [4, 7, 1], [4, 1, 7]]

2. Checking the constraints

Let $\sigma = (\sigma(0), \sigma(1), \dots, \sigma(7))$ be a permutation of $\{0, 1, \dots, 7\}$. We associate to $\sigma$ a configuration of $8$ games in the following way:

Team 1st of Group #$0$ plays against Team 2nd of Group #$\sigma(0)$,

Team 1st of Group #$1$ plays against Team 2nd of Group #$\sigma(1)$,

...

Team 1st of Group #$7$ plays against Team 2nd of Group #$\sigma(7)$.

For example, if $\sigma =$ [7,2,5,0,1,4,3,6] then

Dortmund (1st of Group #$0$) plays against Manchester Utd (2nd of Group #$7$),

Barcelona (1st of Group #$1$) plays against Liverpool (2nd of Group #$2$),

...

Juventus (1st of Group #$7$) plays against Roma (2nd of Group #$6$).

We see that this configuration does not match the required constraints since Juventus and Roma (both from Italy) are not supposed to play against each other.
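The encoding above can be made concrete with a short sketch that turns a permutation into the list of the 8 games. The helper `describe_configuration` and the two name lists are hypothetical, introduced here only for illustration; the team names are copied from the table of the round of 16.

```python
# Team names copied from the table of the round of 16 (index = group number).
first_placed = ['Dortmund', 'Barcelona', 'Paris', 'Porto',
                'Bayern', 'Manchester City', 'Real Madrid', 'Juventus']
second_placed = ['Atletico Madrid', 'Tottenham', 'Liverpool', 'Schalke',
                 'Ajax', 'Lyon', 'Roma', 'Manchester United']

def describe_configuration(sigma):
    """Return the 8 games encoded by the permutation sigma (hypothetical helper)."""
    return [(first_placed[i], second_placed[sigma[i]]) for i in range(8)]

games = describe_configuration([7, 2, 5, 0, 1, 4, 3, 6])
for team_1st, team_2nd in games:
    print(team_1st + ' plays against ' + team_2nd)
```

The first line printed is "Dortmund plays against Manchester United" and the last one is "Juventus plays against Roma", which is the forbidden Italian game mentioned above.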

Do it yourself. Create a function AdmissibleConfiguration() which returns True if the input is an admissible configuration, and False if it is not admissible. If possible, it should return the explanation of why it is not admissible.

To save you time we have already created two lists Teams1st and Teams2d, ordered according to their groups.

    Teams1st=[['Dortmund','Germany'],['Barcelona','Spain'],['Paris SG','France'],
              ['Porto','Portugal'],['Bayern Munchen','Germany'],['Manchester City','England'],
              ['Real Madrid','Spain'],['Juventus','Italy']]

    Teams2d=[['Atletico Madrid','Spain'],['Tottenham','England'],['Liverpool','England'],
             ['Schalke','Germany'],['Ajax','Nederland'],['Lyon','France'],
             ['Roma','Italy'],['Manchester United','England']]

    # A few examples :
    print('How to use the lists Teams1st and Teams2d:')
    print('1st Team of Group #3 : '+str(Teams1st[3]))
    print('Country of 2d Team of Group #5 : '+str(Teams2d[5][1]))
    print('---------------')

    def AdmissibleConfiguration(Config):
        # returns "True" if and only if Config is admissible
        for i in range(8):
            if Teams1st[i][1]==Teams2d[Config[i]][1]:
                #print('False (same country): '+str(Teams1st[i][0])+' plays '+str(Teams2d[Config[i]][0]))
                return False
            elif Config[i]==i:
                #print('False (same group): '+str(Teams1st[i][0])+' plays '+str(Teams2d[Config[i]][0]))
                return False
        return True

    # A test:
    print('Test of AdmissibleConfiguration:')
    print('[0,6,2,7,1,4,3,5] is admissible? Answer: '+str(AdmissibleConfiguration([0,6,2,7,1,4,3,5])))
    print('[7,2,5,0,1,4,3,6] is admissible? Answer: '+str(AdmissibleConfiguration([7,2,5,0,1,4,3,6])))
    print('[7,2,6,0,1,4,3,5] is admissible? Answer: '+str(AdmissibleConfiguration([7,2,6,0,1,4,3,5])))

    How to use the lists Teams1st and Teams2d:
    1st Team of Group #3 : ['Porto', 'Portugal']
    Country of 2d Team of Group #5 : France
    ---------------
    Test of AdmissibleConfiguration:
    [0,6,2,7,1,4,3,5] is admissible? Answer: False
    [7,2,5,0,1,4,3,6] is admissible? Answer: False
    [7,2,6,0,1,4,3,5] is admissible? Answer: True

3. Computing probabilities

Do it yourself. Let $\sigma$ be a uniform permutation of $\{0, 1, \dots, 7\}$, and let $A$ denote the event "At the round of $16$, Roma plays against Porto". Write $\mathbb{P}(A)$ as a conditional probability, using the random variable $\sigma$.

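Answers. Because of the assumption

    The draw of the round of $16$ is picked uniformly among all the configurations which satisfy the above constraints

we have that (Porto is 1st of Group #3 and Roma is 2nd of Group #6)

$$\begin{aligned}
\mathbb{P}(A) &= \frac{\text{Number of adm. config. } s \text{ such that } s(3) = 6}{\text{Number of adm. config.}} \\
&= \frac{\text{Number of adm. config. } s \text{ such that } s(3) = 6}{\text{Number of config.}} \times \frac{\text{Number of config.}}{\text{Number of adm. config.}} \\
&= \mathbb{P}(\sigma(3) = 6 \cap \sigma \text{ is admissible}) \times \frac{1}{\mathbb{P}(\sigma \text{ is admissible})} \\
&= \mathbb{P}(\sigma(3) = 6 \mid \sigma \text{ is admissible}).
\end{aligned}$$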
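The identity between a ratio of counts and a conditional probability can be checked on a toy version of the problem. The sketch below is an illustration under simplified assumptions, not the actual draw: it uses only 4 groups and only the "same group" constraint, so that admissible configurations are exactly the permutations without fixed point. The function name `is_derangement` is chosen here for illustration.

```python
from fractions import Fraction
from itertools import permutations

def is_derangement(sigma):
    # Toy admissibility rule: nobody plays the team of its own group.
    return all(sigma[i] != i for i in range(len(sigma)))

configs = list(permutations(range(4)))             # 4! = 24 configurations
admissible = [s for s in configs if is_derangement(s)]
favourable = [s for s in admissible if s[0] == 1]  # event {sigma(0) = 1}

# Ratio of counts among admissible configurations ...
p_ratio = Fraction(len(favourable), len(admissible))
# ... equals P(sigma(0)=1 and admissible) / P(admissible) for a uniform sigma.
p_joint = Fraction(len(favourable), len(configs))
p_adm = Fraction(len(admissible), len(configs))

print(len(admissible), len(favourable))   # 9 3
print(p_ratio, p_joint / p_adm)           # 1/3 1/3
```

Working with `Fraction` keeps the comparison exact, so the two expressions can be tested for equality without rounding issues.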

Do it yourself. Write a script which computes

- the number of admissible configurations (to check your result: you should find $3694$),
- the number of admissible configurations such that Roma plays against Porto,
- the probability that Roma plays Porto.

    NumberOfAdmissibleConfig=0
    NumberOfAdmissibleConfigRomaVsPorto=0

    # Permutations() is the function written in Section 1
    for Config in Permutations(list(range(8))):
        if AdmissibleConfiguration(Config)==True:
            NumberOfAdmissibleConfig=NumberOfAdmissibleConfig+1
            if Config[3]==6: # Porto (1st of Group #3) plays Roma (2d of Group #6)
                NumberOfAdmissibleConfigRomaVsPorto=NumberOfAdmissibleConfigRomaVsPorto+1

    print('Number of admissible config. = '+str(NumberOfAdmissibleConfig))
    print('Number of admissible config where Roma plays Porto = '+str(NumberOfAdmissibleConfigRomaVsPorto))

    Ratio=NumberOfAdmissibleConfigRomaVsPorto/(NumberOfAdmissibleConfig+0.0)
    print('Probability that Roma plays Porto = '+str(np.round(Ratio,4)))

    Number of admissible config. = 3694
    Number of admissible config where Roma plays Porto = 502
    Probability that Roma plays Porto = 0.1359

Do it yourself. Write a script which computes the $8 \times 8$ table of all probabilities.

    NumberOfAdmissibleConfig=0
    MatrixOfNumbers=np.zeros([8,8]) # MatrixOfNumbers[a,b] will be the number of config in which a plays against b

    for Config in Permutations(list(range(8))):
        if AdmissibleConfiguration(Config)==True:
            NumberOfAdmissibleConfig=NumberOfAdmissibleConfig+1
            # If Config is admissible then all the 8 games of Config increase by one
            for i in range(8):
                MatrixOfNumbers[i,Config[i]]=MatrixOfNumbers[i,Config[i]]+1

    #print(NumberOfAdmissibleConfig)
    MatrixOfProbabilities=np.round(MatrixOfNumbers/NumberOfAdmissibleConfig,4)
    print(MatrixOfProbabilities)

    [[ 0.      0.176   0.176   0.      0.1375  0.176   0.1586  0.176 ]
     [ 0.      0.      0.1727  0.183   0.1381  0.1727  0.1608  0.1727]
     [ 0.1884  0.1684  0.      0.1825  0.1364  0.      0.1576  0.1668]
     [ 0.1586  0.1478  0.1467  0.      0.1175  0.1467  0.1359  0.1467]
     [ 0.1787  0.1678  0.1665  0.      0.      0.1665  0.1538  0.1668]
     [ 0.2891  0.      0.      0.2777  0.1998  0.      0.2334  0.    ]
     [ 0.      0.1722  0.1719  0.177   0.1359  0.1719  0.      0.1711]
     [ 0.1852  0.1678  0.1662  0.1798  0.1348  0.1662  0.      0.    ]]

4. Bad luck for Germany

Do it yourself. Football specialists considered that a bad draw for Germany would be

    Bayern Munchen vs Atletico Madrid and Dortmund vs Liverpool.

1. Write a script which computes the probability that both events occur.
2. Are these two events independent?

Answers. We consider the events

$$E = \{\text{Bayern Munchen plays Atletico Madrid}\}, \qquad F = \{\text{Dortmund plays Liverpool}\}.$$

1. According to the script below, $\mathbb{P}(E \cap F) = 125/3694 \approx 0.0338$.
2. If $E, F$ were independent, then one would have $\mathbb{P}(E \cap F) = \mathbb{P}(E)\mathbb{P}(F)$. Here $\mathbb{P}(E \cap F) = 0.0338$ whereas $\mathbb{P}(E)\mathbb{P}(F) = 0.0315$, so the two events are not independent.

(NB: As the LHS is greater than the RHS one says that these two events are positively correlated. This is really bad luck for Germany.)

    NumberOfAdmissibleConfig=3694 # computed earlier
    NumberOfAdmissibleConfigBadForGermany=0

    # Permutations() is the function written in Section 1
    for Config in Permutations(list(range(8))):
        if AdmissibleConfiguration(Config)==True:
            if Config[4]==0 and Config[0]==2:
                NumberOfAdmissibleConfigBadForGermany=NumberOfAdmissibleConfigBadForGermany+1

    print('Number of admissible config. = '+str(NumberOfAdmissibleConfig))
    print('Number of bad config for Germany = '+str(NumberOfAdmissibleConfigBadForGermany))

    Ratio=np.round(NumberOfAdmissibleConfigBadForGermany/(NumberOfAdmissibleConfig+0.0),4)
    print('Probability of bad luck for Germany = '+str(Ratio))

    # If these events were independent, then we would have
    # P(bad luck)= P(Bayern Munchen vs Atletico Madrid) x P(Dortmund vs Liverpool)
    Product=np.round(MatrixOfProbabilities[4,0]*MatrixOfProbabilities[0,2],4)
    print('If independence: '+str(Product))

    Number of admissible config. = 3694
    Number of bad config for Germany = 125
    Probability of bad luck for Germany = 0.0338
    If independence: 0.0315
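The comparison above is made on rounded values. As a complement, here is a self-contained sketch which redoes it with exact fractions. It assumes the same country data as the lists Teams1st and Teams2d and uses `itertools.permutations` in place of the recursive function; the names `is_admissible`, `countries_1st` and `countries_2nd` are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import permutations

# Countries copied from the lists Teams1st and Teams2d (index = group number).
countries_1st = ['Germany', 'Spain', 'France', 'Portugal',
                 'Germany', 'England', 'Spain', 'Italy']
countries_2nd = ['Spain', 'England', 'England', 'Germany',
                 'Nederland', 'France', 'Italy', 'England']

def is_admissible(sigma):
    # No game inside a group, no game between two teams of the same country.
    return all(sigma[i] != i and countries_1st[i] != countries_2nd[sigma[i]]
               for i in range(8))

admissible = [s for s in permutations(range(8)) if is_admissible(s)]
total = len(admissible)

count_E = sum(1 for s in admissible if s[4] == 0)    # Bayern vs Atletico
count_F = sum(1 for s in admissible if s[0] == 2)    # Dortmund vs Liverpool
count_EF = sum(1 for s in admissible if s[4] == 0 and s[0] == 2)

p_E, p_F, p_EF = (Fraction(c, total) for c in (count_E, count_F, count_EF))
print(total, count_EF)
print(float(p_EF), float(p_E * p_F))
print(p_EF > p_E * p_F)   # positive correlation if True
```

If the data matches the lists used in the scripts above, the printed counts should agree with the outputs reported there.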

Coincidences in Probability

1. The birthday paradox

We consider the following problem. Consider a group of $n \ge 2$ people; we assume that their birthdays $X_1, \dots, X_n$ are uniformly distributed and independent in $\{1, 2, \dots, k\}$, with $k = 365$. The birthday paradox asks for the probability of the event

$$E_{n,k} = \{\text{there exist } i \ne j,\ 1 \le i, j \le n;\ X_i = X_j\}.$$

Obviously we have that $\mathbb{P}(E_{n,365}) = 1$ as soon as $n > 365$ (with more people than days, two birthdays must coincide). The so-called paradox is that a high probability is reached for quite small values of $n$.

Do it yourself. Let $F_{n,k}$ be the complementary event of $E_{n,k}$.

1. Compute $\mathbb{P}(F_{1,k})$ and $\mathbb{P}(F_{2,k})$. (Justify your answer for $F_{2,k}$ carefully.)
2. Compute $\mathbb{P}(F_{n,k} \mid F_{n-1,k})$, and deduce the formulas for $\mathbb{P}(F_{n,k})$, $\mathbb{P}(E_{n,k})$.

Answers.

1. We obviously have $\mathbb{P}(F_{1,k}) = 1$. One writes

$$\begin{aligned}
\mathbb{P}(F_{2,k}) &= \mathbb{P}(X_1 \ne X_2) \\
&= \sum_{i=1}^{k} \mathbb{P}(X_1 = i, X_1 \ne X_2) \quad \text{(law of total probabilities)} \\
&= \sum_{i=1}^{k} \mathbb{P}(X_1 = i, X_2 \ne i) \\
&= \sum_{i=1}^{k} \mathbb{P}(X_1 = i)\,\mathbb{P}(X_2 \ne i) \quad \text{(independence)} \\
&= \sum_{i=1}^{k} \frac{1}{k} \cdot \frac{k-1}{k} = \frac{k-1}{k} \quad \text{(the summand does not depend on } i\text{)}.
\end{aligned}$$

Therefore $\mathbb{P}(F_{2,k}) = (k-1)/k$.

2. If the event $F_{n-1,k}$ occurs then all the $X_i$'s are distinct up to $i = n-1$. Then $F_{n,k}$ occurs if and only if $X_n$ takes one of the $k - (n-1)$ remaining values:

$$\mathbb{P}(F_{n,k} \mid F_{n-1,k}) = \frac{k - (n-1)}{k}.$$

By induction we easily obtain that

$$\mathbb{P}(F_{n,k}) = \frac{k}{k} \times \frac{k-1}{k} \times \frac{k-2}{k} \times \cdots \times \frac{k - (n-1)}{k}$$

and $\mathbb{P}(E_{n,k}) = 1 - \mathbb{P}(F_{n,k})$.
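The product formula can be checked against a brute-force enumeration for very small values of $n$ and $k$. The following sketch (with function names chosen here for illustration) lists all $k^n$ equally likely birthday vectors and counts those with pairwise distinct entries, using exact fractions.

```python
from fractions import Fraction
from itertools import product

def prob_all_distinct_formula(n, k):
    # P(F_{n,k}) = (k/k) * ((k-1)/k) * ... * ((k-(n-1))/k)
    p = Fraction(1)
    for j in range(n):
        p *= Fraction(k - j, k)
    return p

def prob_all_distinct_bruteforce(n, k):
    # Enumerate the k**n equally likely birthday vectors (only for tiny n, k).
    outcomes = list(product(range(1, k + 1), repeat=n))
    distinct = sum(1 for x in outcomes if len(set(x)) == n)
    return Fraction(distinct, len(outcomes))

for n, k in [(2, 5), (3, 5), (4, 6)]:
    assert prob_all_distinct_formula(n, k) == prob_all_distinct_bruteforce(n, k)

print(prob_all_distinct_formula(3, 5))        # 12/25
print(1 - prob_all_distinct_formula(3, 5))    # 13/25
```

The enumeration is only feasible for tiny parameters; for $k = 365$ one has to rely on the formula.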

Do it yourself. Write a function that takes $n, k$ as inputs and returns $\mathbb{P}(E_{n,k})$.

    def TwoIdenticalBirthdays(n,k):
        # returns the probability P(E_{n,k})
        Vector=np.arange(k-n+1,k+1) # computes [k-n+1,...,k]
        Quotient=Vector/(k+0.0)     # '+0.0' forces the float division
        Product=np.prod(Quotient)
        return 1-Product

    # Test : (for n=8,k=365 this should return 0.0743...)
    print('For n=8, two identical birthdays with probability '+str(TwoIdenticalBirthdays(8,365)))

    For n=8, two identical birthdays with probability 0.0743352923517

Do it yourself.

1. Plot $n \mapsto \mathbb{P}(E_{n,365})$ for $n = 2$ to $n = 100$.
2. Find the smallest $n$ such that $\mathbb{P}(E_{n,365}) \ge 3/4$.

    # Question 1
    BirthdayParadox = [TwoIdenticalBirthdays(n,365) for n in range(2,100,3)]
    plt.plot(range(2,100,3),BirthdayParadox,'o-')
    plt.xlabel('Size $n$ of the group'),plt.ylabel('Probability')
    plt.title('Probability in the birthday paradox')
    plt.show()

    # Question 2
    # BirthdayParadox[i] is the probability for n = i+1
    BirthdayParadox = [TwoIdenticalBirthdays(n,365) for n in range(1,365)]
    i=1
    while BirthdayParadox[i]<0.999:
        i=i+1
    print('-----------------')
    print('Question 2')
    print('For n = '+str(i)+', we have '+str(TwoIdenticalBirthdays(i,365))+' chance of 2 identical birthdays')
    print('For n = '+str(i+1)+', we have '+str(TwoIdenticalBirthdays(i+1,365))+' chance of 2 identical birthdays')
    print('-----------------')

    -----------------
    Question 2
    For n = 69, we have 0.998963666308 chance of 2 identical birthdays
    For n = 70, we have 0.999159575965 chance of 2 identical birthdays
    -----------------

Answers. 2) The script above is run with the threshold $0.999$: this level is first reached at $n = 70$. With the threshold $3/4$ of the question, the same computation gives $n = 32$: there is more than a $75$% chance as soon as $n \ge 32$.
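The search for the smallest group size can be written so that the threshold is a parameter. This is a minimal sketch in plain Python (no NumPy), with function names chosen here for illustration; it recomputes $\mathbb{P}(E_{n,365})$ from the product formula.

```python
def prob_shared_birthday(n, k=365):
    # P(E_{n,k}) = 1 - prod_{j=0}^{n-1} (k-j)/k
    p_distinct = 1.0
    for j in range(n):
        p_distinct *= (k - j) / k
    return 1.0 - p_distinct

def smallest_group_size(threshold, k=365):
    # Smallest n such that P(E_{n,k}) >= threshold (requires threshold <= 1).
    n = 1
    while prob_shared_birthday(n, k) < threshold:
        n += 1
    return n

print(smallest_group_size(0.5))    # 23
print(smallest_group_size(0.75))   # 32
print(smallest_group_size(0.999))  # 70
```

The loop always terminates for a threshold at most $1$, since the probability equals $1$ once $n > k$.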

Bonus: 2. The lottery coincidence

(Inspired by The North-Carolina Lottery Coincidence (Leonard Stefanski) (https://www4.stat.ncsu.edu/~stefanski/NC%20Lottery%20Coincidence.pdf).)

On July 9th, 2007, the North Carolina Cash 5 lottery numbers came up $4$, $21$, $23$, $34$, $39$. Two days later (the lottery runs every day), the same five numbers came up again. This seems very unlikely; the aim of this exercise is to show that it is not that extraordinary.

The rules of Cash 5 are the following: every day five distinct numbers are picked uniformly (order does not matter) between $1$ and $39$. More formally, at each drawing we are given a random variable $X$ uniform in the set $C_5$, where $C_5$ is the set of all the $\binom{39}{5} = 575757$ combinations.

As a warm-up we will first estimate the probability that the same combination is picked twice in two days (instead of twice in three days).
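The number of combinations can be checked in two lines. The sketch below uses `math.comb`, which is available from Python 3.8 on, and compares it with the product formula $39 \cdot 38 \cdot 37 \cdot 36 \cdot 35 / 5!$.

```python
from math import comb, factorial

# Number of ways to choose 5 distinct numbers among 39, order ignored.
k = comb(39, 5)
print(k)                                       # 575757

# Same value from the product formula 39*38*37*36*35 / 5!
print(39 * 38 * 37 * 36 * 35 // factorial(5))  # 575757

# Probability that a given combination comes up on a given day.
print(1 / k)
```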

Do it yourself. Let $X_1, X_2, \dots$ be a sequence of independent random variables uniform in $C_5$. Put $k = \mathrm{card}(C_5) = \binom{39}{5}$.

Let $A_n$ denote the event

$$A_n = \{X_n \ne X_{n+1}\}.$$

1. Compute $\mathbb{P}(A_1)$.
2. Compute $\mathbb{P}(A_{n-1} \mid A_1 \cap A_2 \cap \cdots \cap A_{n-2})$.
3. Let $n \ge 2$; compute the probability $p_n$ that in $n$ days there are no two consecutive drawings which are identical.

Answers.

1. This is the birthday paradox with $n = 2$: $\mathbb{P}(A_1) = \frac{k-1}{k}$.

2. Assume that the events $A_1, A_2, \dots, A_{n-2}$ occur. The same reasoning as before shows that, no matter the values of $X_1, \dots, X_{n-1}$, the event $A_{n-1} = \{X_{n-1} \ne X_n\}$ occurs with probability $(k-1)/k$.

3. We have that

$$p_n = \mathbb{P}(A_1 \text{ and } A_2 \text{ and } A_3 \text{ and } \dots \text{ and } A_{n-1}).$$

By induction

$$p_n = \left(\frac{k-1}{k}\right)^{n-1}.$$

Do it yourself. In the cell below, write a function NotTwoConsecutiveIdenticalDrawings(n) which takes $n$ as input and returns $p_n$.

    k=575757+0.0 # Number of combinations
    def NotTwoConsecutiveIdenticalDrawings(n):
        # returns the probability p_n
        return (1-1/k)**(n-1)

    # Test: for n=10000 this should return 0.98278...
    NotTwoConsecutiveIdenticalDrawings(10000)

    0.9827832155669857

Fine, we obtain that the probability of having two identical successive drawings is indeed very small (less than $2$%) as long as, say, $n \le 10000$ (about $27$ years of daily lotteries).

We now turn to the actual lottery problem: estimating the probability that there is the same drawing twice in three days.

Do it yourself. Let $B_n$ denote the event

$$B_n = \{X_n, X_{n+1}, X_{n+2} \text{ are all distinct}\}.$$

1. Write $\mathbb{P}(B_1)$ in terms of $k = \binom{39}{5}$.
2. Compute $\mathbb{P}(B_{n-2} \mid B_1 \cap B_2 \cap \cdots \cap B_{n-3})$.
3. Deduce from above the probability $q_n$ that, in $n \ge 3$ days, the same drawing never occurs twice within three consecutive days.

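Answers.

1. This is just the birthday paradox for $n = 3$:

$$\mathbb{P}(B_1) = \frac{(k-1)(k-2)}{k^2}.$$

2. Assume that $B_1, B_2, \dots, B_{n-3}$ occur. In particular, we have $B_{n-3}$, i.e. $X_{n-3}, X_{n-2}, X_{n-1}$ are all distinct. Therefore $B_{n-2}$ occurs if and only if $X_n \ne X_{n-1}$ and $X_n \ne X_{n-2}$, which happens with probability $(k-2)/k$. Finally,

$$\mathbb{P}(B_{n-2} \mid B_1 \cap B_2 \cap \cdots \cap B_{n-3}) = \frac{k-2}{k}.$$

3. We have that

$$q_n = \mathbb{P}(B_1 \cap \cdots \cap B_{n-2}).$$

Let us prove by induction that for every $n \ge 3$

$$q_n = \frac{(k-1)(k-2)}{k^2} \left(\frac{k-2}{k}\right)^{n-3}.$$

For $n = 3$ this is Question 1. For $n \ge 4$,

$$\begin{aligned}
q_n &= \mathbb{P}(B_1 \cap \cdots \cap B_{n-2}) \\
&= \mathbb{P}(B_1 \cap \cdots \cap B_{n-2} \mid B_1 \cap \cdots \cap B_{n-3})\,\mathbb{P}(B_1 \cap \cdots \cap B_{n-3}) \\
&= \frac{k-2}{k}\, q_{n-1} \\
&= \frac{k-2}{k} \cdot \frac{(k-1)(k-2)}{k^2} \left(\frac{k-2}{k}\right)^{n-4} = \frac{(k-1)(k-2)^{n-2}}{k^{n-1}}.
\end{aligned}$$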
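The closed form $q_n = \frac{k-1}{k}\left(\frac{k-2}{k}\right)^{n-2}$ obtained by induction can be checked by brute force on a tiny alphabet. The sketch below (function names chosen here for illustration) enumerates all $k^n$ sequences of drawings and counts those in which every window of three consecutive drawings has distinct values.

```python
from fractions import Fraction
from itertools import product

def q_formula(n, k):
    # q_n = ((k-1)/k) * ((k-2)/k)**(n-2), for n >= 3
    return Fraction(k - 1, k) * Fraction(k - 2, k) ** (n - 2)

def q_bruteforce(n, k):
    # Fraction of the k**n sequences in which every window of three
    # consecutive drawings consists of distinct values (tiny n, k only).
    good = 0
    for x in product(range(k), repeat=n):
        if all(len({x[i], x[i + 1], x[i + 2]}) == 3 for i in range(n - 2)):
            good += 1
    return Fraction(good, k ** n)

for n, k in [(3, 4), (4, 4), (5, 5), (6, 3)]:
    assert q_formula(n, k) == q_bruteforce(n, k)

print(q_formula(3, 4))   # 3/8
```

For the real lottery ($k = 575757$) the enumeration is out of reach, which is why the induction is needed.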

Do it yourself. Write a function NotTwoIdenticalDrawingsInThreeDays(n) which takes $n$ as input and returns $q_n$.

To check your result:

    np.round(NotTwoIdenticalDrawingsInThreeDays(10000),4)

    0.9659

    k=575757+0.0
    def NotTwoIdenticalDrawingsInThreeDays(n):
        # returns the probability q_n
        return ((k-1)/k)*((k-2)/k)**(n-2)

    # Test: for n=10000 this should return 0.9659 (and 0.840566... for n=50000)
    print(np.round(NotTwoIdenticalDrawingsInThreeDays(10000),4))

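Do it yourself. Assume that lotteries similar to the Cash 5 Lottery take place $1200$ times in five years, in $20$ cities across the US.

Compute the probability that for five years there is at least one lottery in which there are twice the same drawing in three consecutive days. Write the answer in terms of NotTwoIdenticalDrawingsInThreeDays and compute the numerical value in the cell below.

Answers. We want to compute

$$\begin{aligned}
\mathbb{P}\Big(\bigcup_{c:\ \text{city}} \{\text{Two equal drawings in } c \text{ in } 1200 \text{ days}\}\Big)
&= 1 - \mathbb{P}\Big(\bigcap_{c:\ \text{city}} \{\text{No two equal drawings in } c \text{ in } 1200 \text{ days}\}\Big) \\
&= 1 - \mathbb{P}(\text{No two equal drawings in a given city in } 1200 \text{ days})^{20} \\
&= 1 - (q_{1200})^{20},
\end{aligned}$$

where "two equal drawings" means twice the same drawing within three consecutive days, and where the second equality uses that the lotteries of the different cities are independent and identically distributed.

According to the script below, the answer is almost $8$%.

    1-(NotTwoIdenticalDrawingsInThreeDays(1200))**20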