CS 121: Lecture 23: Probability Review
Madhu Sudan
https://madhu.seas.Harvard.edu/courses/Fall2020
Book: https://introtcs.org
How to contact us:
• The whole staff (faster response): CS 121 Piazza
• Only the course heads (slower): [email protected]
Announcements:
• Advanced section: Roei Tell, (De)Randomization
• 2nd feedback survey
• Homework 6 out today. Due 12/3/2020
• Section 11 week starts
Where we are:
• Part I: Circuits: Finite computation, quantitative study
• Part II: Automata: Infinite restricted computation, quantitative study
• Part III: Turing Machines: Infinite computation, qualitative study
• Part IV: Efficient Computation: Infinite computation, quantitative study
• Part V: Randomized computation: Extending studies to non-classical algorithms
Review of course so far
• Circuits:
  • Compute every finite function … but no infinite function
  • Measure of complexity: SIZE. ALL_n ⊆ SIZE(O(2^n/n)), but most f: {0,1}^n → {0,1} require size Ω(2^n/n)
• Finite Automata:
  • Compute some infinite functions (all regular functions)
  • Do not compute many (easy) functions (e.g. EQ(x, y) = 1 ⟺ x = y)
• Turing Machines:
  • Compute everything computable (definition/thesis)
  • HALT is not computable
• Efficient Computation:
  • P: polytime-computable Boolean functions (our notion of efficiently computable)
  • NP: efficiently verifiable Boolean functions. Our wishlist …
  • NP-complete: hardest problems in NP: 3SAT, ISET, MaxCut, EU3SAT, Subset Sum
Key proof techniques: Counting; Pumping/Pigeonhole; Diagonalization, Reductions; Polytime Reductions
Last Module: Challenges to the STCT
• Strong Turing-Church Thesis: everything physically computable in polynomial time can be computed in polynomial time on a Turing Machine.
• Challenges:
  • Randomized algorithms: already being run on our computers: weather simulation, market prediction, …
  • Quantum algorithms
• Status:
  • Randomized: may not add power?
  • Quantum: may add power?
  • Still can be studied with the same tools. New classes: BPP, BQP
  • (B: bounded error; P: probabilistic; Q: quantum)
Today: Review of Basic Probability
• Sample space
• Events
• Union/intersection/negation: AND/OR/NOT of events
• Random variables
• Expectation
• Concentration / tail bounds
Throughout this lecture
Probabilistic experiment: tossing n independent unbiased coins.
Equivalently: choose x ∼ {0,1}^n uniformly at random.
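This experiment is easy to sketch in Python (a minimal illustration; the name `sample` is chosen here, not from the course):

```python
import random

def sample(n):
    """Toss n independent unbiased coins, i.e. draw x uniformly from {0,1}^n."""
    return [random.randrange(2) for _ in range(n)]

x = sample(3)
assert len(x) == 3 and all(b in (0, 1) for b in x)
```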
Events
Fix the sample space to be x ∼ {0,1}^n.
An event is a set A ⊆ {0,1}^n. The probability that A happens is Pr[A] = |A| / 2^n.
Example: if x ∼ {0,1}^3, then Pr[x_0 = 1] = 4/8 = 1/2.
[Diagram: binary tree of the 8 outcomes 000, 001, 010, 011, 100, 101, 110, 111]
Q: Let n = 3, A = {x_0 = 1}, B = {x_0 + x_1 + x_2 ≥ 2}, C = {x_0 + x_1 + x_2 = 1 mod 2}.
What are (i) Pr[B], (ii) Pr[C], (iii) Pr[A ∩ B], (iv) Pr[A ∩ C], (v) Pr[B ∩ C]?
A: (i) Pr[B] = 4/8 = 1/2, (ii) Pr[C] = 4/8 = 1/2, (iii) Pr[A ∩ B] = 3/8, (iv) Pr[A ∩ C] = 2/8 = 1/4, (v) Pr[B ∩ C] = 1/8.
Compare: Pr[A ∩ B] vs Pr[A]·Pr[B], and Pr[A ∩ C] vs Pr[A]·Pr[C].
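These probabilities, and the comparison with the products, can be checked by brute-force enumeration of {0,1}^3 (a sketch; the helper `pr` is an ad-hoc name):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product([0, 1], repeat=3))      # the sample space {0,1}^3

def pr(event):
    """Pr[event] = |event| / 2^n for the uniform distribution on {0,1}^n."""
    return Fraction(sum(1 for x in outcomes if event(x)), len(outcomes))

A = lambda x: x[0] == 1                 # x_0 = 1
B = lambda x: sum(x) >= 2               # x_0 + x_1 + x_2 >= 2
C = lambda x: sum(x) % 2 == 1           # x_0 + x_1 + x_2 = 1 mod 2

assert pr(B) == Fraction(1, 2) and pr(C) == Fraction(1, 2)
assert pr(lambda x: A(x) and B(x)) == Fraction(3, 8)   # != Pr[A]Pr[B] = 1/4
assert pr(lambda x: A(x) and C(x)) == Fraction(1, 4)   # == Pr[A]Pr[C]
assert pr(lambda x: B(x) and C(x)) == Fraction(1, 8)   # != Pr[B]Pr[C] = 1/4
```

So A and C turn out to be independent, while the other two pairs are not.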
Operations on events
Pr[A or B happens] = Pr[A ∪ B]
Pr[A and B happen] = Pr[A ∩ B]
Pr[A doesn't happen] = Pr[¬A] = Pr[{0,1}^n ∖ A] = 1 − Pr[A]
Q: Prove the union bound: Pr[A ∪ B] ≤ Pr[A] + Pr[B]
Example: Pr[A] = 4/25, Pr[B] = 6/25, Pr[A ∪ B] = 9/25
Independence
Informal: two events A, B are independent if knowing that A happened gives no information on whether B happened.
Formal: two events A, B are independent if Pr[A ∩ B] = Pr[A]·Pr[B].
Equivalently: Pr[B | A] = Pr[B]. "Conditioning is the soul of statistics."
Q: Which pairs are independent? (i) (ii) (iii) [diagrams omitted]
More than 2 events
Informal: events are independent if knowing whether some of them happened gives no information on whether the others happened.
Formal: three events A, B, C are independent if every pair (A, B), (A, C), (B, C) is independent and Pr[A ∩ B ∩ C] = Pr[A]·Pr[B]·Pr[C].
[Examples (i), (ii): diagrams omitted]
Random variables
Assign a number to every outcome of the coins. Formally, an r.v. is X: {0,1}^n → ℝ.
Example: X(x) = x_0 + x_1 + x_2. On the outcomes 000, 001, 010, 011, 100, 101, 110, 111 it takes the values 0, 1, 1, 2, 1, 2, 2, 3.

v   Pr[X = v]
0   1/8
1   3/8
2   3/8
3   1/8
Expectation
Average value of X:  E[X] = Σ_{x∈{0,1}^n} 2^{−n}·X(x) = Σ_{v∈ℝ} v·Pr[X = v]
Example: X(x) = x_0 + x_1 + x_2, so Pr[X = 0] = 1/8, Pr[X = 1] = 3/8, Pr[X = 2] = 3/8, Pr[X = 3] = 1/8.
Q: What is E[X]?  0·(1/8) + 1·(3/8) + 2·(3/8) + 3·(1/8) = 12/8 = 1.5
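Both expressions for E[X] can be checked by enumeration (a sketch with ad-hoc names):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product([0, 1], repeat=3))
X = lambda x: sum(x)                              # X(x) = x_0 + x_1 + x_2

# Distribution table Pr[X = v]: {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
dist = {v: Fraction(c, len(outcomes)) for v, c in Counter(map(X, outcomes)).items()}

# E[X] computed both ways: average over outcomes, and sum over values.
e_outcomes = Fraction(sum(X(x) for x in outcomes), len(outcomes))
e_values = sum(v * p for v, p in dist.items())
assert e_outcomes == e_values == Fraction(3, 2)   # = 1.5
```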
Linearity of expectation
Lemma: E[X + Y] = E[X] + E[Y]
Corollary: E[x_0 + x_1 + x_2] = E[x_0] + E[x_1] + E[x_2] = 1.5
Proof:
E[X + Y] = 2^{−n} Σ_x (X(x) + Y(x)) = 2^{−n} Σ_x X(x) + 2^{−n} Σ_x Y(x) = E[X] + E[Y]
Independent random variables
Def: X, Y are independent if the events {X = u} and {Y = v} are independent for all u, v.
Def: X_0, …, X_{k−1} are independent if the events {X_0 = v_0}, …, {X_{k−1} = v_{k−1}} are independent for all v_0, …, v_{k−1}; i.e. for all v_0, …, v_{k−1} and every S ⊆ [k],
Pr[∧_{i∈S} X_i = v_i] = Π_{i∈S} Pr[X_i = v_i]
Q: Let x ∼ {0,1}^n. Let X_0 = x_0, X_1 = x_1, …, X_{n−1} = x_{n−1}.
Let Y_0 = Y_1 = ⋯ = Y_{n−1} = x_0.
Are X_0, …, X_{n−1} independent?
Are Y_0, …, Y_{n−1} independent?
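A brute-force check of both questions (a sketch): the joint probabilities of the X_i factor as a product for every assignment, while for the Y_i a single joint event already fails.

```python
from fractions import Fraction
from itertools import product

n = 3
outcomes = list(product([0, 1], repeat=n))

def pr(event):
    return Fraction(sum(1 for x in outcomes if event(x)), len(outcomes))

# X_i = x_i: the joint probability of any full assignment is (1/2)^n,
# the product of the individual probabilities -- independent.
for v in product([0, 1], repeat=n):
    assert pr(lambda x: all(x[i] == v[i] for i in range(n))) == Fraction(1, 2) ** n

# Y_i = x_0 for all i: if independent, Pr[Y_0 = 0 and Y_1 = 1] would be
# (1/2)*(1/2) = 1/4, but the event is impossible -- not independent.
assert pr(lambda x: x[0] == 0 and x[0] == 1) == 0
```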
Independence and concentration
Q: Let x ∼ {0,1}^n. Let X_0 = x_0, X_1 = x_1, …, X_{n−1} = x_{n−1}.
Let Y_0 = Y_1 = ⋯ = Y_{n−1} = x_0.
Let X = X_0 + ⋯ + X_{n−1}, Y = Y_0 + ⋯ + Y_{n−1}.
Compute E[X] and E[Y].
For n = 100, estimate Pr[X ∉ [0.4n, 0.6n]] and Pr[Y ∉ [0.4n, 0.6n]].
Sums of independent random variables
For X a sum of n independent unbiased bits:

n     Pr[X ∉ [0.4, 0.6]·n]
5     37.5%
10    34.4%
20    26.3%
50    11.9%
100   3.6%
200   0.4%
400   0.01%

(Histograms: https://www.stat.berkeley.edu/~stark/Java/Html/BinHist.htm)
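These percentages can be reproduced exactly for X ~ Binomial(n, 1/2) (a sketch; `tail_outside` is an ad-hoc name, and the printed values agree with the table up to rounding):

```python
from math import comb

def tail_outside(n):
    """Pr[X not in [0.4n, 0.6n]] for X ~ Binomial(n, 1/2), computed exactly."""
    # 0.4n <= k <= 0.6n  <=>  2n <= 5k <= 3n (exact integer arithmetic)
    inside = sum(comb(n, k) for k in range(n + 1) if 2 * n <= 5 * k <= 3 * n)
    return 1 - inside / 2 ** n

for n in (5, 10, 20, 50, 100, 200, 400):
    print(n, f"{tail_outside(n):.2%}")
```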
Concentration bounds
X a sum of n independent random variables ⇒ ≈ "Normal/Gaussian/bell curve":
Pr[X ∉ [0.99, 1.01]·E[X]] < exp(−δ·n)
Otherwise: weaker bounds.
For concentrated r.v.'s, expectation, median, mode, etc. are ≈ the same.
Chernoff Bound: Let X_0, …, X_{n−1} be i.i.d. (independent & identically distributed) r.v.'s with X_i ∈ [0, 1]. Then if X = X_0 + ⋯ + X_{n−1} and p = E[X_i], for every ε > 0,
Pr[|X − pn| > εn] < 2·exp(−2ε²·n) < exp(−ε²·n) if n > 1/ε²
Simplest Bound: Markov's Inequality
Q: Suppose that the average age in a neighborhood is 20. Prove that at most 1/4 of the residents are 80 or older.
A: Suppose otherwise, i.e. more than 1/4 of the residents are 80 or older. Then those residents alone contribute more than (1/4)·80 = 20 to the average age, so the average would exceed 20: a contradiction.
Thm (Markov's Inequality): Let X be a non-negative r.v. and μ = E[X]. Then for every k > 1, Pr[X ≥ k·μ] ≤ 1/k.
Proof: Same as the question above.
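The neighborhood argument on a made-up extreme example (the age list is hypothetical, chosen so the average is exactly 20):

```python
from fractions import Fraction

# Hypothetical neighborhood: exactly 1/4 of residents are 80, the rest are 0.
ages = [80, 80, 0, 0, 0, 0, 0, 0]
mu = Fraction(sum(ages), len(ages))
assert mu == 20

# Fraction aged >= 80 = Pr[X >= 4*mu]; Markov guarantees it is <= 1/4.
frac_80 = Fraction(sum(1 for a in ages if a >= 80), len(ages))
assert frac_80 == Fraction(1, 4)   # the bound Pr[X >= k*mu] <= 1/k is tight here (k = 4)
```

This shows Markov's inequality cannot be improved in general: the all-or-nothing distribution attains it with equality.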
Variance & Chebyshev
If X is an r.v. with μ = E[X], then Var[X] = E[(X − μ)²] = E[X²] − E[X]².
The standard deviation is σ(X) = √Var(X).
Chebyshev's Inequality: for every r.v. X,
Pr[X deviates from its mean by ≥ k standard deviations] ≤ 1/k²
(Proof: Markov applied to Y = (X − μ)².)
Compare with X = Σ_i X_i for i.i.d. X_i, or other r.v.'s well approximated by the Normal, where
Pr[X deviates from its mean by ≥ k standard deviations] ≈ exp(−k²)
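Chebyshev's bound can be checked against the exact tail for X ~ Binomial(100, 1/2), where μ = 50, Var[X] = n/4 = 25, and σ = 5 (a sketch; `dev_tail` is an ad-hoc name):

```python
from math import comb, sqrt

n = 100
mu, sigma = n / 2, sqrt(n / 4)               # Var = n/4 for a sum of n fair bits
pmf = [comb(n, i) / 2 ** n for i in range(n + 1)]

def dev_tail(k):
    """Pr[|X - mu| >= k*sigma], computed exactly from the binomial pmf."""
    return sum(p for i, p in enumerate(pmf) if abs(i - mu) >= k * sigma)

# Chebyshev holds at every k; for this near-Normal X the true tail is far smaller.
for k in (2, 3, 4):
    assert dev_tail(k) <= 1 / k ** 2
```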
Next Lectures
• Randomized Algorithms
  • Some examples
• Randomized Complexity Classes
  • BPTIME(T(n)), BPP
  • Properties of randomized computation (reducing error, …)