CS 121: Lecture 23 Probability Review - Harvard University


Page 1: CS 121: Lecture 23 Probability Review - Harvard University

CS 121: Lecture 23
Probability Review

Madhu Sudan

https://madhu.seas.Harvard.edu/courses/Fall2020

Book: https://introtcs.org

How to contact us:
• The whole staff (faster response): CS 121 Piazza
• Only the course heads (slower): [email protected]

Page 2:

Announcements:
• Advanced section: Roei Tell, (De)Randomization
• 2nd feedback survey
• Homework 6 out today. Due 12/3/2020
• Section 11 week starts

Page 3:

Where we are:

Part I: Circuits: Finite computation, quantitative study

Part II: Automata: Infinite restricted computation, quantitative study

Part III: Turing Machines: Infinite computation, qualitative study

Part IV: Efficient Computation: Infinite computation, quantitative study

Part V: Randomized computation: Extending studies to non-classical algorithms

Page 4:

Review of course so far

• Circuits:
  • Compute every finite function … but no infinite function
  • Measure of complexity: SIZE. ALL_n ⊆ SIZE(O(2^n / n)); ∃f ∉ SIZE(o(2^n / n))
• Finite Automata:
  • Compute some infinite functions (all regular functions)
  • Do not compute many (easy) functions (e.g. EQ(x, y) = 1 ⇔ x = y)
• Turing Machines:
  • Compute everything computable (definition/thesis)
  • HALT is not computable
• Efficient Computation:
  • P: polytime computable Boolean functions, our notion of efficiently computable
  • NP: efficiently verifiable Boolean functions. Our wishlist …
  • NP-complete: hardest problems in NP: 3SAT, ISET, MaxCut, EU3SAT, Subset Sum

Key techniques (one per part above): Counting; Pumping/Pigeonhole; Diagonalization and Reductions; Polytime Reductions

Page 5:

Last Module: Challenges to STCT

• Strong Turing-Church Thesis: Everything physically computable in polynomial time can be computed in polynomial time on a Turing Machine.
• Challenges:
  • Randomized algorithms: already being run by our computers: weather simulation, market prediction, …
  • Quantum algorithms
• Status:
  • Randomized: may not add power?
  • Quantum: may add power?
  • Still can be studied with the same tools. New classes: BPP, BQP
    (B: bounded error; P: Probabilistic; Q: Quantum)

Page 6:

Today: Review of Basic Probability

• Sample space
• Events
• Union/intersection/negation (AND/OR/NOT of events)
• Random variables
• Expectation
• Concentration / tail bounds

Page 7:

Throughout this lecture, the probabilistic experiment is tossing n independent unbiased coins.

Equivalently: choose x ~ {0,1}^n uniformly at random.

Page 8:

Events

Fix the sample space to be x ~ {0,1}^n.

An event is a set A ⊆ {0,1}^n. The probability that A happens is Pr[A] = |A| / 2^n.

Example: If x ~ {0,1}^3, then Pr[x0 = 1] = 4/8 = 1/2.
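The slides contain no code, but the example can be sanity-checked by listing all of {0,1}^3. A minimal Python sketch (illustrative only, in exact arithmetic):

```python
from fractions import Fraction
from itertools import product

# Sample space: all 2^3 outcomes of three coin tosses.
space = list(product([0, 1], repeat=3))

# Event A = {x : x0 = 1}; Pr[A] = |A| / 2^n for the uniform distribution.
A = [x for x in space if x[0] == 1]
pr_A = Fraction(len(A), len(space))
print(pr_A)  # 1/2
```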

Page 9:

[Figure: binary decision tree over the 8 outcomes 000, 001, 010, 011, 100, 101, 110, 111]

Q: Let n = 3, A = {x0 = 1}, B = {x0 + x1 + x2 ≥ 2}, C = {x0 + x1 + x2 = 1 mod 2}.
What are (i) Pr[B], (ii) Pr[C], (iii) Pr[A ∩ B], (iv) Pr[A ∩ C], (v) Pr[B ∩ C]?

Page 10:

A: Pr[B] = 4/8 = 1/2, Pr[C] = 4/8 = 1/2, Pr[A ∩ B] = 3/8, Pr[A ∩ C] = 2/8 = 1/4, Pr[B ∩ C] = 1/8.

Page 11:

Compare Pr[A ∩ B] vs Pr[A]·Pr[B], and Pr[A ∩ C] vs Pr[A]·Pr[C].
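Both comparisons can be settled by enumerating the 8 outcomes. A Python sketch (illustrative, not from the slides) computes the probabilities exactly and checks the products:

```python
from fractions import Fraction
from itertools import product

space = list(product([0, 1], repeat=3))

def pr(event):
    # Pr[E] = |E| / 2^n for the uniform sample space.
    return Fraction(sum(1 for x in space if event(x)), len(space))

A = lambda x: x[0] == 1
B = lambda x: sum(x) >= 2
C = lambda x: sum(x) % 2 == 1

pr_A, pr_B, pr_C = pr(A), pr(B), pr(C)
pr_AB = pr(lambda x: A(x) and B(x))
pr_AC = pr(lambda x: A(x) and C(x))
pr_BC = pr(lambda x: B(x) and C(x))

# A and C are independent; A and B are not.
print(pr_AB == pr_A * pr_B)  # False
print(pr_AC == pr_A * pr_C)  # True
```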

Page 12:

Operations on events

Pr[A or B happens] = Pr[A ∪ B]
Pr[A and B happen] = Pr[A ∩ B]
Pr[A doesn't happen] = Pr[¬A] = Pr[{0,1}^n ∖ A] = 1 - Pr[A]

Q: Prove the union bound: Pr[A ∪ B] ≤ Pr[A] + Pr[B]

Example: Pr[A] = 4/25, Pr[B] = 6/25, Pr[A ∪ B] = 9/25
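For a small sample space the union bound can even be checked exhaustively. This Python sketch (an illustration, not part of the lecture) verifies Pr[A ∪ B] ≤ Pr[A] + Pr[B] for every pair of events over {0,1}^3:

```python
from fractions import Fraction
from itertools import product

space = list(product([0, 1], repeat=3))
# Represent an event as a tuple of 8 booleans (membership indicator);
# there are 2^8 = 256 events in total.
events = list(product([False, True], repeat=len(space)))

def pr(ind):
    return Fraction(sum(ind), len(space))

union_bound_holds = all(
    pr([a or b for a, b in zip(A, B)]) <= pr(A) + pr(B)
    for A in events for B in events
)
print(union_bound_holds)  # True
```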

Page 13:

Independence

Informal: Two events A, B are independent if knowing that A happened gives no information on whether B happened.
Formal: Two events A, B are independent if Pr[A ∩ B] = Pr[A]·Pr[B].

Equivalently: Pr[B | A] = Pr[B]. "Conditioning is the soul of statistics."

Q: Which pairs are independent? [Figure: three example pairs (i), (ii), (iii)]

Page 14:

More than 2 events

Formal: Three events A, B, C are independent if every pair (A, B), (A, C), (B, C) is independent and Pr[A ∩ B ∩ C] = Pr[A]·Pr[B]·Pr[C].

[Figure: two examples (i), (ii)]

Page 15:

Random variables

Assign a number to every outcome of the coins. Formally, an r.v. is X : {0,1}^n → ℝ.

Example: X(x) = x0 + x1 + x2 (values on the 8 outcomes: 0, 1, 1, 2, 1, 2, 2, 3)

v | Pr[X = v]
0 | 1/8
1 | 3/8
2 | 3/8
3 | 1/8

Page 16:

Expectation

Average value of X: E[X] = Σ_{x ∈ {0,1}^n} 2^{-n}·X(x) = Σ_{v ∈ ℝ} v·Pr[X = v]

Example: X(x) = x0 + x1 + x2, with Pr[X = v] = 1/8, 3/8, 3/8, 1/8 for v = 0, 1, 2, 3.

Q: What is E[X]?  0·(1/8) + 1·(3/8) + 2·(3/8) + 3·(1/8) = 12/8 = 1.5
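Both formulas for E[X] can be evaluated mechanically. A short Python sketch (illustrative only) computes the example both ways in exact arithmetic:

```python
from fractions import Fraction
from itertools import product

space = list(product([0, 1], repeat=3))
X = lambda x: sum(x)  # X(x) = x0 + x1 + x2

# E[X] as the uniform average 2^{-n} * sum_x X(x).
e_avg = sum(Fraction(X(x), len(space)) for x in space)

# E[X] as sum_v v * Pr[X = v], via the distribution table.
pmf = {}
for x in space:
    pmf[X(x)] = pmf.get(X(x), 0) + Fraction(1, len(space))
e_pmf = sum(v * p for v, p in pmf.items())

print(e_avg, e_pmf)  # both equal 3/2
```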

Page 17:

Linearity of expectation

Lemma: E[X + Y] = E[X] + E[Y]

Corollary: E[x0 + x1 + x2] = E[x0] + E[x1] + E[x2] = 1.5

Proof: E[X + Y] = 2^{-n} Σ_x (X(x) + Y(x)) = 2^{-n} Σ_x X(x) + 2^{-n} Σ_x Y(x) = E[X] + E[Y]

Page 18:

Independent random variables

Def: X, Y are independent if the events {X = u} and {Y = v} are independent for all u, v.

Def: X0, …, X_{k-1} are independent if the events {X0 = v0}, …, {X_{k-1} = v_{k-1}} are mutually independent for all v0, …, v_{k-1}; i.e., for all v0, …, v_{k-1} and all S ⊆ [k]:

Pr[∧_{i ∈ S} (X_i = v_i)] = Π_{i ∈ S} Pr[X_i = v_i]

Q: Let x ~ {0,1}^n. Let X0 = x0, X1 = x1, …, X_{n-1} = x_{n-1}.
Let Y0 = Y1 = ⋯ = Y_{n-1} = x0.

Are X0, …, X_{n-1} independent?

Are Y0, …, Y_{n-1} independent?

Page 19:

Independence and concentration

Q: Let x ~ {0,1}^n. Let X0 = x0, X1 = x1, …, X_{n-1} = x_{n-1}, and let Y0 = Y1 = ⋯ = Y_{n-1} = x0.
Let X = X0 + ⋯ + X_{n-1} and Y = Y0 + ⋯ + Y_{n-1}.

Compute E[X] and E[Y].

For n = 100, estimate Pr[Y ∉ [0.4n, 0.6n]] and Pr[X ∉ [0.4n, 0.6n]].

Pages 20-26:

Sums of independent random variables

Pr[X ∉ [0.4, 0.6]·n] as n grows (binomial histograms from https://www.stat.berkeley.edu/~stark/Java/Html/BinHist.htm):

n   | Pr[X ∉ [0.4, 0.6]·n]
5   | 37.5%
10  | 34.4%
20  | 26.3%
50  | 11.9%
100 | 3.6%
200 | 0.4%
400 | 0.01%

Pages 27-28:

Concentration bounds

X a sum of n independent random variables ⇒ approximately "Normal/Gaussian/bell curve":

Pr[X ∉ [0.99, 1.01]·E[X]] < exp(-δ·n)

Otherwise: weaker bounds. In concentrated r.v.'s, expectation, median, mode, etc. are ≈ the same.

Chernoff Bound: Let X0, …, X_{n-1} be i.i.d. (independent & identically distributed) r.v.'s with X_i ∈ [0,1]. Then if X = X0 + ⋯ + X_{n-1} and a = E[X_i], for every ε > 0,

Pr[|X - n·a| > ε·n] < 2·exp(-2ε²·n) < exp(-ε²·n) if n > 1/ε²

Page 29:

Simplest Bound: Markov's Inequality

Q: Suppose that the average age in a neighborhood is 20. Prove that at most 1/4 of the residents are 80 or older.

A: Suppose otherwise: more than 1/4 of the residents are 80 or older. Those residents alone contribute more than (1/4)·80 = 20 to the average, so the average exceeds 20. Contradiction.

Thm (Markov's Inequality): Let X be a non-negative r.v. and μ = E[X]. Then for every k > 1, Pr[X ≥ k·μ] ≤ 1/k.
Proof: Same as the question above.
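The argument can be mirrored on a toy data set. A Python sketch (illustrative; the ages are made up to hit an average of exactly 20):

```python
from fractions import Fraction

# Markov: for non-negative X with mean mu, Pr[X >= k*mu] <= 1/k.
# Hypothetical "neighborhood" whose average age is exactly 20.
ages = [80, 0, 0, 0, 20]  # mean = 100/5 = 20
mu = Fraction(sum(ages), len(ages))
k = 4  # threshold k*mu = 80

frac_at_least = Fraction(sum(1 for a in ages if a >= k * mu), len(ages))
print(frac_at_least <= Fraction(1, k))  # True: 1/5 <= 1/4
```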

Page 30:

Variance & Chebyshev

If X is an r.v. with μ = E[X], then Var[X] = E[(X - μ)²] = E[X²] - E[X]², and σ(X) = √Var[X].

Chebyshev's Inequality: For every r.v. X,

Pr[X is ≥ k standard deviations from μ] ≤ 1/k²

(Proof: Markov on Y = (X - μ)².)

Compare with X = Σ X_i for i.i.d. X_i, or other r.v.'s well approximated by a Normal, where

Pr[X is ≥ k standard deviations from μ] ≈ exp(-k²)
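Chebyshev can be verified exhaustively for the running example X = x0 + x1 + x2. This Python sketch (illustrative) stays in exact arithmetic by comparing squared deviations instead of taking a square root:

```python
from fractions import Fraction
from itertools import product

# Check Pr[|X - mu| >= k*sigma] <= 1/k^2 for X = x0+x1+x2 on {0,1}^3.
space = list(product([0, 1], repeat=3))
values = [sum(x) for x in space]
n = len(space)

mu = Fraction(sum(values), n)
var = Fraction(sum((v - mu) ** 2 for v in values), n)  # Var = E[(X-mu)^2]

def chebyshev_holds(k):
    # |X - mu| >= k*sigma  <=>  (X - mu)^2 >= k^2 * var
    p = Fraction(sum(1 for v in values if (v - mu) ** 2 >= k * k * var), n)
    return p <= Fraction(1, k * k)

print(var, all(chebyshev_holds(k) for k in [1, 2, 3]))  # 3/4 True
```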

Page 31:

Next Lectures

• Randomized Algorithms
  • Some examples
• Randomized Complexity Classes
  • BPTIME(T(n)), BPP
  • Properties of randomized computation (reducing error, …)
