Chapter 1: Sample Space and Probability

Post on 26-Nov-2015


Probability

Tuesday, February 4, 14

What is Probability?

• Probability is a mathematical discipline akin to geometry.

• Formal Logic Content.

• Intuitive Background.

• Applications.

• “The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind.” James Clerk Maxwell (1850)

Why Study Probability?

[Figure: Probability and its connections to Machine Learning, CS Theory, Algorithms, Networks, and Information Theory.]

Probabilistic Model

• The sample space $\Omega$, which is the set of all possible outcomes of an experiment.

• The probability law, which assigns to a set $A$ of possible outcomes (also called an event) a nonnegative number $P(A)$ (called the probability of $A$) that encodes our knowledge or belief about the collective “likelihood” of the elements of $A$. The probability law must satisfy certain properties to be introduced shortly.

Probabilistic Model

[Figure: an experiment determines a sample space (the set of possible outcomes); the probability law assigns to events $A$ and $B$ the numbers $P(A)$ and $P(B)$.]

Set Theory

• Finite, Countable and Uncountable Sets.

• Set Operations.

• The Algebra of Sets.

Sets

A set is a collection of objects, which are the elements of the set. If $x$ is an element of $S$ we write $x \in S$; otherwise we write $x \notin S$.

Specifying Sets

• As a list of elements:

$$S = \{x_1, x_2, \dots, x_n\}$$

• With words:

the set of even natural numbers

• By specifying a rule or algorithm:

$$S = \{r \in \mathbb{Q} : r^2 < 2\}$$

Subsets

If every element of a set $S$ is also an element of a set $T$, we say that $S$ is a subset of $T$, and we write

$$S \subset T \quad \text{or} \quad T \supset S$$

If $S \subset T$ and $T \subset S$, the two sets are equal, and we write

$$S = T$$

The universal set, denoted by $\Omega$, contains all the objects of interest in a particular context.

Set Operations

• The complement of a set $S$, with respect to the universe $\Omega$, is the set of elements in $\Omega$ that do not belong to $S$:

$$S^c = \{x \in \Omega : x \notin S\}$$

• The union of two sets $S$ and $T$ is the set of all elements that belong to $S$ or $T$ (or both):

$$S \cup T = \{x \in \Omega : x \in S \text{ or } x \in T\}$$

• The intersection of two sets $S$ and $T$ is the set of all elements that belong to both $S$ and $T$:

$$S \cap T = \{x \in \Omega : x \in S \text{ and } x \in T\}$$

Algebra of Sets


De Morgan’s laws
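The set operations defined earlier and De Morgan's laws, $(S \cup T)^c = S^c \cap T^c$ and $(S \cap T)^c = S^c \cup T^c$, can be checked on a small example. A minimal Python sketch, assuming a hypothetical universe of the integers 0 to 9 and two arbitrary subsets (none of these values come from the slides):

```python
# Hypothetical universe and subsets, chosen only for illustration.
omega = set(range(10))
S = {0, 2, 4, 6, 8}
T = {0, 3, 6, 9}

def complement(A):
    """Complement of A with respect to the universe omega."""
    return omega - A

union = S | T            # elements in S or T (or both)
intersection = S & T     # elements in both S and T

# De Morgan's laws, checked on this particular example.
de_morgan_1 = complement(S | T) == complement(S) & complement(T)
de_morgan_2 = complement(S & T) == complement(S) | complement(T)
print(union, intersection, de_morgan_1, de_morgan_2)
```

A check on one example is not a proof, but it makes the two identities concrete.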


Functions

[Figure: a function mapping elements of its domain to elements of its range.]

Cardinality

{Thorin, Balin, Bifur, Bofur, Bombur, Dori, Fili, Gloin, Kili}

{Gruñon, Mocoso, Tímido, Mudito, Dormilón, Felíz, Sabio}

Power Sets and Cantor’s Theorem


Sample Space

• Different elements of the sample space should be distinct and mutually exclusive.

• The sample space must be collectively exhaustive.


Sample Space (Examples)

For a single toss of a coin the sample space $\Omega$ consists of two points:

$$\Omega = \{H, T\}$$

(We exclude possibilities like “the coin stands on edge”, “the coin disappears”, etc.)

For $n$ tosses of a coin the sample space is

$$\Omega = \{\omega : \omega = (a_1, \dots, a_n),\ a_i = H \text{ or } T\}$$

and the total number $N(\Omega)$ of outcomes is $2^n$.
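The sample space for $n$ coin tosses can be enumerated directly. A minimal Python sketch (the function name `coin_sample_space` is made up for this illustration):

```python
from itertools import product

def coin_sample_space(n):
    """Return all outcomes of n coin tosses as tuples of 'H' and 'T'."""
    return list(product("HT", repeat=n))

omega = coin_sample_space(3)
# The number of outcomes N(Omega) should equal 2**n.
print(len(omega), 2 ** 3)
```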

Sample Space (Examples)

[Figure: sample space for a pair of 4-sided rolls, shown as a 4 × 4 grid (1st roll against 2nd roll) and as a tree-based sequential description whose first branch ends in the leaves (1,1), (1,2), (1,3), (1,4).]

Probability Laws

1. (Non-negativity) $P(A) \geq 0$, for every event $A$.

2. (Additivity) If $A$ and $B$ are two disjoint events, then the probability of their union satisfies

$$P(A \cup B) = P(A) + P(B)$$

More generally, if the sample space has an infinite number of elements and $A_1, A_2, \dots$ is a sequence of disjoint events, then the probability of their union satisfies

$$P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$$

3. (Normalization) The probability of the entire sample space $\Omega$ is equal to 1, that is

$$P(\Omega) = 1$$

Properties

Let $A$, $B$ and $C$ be events.

1. $P(\emptyset) = 0$.

2. If $A \subseteq B$, then $P(A) \leq P(B)$.

3. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

4. $P(A \cup B) \leq P(A) + P(B)$.

Discrete Probability Laws

If the sample space consists of a finite number of possible outcomes, then the probability law is specified by the probabilities of the events that consist of a single element. In particular, the probability of any event $A = \{s_1, s_2, \dots, s_n\}$ is the sum of the probabilities of its elements:

$$P(A) = P(s_1) + P(s_2) + \cdots + P(s_n)$$

Discrete Uniform Probability Laws

If the sample space consists of $n$ possible outcomes which are equally likely (i.e. all single-element events have the same probability), then the probability of any event $A$ is given by

$$P(A) = \frac{N(A)}{n}$$

where $N(A)$ is the number of elements of $A$.

Discrete Uniform Probability Laws

[Figure: sample space for a pair of 4-sided rolls as a 4 × 4 grid (1st roll against 2nd roll), highlighting the events {the first roll is equal to the second} and {at least one roll is a 4}.]
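Under the discrete uniform law, the probabilities of the two events in the figure reduce to counting. A minimal Python sketch, assuming all 16 ordered pairs of rolls are equally likely (the helper `prob` is a name introduced here):

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of rolls of a 4-sided die (16 outcomes).
omega = list(product(range(1, 5), repeat=2))

def prob(event):
    """P(A) = N(A) / n, where event is a predicate on outcomes."""
    favorable = [w for w in omega if event(w)]
    return Fraction(len(favorable), len(omega))

p_equal = prob(lambda w: w[0] == w[1])   # the first roll is equal to the second
p_some_four = prob(lambda w: 4 in w)     # at least one roll is a 4
print(p_equal, p_some_four)
```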

Counting

Cardinality

Product Rule

Theorem (Pairs). With $n$ elements $a_1, a_2, \dots, a_n$ and $m$ elements $b_1, b_2, \dots, b_m$, it is possible to form $nm$ pairs $(a_i, b_j)$ containing one element from each group.

Generalized Product Rule

Theorem. Given $n_1$ elements $a_1, a_2, \dots, a_{n_1}$ and $n_2$ elements $b_1, b_2, \dots, b_{n_2}$, etc., up to $n_r$ elements $x_1, x_2, \dots, x_{n_r}$, it is possible to form $n_1 n_2 \cdots n_r$ ordered $r$-tuples $(a_{i_1}, b_{i_2}, \dots, x_{i_r})$ containing one element from each group.

[Figure: four successive stages with $n_1$, $n_2$, $n_3$ and $n_4$ choices respectively.]

Examples

• Throwing a die $r$ times. No 1 in $r$ throws?

• Display of flags. Suppose $r$ flags of different colors are to be displayed on $n$ poles in a row. In how many ways can this be done?

• Loops, Recursions, ...
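For the first example, the product rule suggests $5^r$ sequences without a 1 out of $6^r$ sequences in total. A minimal Python sketch that checks this by brute force for a small $r$ (the function name is made up for this illustration):

```python
from itertools import product

def count_no_one(r):
    """Count sequences of r die throws in which the face 1 never appears."""
    return sum(1 for throws in product(range(1, 7), repeat=r) if 1 not in throws)

r = 3
# Brute-force count next to the product-rule values 5**r and 6**r.
print(count_no_one(r), 5 ** r, 6 ** r)
```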

Ordered Samples

• Consider a set or population of $n$ elements $a_1, \dots, a_n$. Any ordered arrangement $(a_{i_1}, \dots, a_{i_k})$ of $k$ symbols is called an ordered sample of size $k$ drawn from our population.

– Sampling with replacement (repetitions are allowed).

– Sampling without replacement (repetitions are not allowed).

We now consider examples involving the selection of $k$ balls from an urn containing $n$ distinguishable balls.

Ordered Sample with Replacement

$$N(\Omega) = n^k$$

Ordered Sample without Replacement

$$N(\Omega) = (n)_k$$

where $(n)_k = n(n-1)\cdots(n-k+1)$.

Theorem. For a population of $n$ elements and a prescribed sample size $k$, there exist $n^k$ different samples with replacement and $(n)_k$ samples without replacement.

• When $k = n$ the sample is called a permutation of the elements of the population, and $(n)_n = n!$.

• Whenever we speak of random samples of fixed size $k$, the adjective random is to imply that all samples have the same probability, namely $n^{-k}$ in sampling with replacement and $1/(n)_k$ in sampling without replacement.

• If $n$ is large and $k$ is relatively small, the ratio $(n)_k / n^k$ is near unity, i.e. the two ways of sampling are practically equivalent.
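The two counts and their ratio are easy to check numerically. A minimal Python sketch, assuming Python 3.8+ for `math.perm`; the values of $n$ and $k$ are arbitrary:

```python
from itertools import permutations, product
from math import perm

n, k = 5, 3
population = range(n)

# Enumerate ordered samples explicitly and compare with the formulas.
with_replacement = len(list(product(population, repeat=k)))      # n**k
without_replacement = len(list(permutations(population, k)))     # (n)_k
print(with_replacement, n ** k)
print(without_replacement, perm(n, k))

# For large n and small k the ratio (n)_k / n**k is close to 1.
ratio = perm(1000, 3) / 1000 ** 3
print(ratio)
```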

Examples (Balls and Bins)

• Probability an element is not included in the sample?

• If $n$ balls are randomly placed into $n$ cells, what is the probability every cell is occupied?

[Figure: cells numbered 1 to 9.]

Birthday Paradox

• Throw a die 6 times. What is the probability all six faces appear?

• Elevator (10 floors, 7 people)
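These examples share one pattern: the probability that an ordered sample drawn with replacement has no repetition is $(n)_k / n^k$. A minimal Python sketch under stated assumptions: the elevator question is taken to ask for the probability that all 7 people get off at different floors, and the birthday case assumes 365 equally likely days and a group of 23 people (neither number is given in the slides).

```python
from fractions import Fraction
from math import perm

def prob_all_distinct(n, k):
    """Probability that k draws with replacement from n equally likely
    elements are all different: (n)_k / n**k."""
    return Fraction(perm(n, k), n ** k)

p_faces = prob_all_distinct(6, 6)             # all six faces in six throws
p_elevator = prob_all_distinct(10, 7)         # assumed: 7 people, 10 floors, all different
p_shared = 1 - prob_all_distinct(365, 23)     # assumed: 23 people, 365 days
print(p_faces, float(p_elevator), float(p_shared))
```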

The Sum Rule

Theorem. If $A_1, A_2, \dots, A_n$ are disjoint sets, then:

$$|A_1 \cup A_2 \cup \cdots \cup A_n| = |A_1| + |A_2| + \cdots + |A_n|$$

Counting Passwords

• On a certain computer system, a valid password is a sequence of between six and eight symbols. The first symbol must be a letter, which can be upper case or lower case, and the remaining symbols must be either letters or numbers.
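The password count combines the product rule (for each fixed length) with the sum rule (over the disjoint lengths 6, 7 and 8). A minimal Python sketch, assuming a 26-letter alphabet and the ten digits 0 to 9 (the slide does not state the alphabet):

```python
LETTERS = 26 * 2          # assumed: upper- and lower-case letters of a 26-letter alphabet
SYMBOLS = LETTERS + 10    # letters plus the digits 0-9

def count_passwords(min_len=6, max_len=8):
    """Sum rule over lengths; product rule within each length."""
    return sum(LETTERS * SYMBOLS ** (length - 1)
               for length in range(min_len, max_len + 1))

print(count_passwords())
```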

Subpopulations

• Two populations are considered different only if one contains an element not contained in the other.

We now consider examples involving the selection of $k$ balls from an urn containing $n$ distinguishable balls.

Unordered Sample without Replacement

$$N(\Omega) = \binom{n}{k}$$

Theorem. A population of $n$ elements possesses $\binom{n}{r}$ different sub-populations of size $r \leq n$.

$$\binom{n}{r} = \binom{n}{n-r}, \qquad \binom{n}{0} = 1, \qquad 0! = 1$$

The Division Rule

Theorem. If $f : A \to B$ is $k$-to-1, then:

$$|A| = k\,|B|$$

• Surprising as it may seem, many counting problems are made much easier by initially counting every item multiple times and then correcting the answer using the division rule.
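One way to see the division rule at work: the map sending an ordered sample without replacement to its underlying set is $k!$-to-1, so the number of $k$-element subsets is $(n)_k / k!$. A minimal Python sketch (the function name is introduced here for illustration):

```python
from itertools import permutations
from math import factorial

def count_subsets_by_division(n, k):
    """Count k-subsets of an n-set by overcounting and then dividing."""
    ordered = list(permutations(range(n), k))        # the set A, of size (n)_k
    images = {frozenset(s) for s in ordered}         # the set B of k-subsets
    # Each subset is the image of exactly k! ordered samples.
    assert len(ordered) == factorial(k) * len(images)
    return len(ordered) // factorial(k)

print(count_subsets_by_division(5, 2))
```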

Examples

• Hands of poker. What is the probability that a hand of poker contains five different face values?

• Occupancy problem. Probability that a specified cell contains exactly $k$ balls?
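For the poker question, one possible count is: choose 5 of the 13 face values, then one of 4 suits for each chosen value, and divide by the number of 5-card hands. A minimal Python sketch, assuming a standard 52-card deck and equally likely hands:

```python
from fractions import Fraction
from math import comb

# Choose 5 distinct face values, then a suit for each of the 5 cards.
favorable = comb(13, 5) * 4 ** 5
total = comb(52, 5)                     # all 5-card hands (unordered)
p_five_values = Fraction(favorable, total)
print(float(p_five_values))
```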

Examples (Balls and bins)

[Figure: cells numbered 1 to 9.]

Partitions

Theorem. Let $k_1, \dots, k_r$ be integers such that

$$k_1 + k_2 + \cdots + k_r = n, \qquad k_i \geq 0.$$

The number of ways in which a population of $n$ elements can be divided into $r$ ordered parts (partitioned into $r$ sub-populations) of which the first contains $k_1$ elements, the second $k_2$ elements, etc., is

$$\frac{n!}{k_1!\,k_2! \cdots k_r!}$$

Example: Sequences with repetitions

• How many sequences can be formed by permuting the letters in the 10-letter word BOOKKEEPER?
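The BOOKKEEPER question is an instance of the partition formula $n!/(k_1!\,k_2! \cdots k_r!)$, with one $k_i$ per distinct letter. A minimal Python sketch (the function name is made up for this illustration):

```python
from collections import Counter
from math import factorial

def distinct_arrangements(word):
    """Number of distinct sequences obtained by permuting the letters of word."""
    total = factorial(len(word))
    for multiplicity in Counter(word).values():
        total //= factorial(multiplicity)   # divide out permutations of equal letters
    return total

print(distinct_arrangements("BOOKKEEPER"))
```

Here the letter multiplicities are B:1, O:2, K:2, E:3, P:1, R:1, so the count is $10!/(2!\,2!\,3!)$.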

Example: Balls and Bins

[Figure: cells numbered 1 to 9.]

Example: Binomial Theorem

Theorem. For all $n \in \mathbb{N}$ and $a, b \in \mathbb{R}$:

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k$$

Example: Multinomial Theorem

Theorem. For all $n \in \mathbb{N}$ and $z_i \in \mathbb{R}$:

$$(z_1 + z_2 + \cdots + z_m)^n = \sum_{\substack{k_1, \dots, k_m \in \mathbb{N} \\ k_1 + \cdots + k_m = n}} \binom{n}{k_1, \dots, k_m} z_1^{k_1} z_2^{k_2} \cdots z_m^{k_m}$$

Combinatorial Proofs

• Symmetry

$$\binom{n}{k} = \binom{n}{n-k}$$

• Pascal’s Identity

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$

• $$\binom{3n}{n} = \sum_{r=0}^{n} \binom{n}{r} \binom{2n}{n-r}$$
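Before looking for a combinatorial proof, it can help to confirm the identities numerically. A minimal Python sketch using `math.comb` (Python 3.8+); it tests small cases only and proves nothing:

```python
from math import comb

def check_identities(n):
    """Return (symmetry, pascal, third) truth values for a given n >= 1."""
    symmetry = all(comb(n, k) == comb(n, n - k) for k in range(n + 1))
    pascal = all(comb(n, k) == comb(n - 1, k - 1) + comb(n - 1, k)
                 for k in range(1, n + 1))
    third = comb(3 * n, n) == sum(comb(n, r) * comb(2 * n, n - r)
                                  for r in range(n + 1))
    return symmetry, pascal, third

print([check_identities(n) for n in range(1, 6)])
```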

Conditional Probability

• Conditional probability provides us with a way to reason about the outcome of an experiment, based on partial information.

• We seek to construct a new probability law that takes into account the new information: a probability law that, for any event $A$, specifies the conditional probability of $A$ given $B$, denoted by

$$P(A \mid B)$$

• $P(A \mid B)$ must constitute a legitimate probability law.

• “The conditional probabilities must be consistent with our intuition in important special cases”

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

where we assume that $P(B) > 0$.

1. (Non-negativity) $P(A \mid B) \geq 0$, for every event $A$.

2. (Additivity) If $A_1$ and $A_2$ are two disjoint events, then the probability of their union satisfies

$$P(A_1 \cup A_2 \mid B) = P(A_1 \mid B) + P(A_2 \mid B)$$

More generally, if the sample space has an infinite number of elements and $A_1, A_2, \dots$ is a sequence of disjoint events, then the probability of their union satisfies

$$P(A_1 \cup A_2 \cup \cdots \mid B) = P(A_1 \mid B) + P(A_2 \mid B) + \cdots$$

3. (Normalization) The probability of the entire sample space $\Omega$ is equal to 1, that is

$$P(\Omega \mid B) = 1$$

Example

We toss a coin three successive times. Compute the conditional probability $P(A \mid B)$ when $A$ and $B$ are the events:

$$A = \{\text{more heads than tails come up}\}, \qquad B = \{\text{1st toss is a head}\}$$
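With equally likely outcomes, $P(A \mid B)$ is the fraction of outcomes in $B$ that also lie in $A$. A minimal Python sketch, assuming a fair coin so that all 8 outcomes are equally likely (the helper names are introduced here):

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=3))   # 8 equally likely outcomes

def cond_prob(A, B):
    """P(A | B) = N(A and B) / N(B) under the uniform law."""
    in_b = [w for w in omega if B(w)]
    return Fraction(sum(1 for w in in_b if A(w)), len(in_b))

more_heads = lambda w: w.count("H") > w.count("T")
first_head = lambda w: w[0] == "H"

p = cond_prob(more_heads, first_head)
print(p)
```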

Conditional Probability for Modeling

• When building models for experiments with a sequential character, it is natural to first specify conditional probabilities and then use them to determine unconditional probabilities:

$$P(A \cap B) = P(B)\,P(A \mid B)$$

Examples

If an aircraft is present in a certain area, a radar detects it and generates an alarm signal with probability 0.99. If an aircraft is not present, the radar generates a (false) alarm with probability 0.10. We assume that an aircraft is present with probability 0.005.

• What is the probability of no aircraft presence and a false alarm?

• What is the probability of aircraft presence and no detection?
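Both questions follow from the multiplication rule $P(A \cap B) = P(A)\,P(B \mid A)$. A minimal Python sketch with $A$ = {aircraft present} and $B$ = {alarm}; the function name is made up, and the prior is left as a parameter because the text states 0.005 while the accompanying tree figure is labeled with 0.05:

```python
def radar_joint(p_present, p_detect=0.99, p_false_alarm=0.10):
    """Return (P(no aircraft and alarm), P(aircraft and no alarm))."""
    p_false = (1 - p_present) * p_false_alarm    # P(A^c) * P(B | A^c)
    p_missed = p_present * (1 - p_detect)        # P(A) * P(B^c | A)
    return p_false, p_missed

print(radar_joint(0.005))   # prior as stated in the text
print(radar_joint(0.05))    # prior as labeled in the tree figure
```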

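Examples

[Figure: tree-based sequential description with $A$ = {aircraft present} and $B$ = {alarm}. Branch labels: $P(A) = 0.05$, $P(A^c) = 0.95$, $P(B \mid A) = 0.99$, $P(B^c \mid A) = 0.01$ (missed detection), $P(B \mid A^c) = 0.10$ (false alarm), $P(B^c \mid A^c) = 0.90$.]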

Multiplication Rule

Assuming that all of the conditioning events have positive probability, we have:

$$P\Big(\bigcap_{i=1}^{n} A_i\Big) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2) \cdots P\Big(A_n \,\Big|\, \bigcap_{i=1}^{n-1} A_i\Big)$$

Example

Three cards are drawn from an ordinary 52-card deck

without replacement.

• What is the probability that none of the three cards

is a heart, assuming that all triplets are equally likely?
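This example can be worked with the multiplication rule, taking $A_i$ = {the $i$-th card is not a heart}, and cross-checked by counting triplets. A minimal Python sketch (the function name and default arguments are introduced here for illustration):

```python
from fractions import Fraction
from math import comb

def prob_no_hearts(draws=3, deck=52, hearts=13):
    """Multiply P(A_1) P(A_2 | A_1) ... for drawing without replacement."""
    p = Fraction(1)
    non_hearts = deck - hearts
    for i in range(draws):
        # i non-hearts are already gone, so (non_hearts - i) remain among (deck - i) cards.
        p *= Fraction(non_hearts - i, deck - i)
    return p

p_sequential = prob_no_hearts()
p_counting = Fraction(comb(39, 3), comb(52, 3))   # equally likely triplets
print(p_sequential, p_counting)
```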


The Monty Hall Problem

[Figure: three doors numbered 1, 2, 3.]

D   O   P(D)   P(O|D)   P(D)P(O|D)   Keep   Switch
1   2   1/3    p        p/3          Win    Lose
1   3   1/3    1 − p    (1 − p)/3    Win    Lose
2   3   1/3    1        1/3          Lose   Win
3   2   1/3    1        1/3          Lose   Win
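The table can be summed row by row to compare the two strategies. A minimal Python sketch, assuming that D is the door hiding the prize, O is the door the host opens, the contestant initially picks door 1, and $p$ is the probability that the host opens door 2 when the prize is behind door 1 (the slides do not spell out these conventions):

```python
from fractions import Fraction

def monty_hall(p):
    """Return (P(win by keeping), P(win by switching)) from the table rows."""
    third = Fraction(1, 3)
    rows = [  # (D, O, P(D), P(O | D))
        (1, 2, third, p),
        (1, 3, third, 1 - p),
        (2, 3, third, Fraction(1)),
        (3, 2, third, Fraction(1)),
    ]
    keep = sum(pd * po for d, _, pd, po in rows if d == 1)     # keeping wins iff D = 1
    switch = sum(pd * po for d, _, pd, po in rows if d != 1)   # switching wins otherwise
    return keep, switch

print(monty_hall(Fraction(1, 2)))
```

Under these assumptions the result does not depend on $p$: keeping wins with probability 1/3 and switching with probability 2/3.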

Minimum Global Cut

• Global min cut. Given a connected, undirected

graph G = (V,E), find a cut (A,B) of minimum

cardinality.

• Applications. Identify clusters, network reliability,

TSP solvers.

Contraction algorithm. [Karger 1995]

• Pick an edge $e = (u, v)$ uniformly at random.

• Contract edge $e$:

– replace $u$ and $v$ by a single new super-node $w$

– preserve edges, updating endpoints of $u$ and $v$ to $w$

– keep parallel edges, but delete self-loops

• Repeat until the graph has just two nodes $v_1$ and $v_2$.

• Return the cut (all nodes that were contracted to form $v_1$).

[Figure: contracting edge u-v merges u and v into a single super-node w.]

Reference: Thore Husfeldt
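The contraction steps above can be sketched in a few lines. This is a minimal, unoptimized Python illustration, assuming a connected multigraph given as a list of edges; the function names, the number of repetitions and the seed are choices made here, not part of the slides. A single run may miss the minimum cut, so the sketch repeats it and keeps the smallest cut found.

```python
import random

def contract(edges, rng):
    """One run of the contraction algorithm on a connected multigraph.

    edges: list of (u, v) pairs. Returns (cut_size, side), where side is the
    set of original nodes merged into one of the two final super-nodes.
    """
    group = {x: {x} for e in edges for x in e}   # super-node -> original nodes
    edges = list(edges)
    while len(group) > 2:
        u, v = rng.choice(edges)                 # uniform over remaining edges
        group[u] |= group.pop(v)                 # merge v into u
        relabeled = [(u if a == v else a, u if b == v else b) for a, b in edges]
        # Keep parallel edges, delete self-loops.
        edges = [(a, b) for a, b in relabeled if a != b]
    side = next(iter(group.values()))
    return len(edges), side

def min_cut(edges, trials=200, seed=0):
    """Repeat the randomized contraction and return the best cut seen."""
    rng = random.Random(seed)
    return min((contract(edges, rng) for _ in range(trials)), key=lambda r: r[0])

# Hypothetical test graph: two triangles joined by the single edge (2, 3).
graph = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
size, side = min_cut(graph)
print(size, side)
```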

Total Probability Theorem

Let $A_1, A_2, \dots, A_n$ be events that form a partition of the sample space $\Omega$, that is

$$A_i \cap A_j = \emptyset \ \text{for } i \neq j \quad \text{and} \quad \bigcup_{i=1}^{n} A_i = \Omega,$$

and assume that $P(A_i) > 0$ for all $i$. Then, for any event $B$, we have

$$P(B) = P(A_1)\,P(B \mid A_1) + \cdots + P(A_n)\,P(B \mid A_n)$$

[Figure: the sample space partitioned into $A_1$, $A_2$, $A_3$, $A_4$, together with an event $B$.]

Example

You enter a chess tournament where your probability of winning is 0.3 against half the players (call them type I), 0.4 against a quarter of the players (call them type II), and 0.5 against the remaining quarter of the players (call them type III). You play a game against a randomly chosen opponent. What is the probability of winning?
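The chess question is a direct application of the total probability theorem, with $A_i$ = {opponent is of type $i$} and $B$ = {you win}. A minimal Python sketch (the function name is made up for this illustration):

```python
def total_probability(priors, likelihoods):
    """P(B) = sum_i P(A_i) P(B | A_i) for a partition A_1, ..., A_n."""
    return sum(p * l for p, l in zip(priors, likelihoods))

# Types I, II, III: share of players and probability of beating each type.
p_win = total_probability([0.5, 0.25, 0.25], [0.3, 0.4, 0.5])
print(p_win)
```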

Inference and Bayes’ Theorem

Let $A_1, A_2, \dots, A_n$ be disjoint events that form a partition of the sample space, and assume that $P(A_i) > 0$ for all $i$. Then, for any event $B$ such that $P(B) > 0$, we have

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{P(B)} = \frac{P(A_i)\,P(B \mid A_i)}{P(A_1)\,P(B \mid A_1) + \cdots + P(A_n)\,P(B \mid A_n)}$$

Here $P(A_i)$ is the prior and $P(A_i \mid B)$ is the posterior.

Example

• Probability that an aircraft is present?

• Suppose that you won. What is the probability you

had an opponent of type I?

• The False-Positive Puzzle.

Examples

If an aircraft is present in a certain area, a radar detects it and generates an alarm signal with probability 0.99. If an aircraft is not present, the radar generates a (false) alarm with probability 0.10. We assume that an aircraft is present with probability 0.005.

• What is the probability of aircraft presence given that the alarm went off?
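Bayes' theorem with the two-part partition {aircraft present, aircraft not present} answers this question. A minimal Python sketch; the function name is made up, and the prior is a parameter because the text gives 0.005 while the earlier tree figure is labeled with 0.05:

```python
def posterior_present(p_present, p_detect=0.99, p_false_alarm=0.10):
    """P(aircraft present | alarm) via Bayes' theorem."""
    # Denominator: total probability of an alarm.
    p_alarm = p_present * p_detect + (1 - p_present) * p_false_alarm
    return p_present * p_detect / p_alarm

print(posterior_present(0.005))   # prior as stated in the text
print(posterior_present(0.05))    # prior as labeled in the tree figure
```

With either prior the posterior stays well below the detection probability 0.99, because false alarms from the much more likely "no aircraft" case dominate.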

Example

You enter a chess tournament where your probability of winning is 0.3 against half the players (call them type I), 0.4 against a quarter of the players (call them type II), and 0.5 against the remaining quarter of the players (call them type III). You play a game against a randomly chosen opponent. What is the probability of having an opponent of type I given that you won?
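The posterior over opponent types follows from Bayes' theorem applied to the three-part partition. A minimal Python sketch using exact fractions (the function name is introduced here for illustration):

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Return P(A_i | B) for every part A_i of a partition."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(A_i) P(B | A_i)
    evidence = sum(joint)                                  # P(B), total probability
    return [j / evidence for j in joint]

priors = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]        # types I, II, III
likelihoods = [Fraction(3, 10), Fraction(2, 5), Fraction(1, 2)]  # P(win | type)
post = posteriors(priors, likelihoods)
print(post[0])   # P(type I | win)
```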

Independence

Independent Events

• When the occurrence of $B$ does not alter the probability of $A$:

$$P(A \mid B) = P(A)$$

we say that $A$ is independent of $B$. Equivalently,

$$P(A \cap B) = P(A)\,P(B)$$

Example

• Consider an experiment involving two successive rolls of a 4-sided die in which all 16 possible outcomes are equally likely. Are the following pairs of events independent?

$$A_i = \{\text{1st roll results in } i\}, \qquad B_j = \{\text{2nd roll results in } j\}$$

$$A = \{\text{1st roll is a 1}\}, \qquad B = \{\text{sum is a 5}\}$$

$$A = \{\text{minimum is 2}\}, \qquad B = \{\text{maximum is 2}\}$$
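Each pair can be tested against the product condition $P(A \cap B) = P(A)\,P(B)$ by enumerating the 16 outcomes. A minimal Python sketch (the helper names are made up for this illustration):

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 5), repeat=2))   # 16 equally likely outcomes

def prob(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def independent(A, B):
    """Check P(A and B) == P(A) P(B) exactly."""
    return prob(lambda w: A(w) and B(w)) == prob(A) * prob(B)

case1 = all(independent(lambda w, i=i: w[0] == i, lambda w, j=j: w[1] == j)
            for i in range(1, 5) for j in range(1, 5))
case2 = independent(lambda w: w[0] == 1, lambda w: sum(w) == 5)
case3 = independent(lambda w: min(w) == 2, lambda w: max(w) == 2)
print(case1, case2, case3)
```

The third pair fails the check: the minimum and the maximum are both 2 only for the outcome (2, 2), so $P(A \cap B) = 1/16$, while $P(A)\,P(B) = (5/16)(3/16)$.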

Conditional Independence

• Given an event $C$, the events $A$ and $B$ are called conditionally independent if

$$P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)$$

Independence of a Collection of Events

We say that the events $A_1, A_2, \dots, A_n$ are independent if

$$P\Big(\bigcap_{i \in S} A_i\Big) = \prod_{i \in S} P(A_i), \quad \text{for every subset } S \text{ of } \{1, 2, \dots, n\}$$

Pairwise independence does not imply independence

Example
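A standard illustration (not necessarily the one shown on the original slide) uses two fair coin tosses with $A$ = {first toss is a head}, $B$ = {second toss is a head} and $C$ = {the two tosses differ}. A minimal Python sketch checking that each pair satisfies the product condition while the triple does not:

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))   # 4 equally likely outcomes

def prob(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def both(*events):
    """Intersection of events, as a predicate on outcomes."""
    return lambda w: all(e(w) for e in events)

A = lambda w: w[0] == "H"     # first toss is a head
B = lambda w: w[1] == "H"     # second toss is a head
C = lambda w: w[0] != w[1]    # the two tosses differ

pairwise = all(prob(both(X, Y)) == prob(X) * prob(Y)
               for X, Y in [(A, B), (A, C), (B, C)])
mutual = prob(both(A, B, C)) == prob(A) * prob(B) * prob(C)
print(pairwise, mutual)
```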


Network Connectivity

[Figure: network with nodes A, B, C, D, E, F whose links are labeled with the probabilities 0.9, 0.8, 0.9, 0.75, 0.95, 0.85, 0.95.]

[Figure: three components 1, 2, 3 in a series connection, and three components 1, 2, 3 in a parallel connection.]
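If the components fail independently, the two basic connections have simple success probabilities: a series connection works only when every component works, and a parallel connection fails only when every component fails. A minimal Python sketch under that independence assumption (function names are made up; `math.prod` needs Python 3.8+; the numbers are illustrative and do not reproduce the network in the figure):

```python
from math import prod

def series_up(probs):
    """All components must work: product of the p_i."""
    return prod(probs)

def parallel_up(probs):
    """At least one component must work: 1 - product of (1 - p_i)."""
    return 1 - prod(1 - p for p in probs)

print(series_up([0.9, 0.8]), parallel_up([0.9, 0.8]))
```

Larger networks can be evaluated by repeatedly collapsing series and parallel subsystems into single equivalent components.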

Independent Trials and Binomial Probabilities

• If an experiment involves a sequence of independent but identical stages, we say we have a sequence of independent trials.

• In the special case where there are only two possible results at each stage, we say we have a sequence of independent Bernoulli trials.
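For $n$ independent Bernoulli trials with success probability $p$, the probability of exactly $k$ successes is the binomial probability $\binom{n}{k} p^k (1-p)^{n-k}$. The slide names these probabilities without giving the formula, so the following Python sketch is only an illustration of the standard expression:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3   # arbitrary illustrative values
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(sum(pmf))  # should be 1 up to rounding error
```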

Contention Resolution in a Distributed System

Contention resolution. Given $n$ processes $P_1, \dots, P_n$, each competing for access to a shared database. If two or more processes access the database simultaneously, all processes are locked out. Devise a protocol to ensure all processes get through on a regular basis.

Restriction. Processes can't communicate.

Challenge. Need symmetry-breaking paradigm.

[Figure: processes $P_1$, $P_2$, ..., $P_n$.]