Chapter 1: Sample Space and Probability
TRANSCRIPT
Probability
Tuesday, February 4, 14
What is Probability?
• Probability is a mathematical discipline akin to geometry.
• Formal Logic Content.
• Intuitive Background.
• Applications.
What is Probability?
• “The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind.” James Clerk Maxwell (1850)
Why Study Probability?
[Figure: Probability at the center, connected to Machine Learning, CS Theory, Algorithms, Networks, and Information Theory]
Probabilistic Model
• The sample space $\Omega$, which is the set of all possible outcomes of an experiment.
• The probability law, which assigns to a set A of possible outcomes (also called an event) a nonnegative number P(A) (called the probability of A) that encodes our knowledge or belief about the collective “likelihood” of the elements of A. The probability law must satisfy certain properties to be introduced shortly.
[Figure: an experiment produces outcomes in the sample space (the set of possible outcomes); the probability law assigns to the events A and B the probabilities P(A) and P(B)]
Set Theory
• Finite, Countable and Uncountable Sets.
• Set Operations.
• The Algebra of Sets.
Sets
A set is a collection of objects, which are the elements of the set.
$x \in S$ (x is an element of S)
$x \notin S$ (x is not an element of S)
Specifying Sets
• As a list of elements:
$S = \{x_1, x_2, \ldots, x_n\}$
• With words:
the set of even natural numbers
• Specify a rule or algorithm:
$S = \{r \in \mathbb{Q} : r^2 < 2\}$
Subsets
If every element of a set S is also an element of a set T, we say that S is a subset of T, and we write
$S \subset T$ or $T \supset S$
If $S \subset T$ and $T \subset S$, the two sets are equal, and we write
$S = T$
The universal set, denoted by $\Omega$, contains all the objects of interest in a particular context.
Set Operations
• The complement of a set S, with respect to the universe $\Omega$, is the set of elements in $\Omega$ that do not belong to S:
$S^c = \{x \in \Omega : x \notin S\}$
• The union of two sets S and T is the set of all elements that belong to S or T (or both):
$S \cup T = \{x \in \Omega : x \in S \text{ or } x \in T\}$
• The intersection of two sets S and T is the set of all elements that belong to both S and T:
$S \cap T = \{x \in \Omega : x \in S \text{ and } x \in T\}$
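The set operations defined so far map directly onto Python's built-in `set` type. The sketch below is only an illustration: the universe `omega` and the sets `S` and `T` are made-up examples, and `complement` is a hypothetical helper. It also checks De Morgan's laws, in their standard form, on these particular sets.

```python
# Hypothetical universe and sets, chosen only for illustration.
omega = set(range(1, 11))
S = {x for x in omega if x % 2 == 0}   # even numbers in omega
T = {x for x in omega if x <= 4}       # numbers in omega that are at most 4

def complement(A, universe=omega):
    """Elements of the universe that do not belong to A."""
    return universe - A

union = S | T          # elements in S or T (or both)
intersection = S & T   # elements in both S and T

# De Morgan's laws, checked on these particular sets.
assert complement(S | T) == complement(S) & complement(T)
assert complement(S & T) == complement(S) | complement(T)
```

A check on one pair of sets is of course not a proof; it only shows how the definitions translate into code.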
Algebra of Sets
De Morgan’s laws
Functions
[Figure: a function maps each element of its domain to an element of its range]
Cardinality
{Thorin, Balin, Bifur, Bofur, Bombur, Dori, Fili, Gloin, Kili}
{Gruñon, Mocoso, Tímido, Mudito, Dormilón, Felíz, Sabio}
Power Sets and Cantor’s Theorem
Sample Space
• Different elements of the sample space should be distinct and mutually exclusive.
• The sample space must be collectively exhaustive.
Sample Space (Examples)
For a single coin toss, the sample space $\Omega$ consists of two points:
$\Omega = \{H, T\}$
(We exclude possibilities like “the coin stands on edge”, “the coin disappears”, etc.)
For n tosses of a coin, the sample space is
$\Omega = \{\omega : \omega = (a_1, \ldots, a_n),\ a_i = H \text{ or } T\}$
and the total number $N(\Omega)$ of outcomes is $2^n$.
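The sample space for n coin tosses can be listed explicitly for small n. This is a minimal sketch; the function name `coin_sample_space` is an assumption made for illustration.

```python
from itertools import product

def coin_sample_space(n):
    """All outcomes (a_1, ..., a_n) with each a_i equal to 'H' or 'T'."""
    return list(product("HT", repeat=n))

omega = coin_sample_space(3)
# The number of outcomes N(omega) equals 2**n.
assert len(omega) == 2 ** 3
```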
[Figure: sample space for a pair of 4-sided die rolls, drawn as a 4×4 grid (1st roll against 2nd roll) and as an equivalent tree-based sequential description whose leaves are the pairs (1,1), (1,2), (1,3), (1,4), ...]
Probability Laws
1. (Non-negativity) $P(A) \ge 0$, for every event A.
2. (Additivity) If A and B are two disjoint events, then the probability of their union satisfies
$P(A \cup B) = P(A) + P(B)$
More generally, if the sample space has an infinite number of elements and $A_1, A_2, \ldots$ is a sequence of disjoint events, then the probability of their union satisfies
$P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$
3. (Normalization) The probability of the entire sample space $\Omega$ is equal to 1, that is
$P(\Omega) = 1$
Properties
Let A, B and C be events.
1. $P(\emptyset) = 0$.
2. If $A \subseteq B$, then $P(A) \le P(B)$.
3. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
4. $P(A \cup B) \le P(A) + P(B)$.
Discrete Probability Laws
If the sample space consists of a finite number of possible outcomes, then the probability law is specified by the probabilities of the events that consist of a single element. In particular, the probability of any event $A = \{s_1, s_2, \ldots, s_n\}$ is the sum of the probabilities of its elements:
$P(A) = P(s_1) + P(s_2) + \cdots + P(s_n)$
Discrete Uniform Probability Laws
If the sample space consists of n possible outcomes which are equally likely (i.e. all single-element events have the same probability), then the probability of any event A is given by
$P(A) = \frac{N(A)}{n}$
where N(A) is the number of elements of A.
[Figure: the 4×4 grid sample space for a pair of 4-sided die rolls (1st roll against 2nd roll), with two events highlighted]
{the first roll is equal to the second}
{at least one roll is a 4}
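Under a discrete uniform law, computing P(A) reduces to counting. The sketch below enumerates the 16 outcomes of two 4-sided die rolls and evaluates the two events named on the slide; `prob` is a hypothetical helper, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (1st roll, 2nd roll) of a 4-sided die.
omega = list(product(range(1, 5), repeat=2))

def prob(event):
    """Discrete uniform law: P(A) = N(A) / n, where event is a predicate on outcomes."""
    favourable = sum(1 for w in omega if event(w))
    return Fraction(favourable, len(omega))

p_equal = prob(lambda w: w[0] == w[1])   # the first roll is equal to the second
p_some_four = prob(lambda w: 4 in w)     # at least one roll is a 4
```

Here `p_equal` counts the 4 diagonal outcomes and `p_some_four` counts the 7 outcomes in the last row or last column of the grid.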
Counting
Cardinality
Product Rule
Theorem (Pairs). With n elements $a_1, a_2, \ldots, a_n$ and m elements $b_1, b_2, \ldots, b_m$, it is possible to form nm pairs $(a_i, b_j)$ containing one element from each group.
Generalized Product Rule
Theorem. Given $n_1$ elements $a_1, a_2, \ldots, a_{n_1}$ and $n_2$ elements $b_1, b_2, \ldots, b_{n_2}$, etc., up to $n_r$ elements $x_1, x_2, \ldots, x_{n_r}$, it is possible to form $n_1 n_2 \cdots n_r$ ordered r-tuples $(a_{i_1}, b_{i_2}, \ldots, x_{i_r})$ containing one element from each group.
[Figure: a four-stage selection tree with $n_1$, $n_2$, $n_3$, $n_4$ choices at stages 1, 2, 3, 4]
Examples
• Throwing a die r times. What is the probability of no 1 in r throws?
• Display of flags. Suppose r flags of different colors are to be displayed on n poles in a row. In how many ways can this be done?
• Loops, Recursions, ...
Ordered Samples
• Consider a set or population of n elements $a_1, \ldots, a_n$. Any ordered arrangement $(a_{i_1}, \ldots, a_{i_k})$ of k symbols is called an ordered sample of size k drawn from our population.
– Sampling with replacement (repetitions are allowed).
– Sampling without replacement (repetitions are not allowed).
We now consider examples involving the selection of k balls from an urn containing n distinguishable balls.
Ordered Sample with Replacement
$N(\Omega) = n^k$
Ordered Sample without Replacement
$N(\Omega) = (n)_k = n(n-1)\cdots(n-k+1)$
Ordered Samples
Theorem. For a population of n elements and a prescribed sample size k, there exist $n^k$ different samples with replacement and $(n)_k$ samples without replacement.
• When k = n, the sample is called a permutation of the elements of the population, and $(n)_n = n!$
• Whenever we speak of random samples of fixed size k, the adjective random is to imply that all samples have the same probability, namely $n^{-k}$ in sampling with replacement and $1/(n)_k$ in sampling without replacement.
• If n is large and k is relatively small, the ratio $(n)_k / n^k$ is near unity, i.e. the two ways of sampling are practically equivalent.
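The two counting formulas for ordered samples can be checked by brute force on a small population. This is a sketch under the assumption of Python 3.8 or later (for `math.perm`); the function names are illustrative.

```python
from itertools import permutations, product
from math import perm

def count_with_replacement(n, k):
    """Number of ordered samples of size k with replacement: n**k."""
    return n ** k

def count_without_replacement(n, k):
    """Number of ordered samples of size k without replacement: (n)_k."""
    return perm(n, k)

def ratio(n, k):
    """(n)_k / n**k, which is close to 1 when n is large and k is small."""
    return count_without_replacement(n, k) / count_with_replacement(n, k)

# Brute-force check on a population of 4 elements and samples of size 2.
population = range(4)
assert len(list(product(population, repeat=2))) == count_with_replacement(4, 2)
assert len(list(permutations(population, 2))) == count_without_replacement(4, 2)
```

For example, `ratio(1000, 3)` is about 0.997, which illustrates the remark that the two ways of sampling are practically equivalent when n is large relative to k.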
Examples (Balls and Bins)
• What is the probability that a given element is not included in the sample?
• If n balls are randomly placed into n cells, what is the probability every cell is occupied?
[Figure: nine bins numbered 1 to 9]
Birthday Paradox
• Throw a die 6 times. What is the probability that all six faces appear?
• Elevator (10 floors, 7 people)
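These questions share one pattern: k draws with replacement from n equally likely items are all different with probability $(n)_k / n^k$. The sketch below applies it to the die question and, under assumptions that the slide does not spell out, to the other two. For the elevator, it is assumed that each of the 7 people independently exits at one of 10 equally likely floors and that the question asks for all exits to be on different floors. For birthdays, 365 equally likely days are assumed.

```python
from fractions import Fraction
from math import perm

def prob_all_distinct(n, k):
    """P(k draws with replacement from n equally likely items are all different)."""
    return Fraction(perm(n, k), n ** k)

six_faces = prob_all_distinct(6, 6)            # all six faces in six throws: 6!/6**6
elevator = prob_all_distinct(10, 7)            # assumed reading of the elevator example
birthday_23 = 1 - prob_all_distinct(365, 23)   # some shared birthday among 23 people

print(float(six_faces), float(elevator), float(birthday_23))
```

With n balls placed at random into n cells, `prob_all_distinct(n, n)` is also the probability that every cell is occupied.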
The Sum Rule
Theorem. If $A_1, A_2, \ldots, A_n$ are disjoint sets, then:
$|A_1 \cup A_2 \cup \cdots \cup A_n| = |A_1| + |A_2| + \cdots + |A_n|$
Counting Passwords
• On a certain computer system, a valid password is a sequence of between six and eight symbols. The first symbol must be a letter, which can be upper case or lower case, and the remaining symbols must be either letters or digits.
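The password count combines the product rule (choices per position) with the sum rule (disjoint lengths 6, 7 and 8). The sketch below assumes a 26-letter alphabet and the ten digits 0 to 9; the constant and function names are illustrative.

```python
LETTERS = 26 * 2          # assumed: 26 upper-case plus 26 lower-case letters
SYMBOLS = LETTERS + 10    # assumed: letters plus the digits 0-9

def passwords_of_length(length):
    """Product rule: one letter first, then (length - 1) letters or digits."""
    return LETTERS * SYMBOLS ** (length - 1)

# Sum rule: passwords of different lengths form disjoint sets.
total = sum(passwords_of_length(n) for n in (6, 7, 8))
print(total)
```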
Subpopulations
• Two populations are considered different only if one contains an element not contained in the other.
We now consider examples involving the selection of k balls from an urn containing n distinguishable balls.
Unordered Sample without Replacement
$N(\Omega) = \binom{n}{k}$
Subpopulations
Theorem. A population of n elements possesses $\binom{n}{r}$ different sub-populations of size $r \le n$.
$\binom{n}{r} = \binom{n}{n-r}$
$\binom{n}{0} = 1, \quad 0! = 1$
The Division Rule
Theorem. If $f : A \to B$ is k-to-1, then:
$|A| = k\,|B|$
• Surprising as it may seem, many counting problems are made much easier by initially counting every item multiple times and then correcting the answer using the division rule.
Examples
• Hands of poker. What is the probability that a hand of poker contains five different face values?
• Occupancy problem. What is the probability that a specified cell contains exactly k balls?
Examples (Balls and bins)
[Figure: nine bins numbered 1 to 9]
Partitions
Theorem. Let $k_1, \ldots, k_r$ be integers such that
$k_1 + k_2 + \cdots + k_r = n, \quad k_i \ge 0.$
The number of ways in which a population of n elements can be divided into r ordered parts (partitioned into r subpopulations) of which the first contains $k_1$ elements, the second $k_2$ elements, etc., is
$\frac{n!}{k_1!\,k_2! \cdots k_r!}$
Example: Sequences with repetitions
• How many sequences can be formed by permuting the letters in the 10-letter word BOOKKEEPER?
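Counting the arrangements of a word with repeated letters is an instance of the partition formula $n!/(k_1! \cdots k_r!)$, where the $k_i$ are the letter multiplicities. A minimal sketch follows; the function name is an assumption.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_arrangements(word):
    """n! divided by the factorial of each letter's multiplicity."""
    result = factorial(len(word))
    for multiplicity in Counter(word).values():
        result //= factorial(multiplicity)
    return result

# BOOKKEEPER has B:1, O:2, K:2, E:3, P:1, R:1, so the count is 10!/(2! 2! 3!).
print(distinct_arrangements("BOOKKEEPER"))  # 151200

# Brute-force check on a short word.
assert distinct_arrangements("AABB") == len(set(permutations("AABB")))
```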
Example: Balls and Bins
[Figure: nine bins numbered 1 to 9]
Example: Binomial Theorem
Theorem. For all $n \in \mathbb{N}$ and $a, b \in \mathbb{R}$:
$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k$
Example: Multinomial Theorem
Theorem. For all $n \in \mathbb{N}$ and $z_i \in \mathbb{R}$:
$(z_1 + z_2 + \cdots + z_m)^n = \sum_{\substack{k_1, \ldots, k_m \in \mathbb{N} \\ k_1 + \cdots + k_m = n}} \binom{n}{k_1, \ldots, k_m} z_1^{k_1} z_2^{k_2} \cdots z_m^{k_m}$
Combinatorial Proofs
• Symmetry
$\binom{n}{k} = \binom{n}{n-k}$
• Pascal’s Identity
$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$
• $\binom{3n}{n} = \sum_{r=0}^{n} \binom{n}{r} \binom{2n}{n-r}$
Conditional Probability
• Conditional probability provides us with a way to reason about the outcome of an experiment, based on partial information.
• We seek to construct a new probability law that takes into account the new information: a probability law that, for any event A, specifies the conditional probability of A given B, denoted by
$P(A|B)$
Conditional Probability
• P(A|B) must constitute a legitimate probability law.
• “The conditional probabilities must be consistent with our intuition in important special cases”
$P(A|B) = \frac{P(A \cap B)}{P(B)}$
where we assume that $P(B) > 0$.
Conditional Probability
1. (Non-negativity) $P(A|B) \ge 0$, for every event A.
2. (Additivity) If $A_1$ and $A_2$ are two disjoint events, then the conditional probability of their union satisfies
$P(A_1 \cup A_2|B) = P(A_1|B) + P(A_2|B)$
More generally, if the sample space has an infinite number of elements and $A_1, A_2, \ldots$ is a sequence of disjoint events, then the conditional probability of their union satisfies
$P(A_1 \cup A_2 \cup \cdots |B) = P(A_1|B) + P(A_2|B) + \cdots$
3. (Normalization) The conditional probability of the entire sample space $\Omega$ is equal to 1, that is
$P(\Omega|B) = 1$
Examples
We toss a coin three successive times. Compute the
conditional probability P(A|B) when A and B are the
events:
A = {more heads than tails come up}
B = {1st toss is a head}
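The three-toss example can be worked by enumeration, applying the definition $P(A|B) = P(A \cap B)/P(B)$. The sketch assumes a fair coin, so that all 8 outcomes are equally likely; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=3))   # assumed equally likely (fair coin)

def prob(event):
    """Uniform law on omega; event is a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B), assuming P(B) > 0."""
    return prob(lambda w: a(w) and b(w)) / prob(b)

def more_heads(w):
    return w.count("H") > w.count("T")

def first_head(w):
    return w[0] == "H"

p = cond_prob(more_heads, first_head)   # 3/4
```

Of the four outcomes that start with a head (HHH, HHT, HTH, HTT), three have more heads than tails, which gives 3/4, compared with the unconditional P(A) = 1/2.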
Conditional Probability for Modeling
• When building models for experiments with a sequential character, it is natural to first specify conditional probabilities and then use them to determine unconditional probabilities:
$P(A \cap B) = P(B)\,P(A|B)$
Examples
If an aircraft is present in a certain area, a radar detects it and generates an alarm signal with probability 0.99. If an aircraft is not present, the radar generates a (false) alarm with probability 0.10. We assume that an aircraft is present with probability 0.05.
• What is the probability of no aircraft presence and a false alarm?
• What is the probability of aircraft presence and no detection?
[Figure: probability tree for the radar example, with A = {aircraft present} and B = {alarm}]
$P(A) = 0.05$, $P(A^c) = 0.95$
$P(B|A) = 0.99$, $P(B^c|A) = 0.01$ (missed detection)
$P(B|A^c) = 0.10$ (false alarm), $P(B^c|A^c) = 0.90$
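Both radar questions follow from multiplying along a branch of the tree, i.e. $P(A \cap B) = P(A)\,P(B|A)$. The sketch below uses the value P(A) = 0.05 shown in the tree diagram; the variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

p_present = Fraction(5, 100)               # P(A), as in the tree diagram
p_alarm_given_present = Fraction(99, 100)  # P(B|A)
p_alarm_given_absent = Fraction(10, 100)   # P(B|A^c)

# No aircraft and a false alarm: P(A^c) * P(B|A^c)
p_false_alarm = (1 - p_present) * p_alarm_given_absent

# Aircraft present and no detection: P(A) * P(B^c|A)
p_missed_detection = p_present * (1 - p_alarm_given_present)

print(float(p_false_alarm), float(p_missed_detection))  # 0.095 0.0005
```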
Multiplication Rule
Assuming that all of the conditioning events have positive probability, we have:
$P\left(\bigcap_{i=1}^{n} A_i\right) = P(A_1)\,P(A_2|A_1)\,P(A_3|A_1 \cap A_2) \cdots P\left(A_n \,\middle|\, \bigcap_{i=1}^{n-1} A_i\right)$
Example
Three cards are drawn from an ordinary 52-card deck
without replacement.
• What is the probability that none of the three cards
is a heart, assuming that all triplets are equally likely?
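The card question can be answered in two ways that should agree: sequentially with the multiplication rule, or by counting equally likely triplets. This is a small sketch assuming Python 3.8 or later (for `math.comb`) and a deck with 13 hearts among 52 cards.

```python
from fractions import Fraction
from math import comb

# Multiplication rule: P(A1) P(A2|A1) P(A3|A1 and A2), with Ai = {card i is not a heart}.
p_sequential = Fraction(39, 52) * Fraction(38, 51) * Fraction(37, 50)

# Counting: heart-free triplets divided by all triplets.
p_counting = Fraction(comb(39, 3), comb(52, 3))

assert p_sequential == p_counting
print(p_sequential, float(p_sequential))
```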
The Monty Hall Problem
[Figure: three doors numbered 1, 2, 3]
The contestant picks door 1; D is the door hiding the prize and O is the door the host opens.
D   O   P(D)   P(O|D)   P(D)P(O|D)   Keep   Switch
1   2   1/3    p        p/3          Win    Lose
1   3   1/3    1 - p    (1 - p)/3    Win    Lose
2   3   1/3    1        1/3          Lose   Win
3   2   1/3    1        1/3          Lose   Win
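The table can be evaluated exactly. The sketch below encodes its four rows, reading D as the prize door, O as the opened door, and door 1 as the contestant's pick (an interpretation inferred from the table), and sums the joint probabilities for each strategy. The function names are assumptions.

```python
from fractions import Fraction

def table_rows(p):
    """Rows (D, O, P(D) * P(O|D)); p is the chance the host opens door 2 when D = 1."""
    third = Fraction(1, 3)
    return [(1, 2, third * p), (1, 3, third * (1 - p)), (2, 3, third), (3, 2, third)]

def monty_hall(p):
    """Return (P(win by keeping door 1), P(win by switching))."""
    rows = table_rows(p)
    keep = sum(pr for d, o, pr in rows if d == 1)
    switch = sum(pr for d, o, pr in rows if d != 1)
    return keep, switch

def switch_wins_given_opened(p, opened):
    """P(switching wins | host opened the given door)."""
    rows = [(d, pr) for d, o, pr in table_rows(p) if o == opened]
    return sum(pr for d, pr in rows if d != 1) / sum(pr for d, pr in rows)

print(monty_hall(Fraction(1, 2)))   # keeping wins with 1/3, switching with 2/3
```

The unconditional answer does not depend on p, while the answer conditioned on the opened door does: for door 2 it equals 1/(1 + p).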
Minimum Global Cut
• Global min cut. Given a connected, undirected
graph G = (V,E), find a cut (A,B) of minimum
cardinality.
• Applications. Identify clusters, network reliability,
TSP solvers.
Contraction algorithm. [Karger 1995]
・Pick an edge e = (u, v) uniformly at random.
・Contract edge e.
  - replace u and v by single new super-node w
  - preserve edges, updating endpoints of u and v to w
  - keep parallel edges, but delete self-loops
・Repeat until graph has just two nodes v1 and v2.
・Return the cut (all nodes that were contracted to form v1).
[Figure: contracting edge u-v merges u and v into a single super-node w]
Reference: Thore Husfeldt
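The contraction steps above can be sketched in a few lines. This is an illustrative implementation, not the one from the lecture: the graph is assumed to be connected and given as a list of edges over comparable node labels, super-nodes are tracked with a simple relabelling dictionary, and the names `contract_once` and `min_cut` are hypothetical. A single run may miss the minimum cut, so the sketch repeats the run and keeps the smallest cut found.

```python
import random

def contract_once(edges, rng):
    """Run the contraction algorithm once; return (one side of the cut, cut size)."""
    nodes = sorted({v for edge in edges for v in edge})
    group = {v: v for v in nodes}   # super-node that each original node belongs to
    live = list(edges)              # edges between distinct super-nodes (parallel edges kept)
    remaining = len(nodes)
    while remaining > 2:
        u, v = rng.choice(live)     # pick an edge uniformly at random
        keep, merged = group[u], group[v]
        for x in nodes:             # contract: relabel the merged super-node
            if group[x] == merged:
                group[x] = keep
        # Delete self-loops, i.e. edges whose endpoints are now in the same super-node.
        live = [(a, b) for a, b in live if group[a] != group[b]]
        remaining -= 1
    side = {x for x in nodes if group[x] == group[nodes[0]]}
    return side, len(live)

def min_cut(edges, trials, seed=0):
    """Smallest cut found over several independent runs (seeded for repeatability)."""
    rng = random.Random(seed)
    return min((contract_once(edges, rng) for _ in range(trials)), key=lambda r: r[1])

# Hypothetical test graph: two triangles joined by the single edge (2, 3).
example = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
side, size = min_cut(example, trials=200)
```

On this example the only cut of size 1 separates {0, 1, 2} from {3, 4, 5}, and with 200 runs the chance of missing it is negligible.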
Total Probability Theorem
Let $A_1, A_2, \ldots, A_n$ be events that form a partition of the sample space $\Omega$, that is
$A_i \cap A_j = \emptyset$ for $i \ne j$ and $\bigcup_{i=1}^{n} A_i = \Omega$
and assume that $P(A_i) > 0$, for all i. Then, for any event B, we have
$P(B) = P(A_1)P(B|A_1) + \cdots + P(A_n)P(B|A_n)$
[Figure: sample space partitioned into $A_1$, $A_2$, $A_3$, $A_4$, with the event B overlapping the parts]
Example
You enter a chess tournament where your probability of winning is 0.3 against half the players (call them type I), 0.4 against a quarter of the players (call them type II), and 0.5 against the remaining quarter of the players (call them type III). You play a game against a randomly chosen opponent. What is the probability of winning?
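The chess question is a direct application of the total probability theorem, with the opponent types as the partition. A minimal sketch, with illustrative dictionary names:

```python
from fractions import Fraction

# P(A_i): probability of facing each opponent type.
prior = {"I": Fraction(1, 2), "II": Fraction(1, 4), "III": Fraction(1, 4)}
# P(B|A_i): probability of winning against each type.
win_given = {"I": Fraction(3, 10), "II": Fraction(4, 10), "III": Fraction(5, 10)}

# P(B) = sum over i of P(A_i) P(B|A_i)
p_win = sum(prior[t] * win_given[t] for t in prior)
print(float(p_win))  # 0.375
```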
Inference and Bayes’ Theorem
Let $A_1, A_2, \ldots, A_n$ be disjoint events that form a partition of the sample space, and assume that $P(A_i) > 0$ for all i. Then, for any event B such that $P(B) > 0$, we have
$P(A_i|B) = \frac{P(A_i)P(B|A_i)}{P(B)} = \frac{P(A_i)P(B|A_i)}{P(A_1)P(B|A_1) + \cdots + P(A_n)P(B|A_n)}$
$P(A_i|B) = \frac{P(A_i)\,P(B|A_i)}{P(B)}$
Prior: $P(A_i)$
Posterior: $P(A_i|B)$
Example
• Probability that an aircraft is present?
• Suppose that you won. What is the probability you
had an opponent of type I?
• The False-Positive Puzzle.
Examples
If an aircraft is present in a certain area, a radar detects it and generates an alarm signal with probability 0.99. If an aircraft is not present, the radar generates a (false) alarm with probability 0.10. We assume that an aircraft is present with probability 0.05.
• What is the probability of aircraft presence given that the alarm went off?
Example
You enter a chess tournament where your probability of winning is 0.3 against half the players (call them type I), 0.4 against a quarter of the players (call them type II), and 0.5 against the remaining quarter of the players (call them type III). You play a game against a randomly chosen opponent. What is the probability of having an opponent of type I given you won?
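Both inference questions use Bayes' theorem with the total probability theorem in the denominator. The sketch below wraps this in a hypothetical `posterior` helper and applies it to the radar and chess examples; the radar prior of 0.05 is the value shown in the tree diagram for that example.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Bayes' rule over a partition: prior[h] = P(A_h), likelihood[h] = P(B|A_h)."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())   # P(B), by the total probability theorem
    return {h: joint[h] / evidence for h in joint}

radar = posterior(
    {"present": Fraction(5, 100), "absent": Fraction(95, 100)},
    {"present": Fraction(99, 100), "absent": Fraction(10, 100)},
)
chess = posterior(
    {"I": Fraction(1, 2), "II": Fraction(1, 4), "III": Fraction(1, 4)},
    {"I": Fraction(3, 10), "II": Fraction(4, 10), "III": Fraction(5, 10)},
)
print(float(radar["present"]), float(chess["I"]))
```

Even with a 0.99 detection rate, the posterior probability of an aircraft given an alarm is only about 0.34, because aircraft are rare and false alarms are comparatively frequent.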
Independence
Independent Events
• When the occurrence of B does not alter the probability of A:
$P(A|B) = P(A)$
we say that A is independent of B. Equivalently,
$P(A \cap B) = P(A)\,P(B)$
Example
• Consider an experiment involving two successive rolls of a 4-sided die in which all 16 possible outcomes are equally likely.
$A_i$ = {1st roll results in i}, $B_j$ = {2nd roll results in j}
A = {1st roll is a 1}, B = {sum is a 5}
A = {minimum is 2}, B = {maximum is 2}
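Each pair of events in this example can be tested against the product condition $P(A \cap B) = P(A)P(B)$ by enumeration. The helper names below are illustrative.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 5), repeat=2))   # 16 equally likely outcomes

def prob(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def independent(a, b):
    """True when P(A and B) equals P(A) P(B)."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

# A_i and B_j, checked here for i = 2 and j = 3 as one representative pair.
rolls = independent(lambda w: w[0] == 2, lambda w: w[1] == 3)
# A = {1st roll is a 1}, B = {sum is a 5}
first_and_sum = independent(lambda w: w[0] == 1, lambda w: sum(w) == 5)
# A = {minimum is 2}, B = {maximum is 2}
min_and_max = independent(lambda w: min(w) == 2, lambda w: max(w) == 2)

print(rolls, first_and_sum, min_and_max)  # True True False
```

The last pair fails because P(A) = 5/16 and P(B) = 3/16, while the only common outcome is (2, 2), with probability 1/16.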
Conditional Independence
• Given an event C, the events A and B are called conditionally independent if
$P(A \cap B|C) = P(A|C)\,P(B|C)$
Example
Independence of a Collection of Events
We say that the events $A_1, A_2, \ldots, A_n$ are independent if
$P\left(\bigcap_{i \in S} A_i\right) = \prod_{i \in S} P(A_i)$, for every subset S of $\{1, 2, \ldots, n\}$
Pairwise independence does not imply independence
Example
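One standard counterexample, which may or may not be the one shown on the slide, uses two tosses of a fair coin with the events A = {1st toss is a head}, B = {2nd toss is a head} and C = {the two tosses differ}. The sketch checks that every pair satisfies the product condition while the triple does not.

```python
from fractions import Fraction
from itertools import combinations, product

omega = list(product("HT", repeat=2))   # assumed equally likely (fair coin)

def prob(event):
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

events = {
    "A": lambda w: w[0] == "H",    # 1st toss is a head
    "B": lambda w: w[1] == "H",    # 2nd toss is a head
    "C": lambda w: w[0] != w[1],   # the two tosses differ
}

pairwise = all(
    prob(lambda w: events[x](w) and events[y](w)) == prob(events[x]) * prob(events[y])
    for x, y in combinations(events, 2)
)
p_all = prob(lambda w: all(e(w) for e in events.values()))
p_product = prob(events["A"]) * prob(events["B"]) * prob(events["C"])

print(pairwise, p_all, p_product)  # True 0 1/8
```

The three events cannot occur together, so the product condition fails for S = {A, B, C} even though it holds for every pair.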
Minimum Global Cut
Network Connectivity
[Figure: network with nodes A, B, C, D, E, F; the links are labeled with probabilities 0.9, 0.8, 0.9, 0.75, 0.95, 0.85, 0.95]
[Figure: a series connection of components 1, 2, 3 and a parallel connection of components 1, 2, 3]
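For series and parallel connections, the usual reliability formulas follow from independence. The sketch below assumes that components work independently, that a series connection works only if every component works, and that a parallel connection works if at least one component works. The function names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
from math import prod

def series(ps):
    """P(all components work) = product of the individual probabilities."""
    return prod(ps)

def parallel(ps):
    """P(at least one component works) = 1 - P(every component fails)."""
    return 1 - prod(1 - p for p in ps)

# Hypothetical component probabilities, chosen only for illustration.
print(series([0.9, 0.8]), parallel([0.9, 0.8]))
```

A larger network can be evaluated by repeatedly replacing series and parallel subsystems with single equivalent components.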
Independent Trials and Binomial Probabilities
• If an experiment involves a sequence of independent but identical stages, we say we have a sequence of independent trials.
• In the special case where there are only two possible results at each stage, we say we have a sequence of independent Bernoulli trials.
Contention Resolution in a Distributed System
Contention resolution. Given n processes P1, …, Pn, each competing for access to a shared database. If two or more processes access the database simultaneously, all processes are locked out. Devise a protocol to ensure all processes get through on a regular basis.
Restriction. Processes can't communicate.
Challenge. Need symmetry-breaking paradigm.
[Figure: processes P1, P2, ..., Pn contending for access to the shared database]