
Ch 8: Fundamentals of Probability Theory

ENGR 4323/5323 Digital and Analog Communication

Engineering and Physics, University of Central Oklahoma

Dr. Mohamed Bingabr

Chapter Outline

• Concept of Probability

• Random Variables

• Statistical Averages (MEANS)

• Correlation

• Linear Mean Square Estimation

• Sum of Random Variables

• Central Limit Theorem


Deterministic and Random Signals

Deterministic Signals: Signals that can be described by a mathematical equation or graph. Their future values can be predicted with 100% certainty.

Random Process Signals: Unpredictable message signals and noise waveforms. These types of signals are information-bearing and play key roles in communications.

Concept of Probability

Experiment: In probability theory, an experiment is a process whose outcome cannot be fully predicted (e.g., throwing a die).

Sample space: The set that contains all possible outcomes of an experiment. {1, 2, 3, 4, 5, 6}

Sample point (element): An outcome of an experiment. {3}

Event: A subset of the sample space whose points share some common characteristic. {2, 4, 6} even numbers

Complement of event A (Ac): The event containing all points not in A. {1, 3, 5}

Concept of Probability

Null event (ø): An event that has no sample points.

Union of events A and B (A ∪ B): The event that contains all points in A, in B, or in both.

Intersection (joint) of events A and B (A ∩ B, AB): The event that contains all points common to events A and B.

Mutually exclusive: Events A and B are mutually exclusive if the occurrence of A precludes the occurrence of B (A ∩ B = ø).

Relative frequency and probability: If event A is of interest and the experiment is repeated N times, with A occurring n(A) of those times, then the probability of A is the limiting relative frequency

$P(A) = \lim_{N \to \infty} \frac{n(A)}{N}$

Concept of Probability

(Venn diagram: events A and B within the sample space S.)

Concept of Probability

Joint probability: $P(A \cap B)$, the probability that both A and B occur. For the union, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

If A and B are mutually exclusive, $A \cap B = \emptyset$, then $P(A \cap B) = 0$ and $P(A \cup B) = P(A) + P(B)$.

Conditional probability: The probability of one event is influenced by the outcome of another event, $P(B|A) = \frac{P(A \cap B)}{P(A)}$.

Independent events: The occurrence of one event is not influenced by the occurrence of the other, so $P(A \cap B) = P(A)\,P(B)$.

(Venn diagram: events A and B within the sample space S.)

Bernoulli Trials

A Bernoulli trial is an experiment with two possible outcomes, success or failure. If the probability of success is p, then the probability of failure is (1 - p).

Number of ways to arrange k successes in n trials: $\binom{n}{k} = \frac{n!}{k!(n-k)!}$

Probability of exactly k successes in n trials: $P(k) = \binom{n}{k} p^k (1-p)^{n-k}$

Example 1


A binary symmetric channel (BSC) has an error probability Pe = 0.001 (i.e., the probability of receiving 0 when 1 is transmitted, or vice versa). Note that the channel behavior is symmetrical with respect to 0 and 1. A sequence of 8 binary digits is transmitted over this channel. Determine the probability of receiving exactly 2 digits in error.
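A quick numeric sketch of this Bernoulli-trial (binomial) calculation; the function name is only illustrative:

```python
from math import comb

def prob_k_errors(n, k, Pe):
    # Binomial probability of exactly k errors in n independent digits
    # sent over a BSC with per-digit error probability Pe.
    return comb(n, k) * Pe**k * (1 - Pe)**(n - k)

print(prob_k_errors(8, 2, 0.001))   # ~2.78e-05
```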

Example 2

In binary communication, one technique used to increase the reliability of a channel is to repeat a message several times. For example, we can send each message (0 or 1) three times. Hence, the transmitted digits are 000 (for message 0) or 111 (for message 1). Because of channel noise, we may receive any one of the eight possible combinations of three binary digits. The decision as to which message was transmitted is made by the majority rule. If Pe is the error probability of one digit and P(ϵ) is the probability of making a wrong decision in this scheme, find P(ϵ) in terms of Pe. If Pe = 0.01, what is P(ϵ)?
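One way to work it out: a wrong decision occurs when two or three of the three digits are in error, so by the Bernoulli-trial formula

$P(\epsilon) = \binom{3}{2} P_e^2 (1-P_e) + \binom{3}{3} P_e^3 = 3P_e^2 - 2P_e^3 \approx 2.98 \times 10^{-4} \ \text{ for } P_e = 0.01$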

Multiplication Rule for Conditional Probability

$P(A \cap B) = P(A)\,P(B|A)$

$P(A_1 A_2 \cdots A_n) = P(A_1)\,P(A_2|A_1)\,P(A_3|A_1 A_2) \cdots P(A_n|A_1 A_2 \cdots A_{n-1})$

Example

Suppose a box of diodes consists of Ng good diodes and Nb bad diodes. If five diodes are randomly selected, one at a time, without replacement, determine the probability of obtaining the diodes in the order good, bad, good, good, bad.
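Applying the chain rule above, with N = Ng + Nb diodes initially in the box, one possible write-out is

$P(\text{GBGGB}) = \frac{N_g}{N}\cdot\frac{N_b}{N-1}\cdot\frac{N_g-1}{N-2}\cdot\frac{N_g-2}{N-3}\cdot\frac{N_b-1}{N-4}, \qquad N = N_g + N_b$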

The Total Probability Theorem

Let n disjoint events A1, …, An form a partition of the sample space S such that

$\bigcup_{i=1}^{n} A_i = S \quad \text{and} \quad A_i \cap A_j = \emptyset \ \text{ if } i \neq j$

Then the probability of an event B can be written as

$P(B) = \sum_{i=1}^{n} P(B|A_i)\,P(A_i)$

This theorem simplifies the analysis of a more complex event of interest, B, by identifying all of its different causes Ai.

Example

The decoding of a data packet may be in error because of N distinct error patterns E1, E2, …, EN it encounters. These error patterns are mutually exclusive, each with probability P(Ei) = pi. When the error pattern Ei occurs, the data packet is incorrectly decoded with probability qi. Find the probability that the data packet is incorrectly decoded.
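Treating the error patterns Ei (together with the error-free case) as the partition in the total probability theorem, and assuming the packet is decoded correctly when no error pattern occurs, gives

$P(\text{incorrect decoding}) = \sum_{i=1}^{N} P(\text{incorrect}\mid E_i)\,P(E_i) = \sum_{i=1}^{N} q_i\, p_i$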

Bayes' Theorem

Bayes' theorem determines the likelihood of a particular cause of an event among many disjoint possible causes.

Theorem

Let n disjoint events A1, …, An form a partition of the sample space S. Let B be an event with P(B) >0. Then for j=1, …, n,

$P(A_j|B) = \frac{P(B|A_j)\,P(A_j)}{P(B)} = \frac{P(B|A_j)\,P(A_j)}{\sum_{i=1}^{n} P(B|A_i)\,P(A_i)}$

Example

A communication system always encounters one of three possible interference waveforms: F1, F2, or F3. The probabilities of encountering them are 0.8, 0.16, and 0.04, respectively. The communication system fails with probability 0.01, 0.1, and 0.4 when it encounters F1, F2, and F3, respectively. Given that the system has failed, find the probability that the failure is a result of F1, F2, or F3, respectively.
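A quick numerical sketch of the Bayes' theorem computation for this example (variable names are only illustrative):

```python
# Prior probabilities of each interference waveform and the
# conditional failure probabilities P(fail | Fi), as given above.
prior = [0.8, 0.16, 0.04]
p_fail_given_F = [0.01, 0.1, 0.4]

# Total probability of failure.
p_fail = sum(p * q for p, q in zip(prior, p_fail_given_F))

# Posterior probability of each cause given that a failure occurred.
posterior = [p * q / p_fail for p, q in zip(prior, p_fail_given_F)]
print(p_fail, posterior)   # 0.04, [0.2, 0.4, 0.4]
```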

Random Variable

A discrete random variable takes numerical values that result from mapping sample points (outcomes of an experiment) to numbers.

The outcomes of tossing a coin are {H, T}; we can assign 1 to heads and -1 to tails. The random variable is then X = {1, -1}.

$\sum_i P_x(x_i) = 1$

Random Variable

For two independent random variables X and Y (e.g., tossing two coins):

$P_{xy}(x_i, y_j) = P_x(x_i)\,P_y(y_j)$

$\sum_i \sum_j P_{xy}(x_i, y_j) = 1$

Example

A binary symmetric channel (BSC) has error probability Pe. The probability of transmitting 1 is Q, and that of transmitting 0 is 1 - Q. Determine the probabilities of receiving 1 and 0 at the receiver.
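Conditioning on which digit was sent (a direct use of the total probability theorem) gives

$P(\text{receive } 1) = Q(1 - P_e) + (1 - Q)P_e, \qquad P(\text{receive } 0) = (1 - Q)(1 - P_e) + Q P_e$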

Conditional Probabilities

If x and y are two RVs, the conditional probability of x = xi given y = yj is denoted by Px|y(xi|yj).

$\sum_i P_{x|y}(x_i|y_j) = \sum_j P_{y|x}(y_j|x_i) = 1$

$\sum_i \sum_j P_{xy}(x_i, y_j) = 1$

$P_y(y_j) = \sum_i P_{xy}(x_i, y_j)$

$P_x(x_i) = \sum_j P_{xy}(x_i, y_j)$

Conditional Probabilities

The marginal probabilities can also be obtained from the conditional probabilities:

$P_y(y_j) = \sum_i P_{y|x}(y_j|x_i)\,P_x(x_i)$

$P_x(x_i) = \sum_j P_{x|y}(x_i|y_j)\,P_y(y_j)$

Example

Over a certain binary communication channel, the symbol 0 is transmitted with probability 0.4 and 1 with probability 0.6. It is given that P(ϵ|0) = $10^{-6}$ and P(ϵ|1) = $10^{-4}$, where P(ϵ|xi) is the probability of error given that xi is transmitted. Determine P(ϵ), the error probability of the channel.
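Applying the total probability theorem over the transmitted symbols gives

$P(\epsilon) = P(0)\,P(\epsilon|0) + P(1)\,P(\epsilon|1) = 0.4 \times 10^{-6} + 0.6 \times 10^{-4} \approx 6.04 \times 10^{-5}$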

Cumulative Distribution Function (CDF)


A CDF, Fx(x), of an RV X is the probability that X takes a value less than or equal to x.

Properties of the CDF

1) Fx(x) ≥ 0

2) Fx(∞) = 1

3) Fx(-∞) = 0

4) Fx(x) is a nondecreasing function.

Example


In an experiment, a trial consists of four successive tosses of a coin. If we define an RV x as the number of heads appearing in a trial, determine Px(x) and Fx(x).
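For a fair coin, x is binomial with n = 4 and p = 1/2; a small sketch that tabulates Px(x) and Fx(x):

```python
from math import comb

n, p = 4, 0.5   # four tosses of a fair coin

# Px(k): probability of k heads; Fx(k): cumulative probability P(x <= k).
Px = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
Fx = [sum(Px[:k + 1]) for k in range(n + 1)]

for k in range(n + 1):
    print(k, Px[k], Fx[k])   # Px = 1/16, 4/16, 6/16, 4/16, 1/16
```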

Continuous Random Variable

A continuous random variable can take on a continuum of values.

px(x) is the probability density function (pdf) that describes the relative frequency of occurrence of different values of x.

Properties of the probability density function:

$p_x(x) \geq 0$

$\int_{-\infty}^{\infty} p_x(x)\,dx = 1$

$P(x_1 < x \leq x_2) = \int_{x_1}^{x_2} p_x(x)\,dx = F_x(x_2) - F_x(x_1)$

Cumulative distribution function:

$F_x(x) = \int_{-\infty}^{x} p_x(u)\,du, \qquad p_x(x) = \frac{dF_x(x)}{dx}$


The Gaussian (Normal) Random Variable

Standard Gaussian RV (µ = 0, σ = 1)

$p_x(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$

$F_x(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\,du$

$Q(y) = \frac{1}{\sqrt{2\pi}} \int_{y}^{\infty} e^{-u^2/2}\,du = 1 - F_x(y)$

$F_x(x) = P(\mathrm{x} \leq x) = 1 - Q(x), \qquad P(\mathrm{x} > x) = Q(x)$
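The Q function has no closed form, but it can be evaluated from the complementary error function, $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. A minimal sketch:

```python
from math import erfc, sqrt

def Q(x):
    # Gaussian tail probability P(X > x) for a standard normal X.
    return 0.5 * erfc(x / sqrt(2))

print(Q(0), Q(1), Q(3))   # 0.5, ~0.1587, ~1.35e-3
```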


The Gaussian (Normal) Random Variable

General Gaussian RV (m, σ)

$p_x(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2/2\sigma^2}$

$F_x(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(u-m)^2/2\sigma^2}\,du$

$F_x(x) = P(\mathrm{x} \leq x) = 1 - Q\!\left(\frac{x-m}{\sigma}\right), \qquad P(\mathrm{x} > x) = Q\!\left(\frac{x-m}{\sigma}\right)$

Example

Over a certain binary channel, messages m = 0 and m = 1 are transmitted with equal probability by using a negative and a positive pulse, respectively. The received pulse corresponding to 1 is p(t), shown in the figure, and the received pulse corresponding to 0 is -p(t). Let the peak amplitude of p(t) be Ap at t = Tp. The channel noise n(t) has a normal distribution with zero mean and standard deviation σn. Because of the channel noise, the received signal is

$r(t) = \pm p(t) + n(t)$

What is the probability of error Pe?

Example (cont.)

$P_e = \sum_i P(\epsilon, m_i) = \sum_i P(m_i)\,P(\epsilon|m_i)$

$P_e = P(0)\,P(\epsilon|0) + P(1)\,P(\epsilon|1)$

$P(\epsilon|0) = P(n > A_p) = Q\!\left(\frac{A_p}{\sigma_n}\right)$

$P(\epsilon|1) = P(n < -A_p) = Q\!\left(\frac{A_p}{\sigma_n}\right)$

$P_e = Q\!\left(\frac{A_p}{\sigma_n}\right)$

Joint Distribution

For two RVs x and y, the joint CDF is

$F_{xy}(x, y) = P(\mathrm{x} \leq x \text{ and } \mathrm{y} \leq y)$

$p_{xy}(x, y) = \frac{\partial^2}{\partial x\,\partial y} F_{xy}(x, y)$

$P(x_1 < \mathrm{x} \leq x_2,\; y_1 < \mathrm{y} \leq y_2) = \int_{x_1}^{x_2}\!\int_{y_1}^{y_2} p_{xy}(x, y)\,dy\,dx$

$p_x(x) = \int_{-\infty}^{\infty} p_{xy}(x, y)\,dy, \qquad p_y(y) = \int_{-\infty}^{\infty} p_{xy}(x, y)\,dx$

Conditional Densities

For two RVs x and y with joint density $p_{xy}(x, y)$, the conditional densities are

$p_{x|y}(x|y) = \frac{p_{xy}(x, y)}{p_y(y)}, \qquad p_{y|x}(y|x) = \frac{p_{xy}(x, y)}{p_x(x)}$

Bayes' rule: $p_{x|y}(x|y)\,p_y(y) = p_{y|x}(y|x)\,p_x(x)$

Independent random variables:

$p_{x|y}(x|y) = p_x(x), \qquad p_{y|x}(y|x) = p_y(y), \qquad p_{xy}(x, y) = p_x(x)\,p_y(y)$

Rayleigh Density Example

Derive the Rayleigh probability density function (pdf):

$p_r(r) = \begin{cases} \frac{r}{\sigma^2}\, e^{-r^2/2\sigma^2}, & r \geq 0 \\ 0, & r < 0 \end{cases}$
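The Rayleigh RV arises as the envelope $r = \sqrt{x^2 + y^2}$ of two independent zero-mean Gaussian RVs with equal variance $\sigma^2$. A quick Monte Carlo sketch (sample size and seed are arbitrary choices) comparing the empirical mean with the theoretical value $\sigma\sqrt{\pi/2}$:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0

# Envelope of two independent zero-mean Gaussian components.
x = rng.normal(0.0, sigma, 100_000)
y = rng.normal(0.0, sigma, 100_000)
r = np.hypot(x, y)

print(r.mean(), sigma * np.sqrt(np.pi / 2))   # both ~2.51
```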

Statistical Averages (MEANS)

The average (expected) value of an RV x:

$\bar{x} = E[x] = \sum_{i=1}^{n} x_i\, P_x(x_i)$ (discrete)

$\bar{x} = E[x] = \int_{-\infty}^{\infty} x\, p_x(x)\,dx$ (continuous)

Mean of a function g(x) of a random variable x:

$\overline{g(x)} = \sum_{i=1}^{n} g(x_i)\, P_x(x_i)$

$\overline{g(x)} = \int_{-\infty}^{\infty} g(x)\, p_x(x)\,dx$

For example, the random variable x could represent the alphabetic letters and g(x) the PCM code assigned to each letter.

Example

The output voltage of a sinusoid generator is A cos(ωt). This output is sampled randomly. The sampled output is an RV x, which can take on any value in the range (-A, A). Determine the mean value and the mean square value of the sampled output.
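One way to see the answer: random sampling amounts to picking the sampling instant (equivalently the phase θ = ωt) uniformly over one period, so

$\bar{x} = \frac{1}{2\pi}\int_{0}^{2\pi} A\cos\theta\,d\theta = 0, \qquad \overline{x^2} = \frac{1}{2\pi}\int_{0}^{2\pi} A^2\cos^2\theta\,d\theta = \frac{A^2}{2}$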

Statistical Averages (MEANS)

Mean of the sum:

$\overline{g_1(x) + g_2(y)} = \overline{g_1(x)} + \overline{g_2(y)}$

Mean of the product:

$\overline{g_1(x)\,g_2(y)} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g_1(x)\,g_2(y)\, p_{xy}(x, y)\,dx\,dy$

If the RVs x and y are independent, then

$\overline{g_1(x)\,g_2(y)} = \int_{-\infty}^{\infty} g_1(x)\, p_x(x)\,dx \int_{-\infty}^{\infty} g_2(y)\, p_y(y)\,dy = \overline{g_1(x)}\;\overline{g_2(y)}$

Moments

The nth moment of an RV x:

$\overline{x^n} = \int_{-\infty}^{\infty} x^n\, p_x(x)\,dx$

The nth central moment of an RV x:

$\overline{(x - \bar{x})^n} = \int_{-\infty}^{\infty} (x - \bar{x})^n\, p_x(x)\,dx$

The variance (square of the standard deviation):

$\sigma_x^2 = \overline{(x - \bar{x})^2} = \overline{x^2} - \bar{x}^2$

Example

Find the mean, variance, and mean square value of the uniform quantization error in PCM.
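If the quantization error q is assumed uniform over one quantization interval of width Δv, i.e., over (-Δv/2, Δv/2), then

$\bar{q} = 0, \qquad \sigma_q^2 = \overline{q^2} = \frac{1}{\Delta v}\int_{-\Delta v/2}^{\Delta v/2} q^2\,dq = \frac{(\Delta v)^2}{12}$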

Example

Find the variance and the mean square error caused by channel noise in PCM.

Variance of a Sum of Independent RVs

$z = x + y \;\Rightarrow\; \sigma_z^2 = \sigma_x^2 + \sigma_y^2$

Example: Find the total mean square error in PCM.

(Block diagram: the message sample m is quantized and then sent over the channel, arriving as m̃; q denotes the quantization error and ϵ the error caused by channel noise.)

Chebyshev’s Inequality

The standard deviation σ of an RV x is a measure of the width of its PDF. In communications, the standard deviation is also used to estimate the bandwidth of a signal spectrum.

$P(|x - \bar{x}| \leq k\sigma_x) \geq 1 - \frac{1}{k^2}$

For a zero-mean RV: $P(|x| \leq k\sigma_x) \geq 1 - \frac{1}{k^2}$

Correlation

The covariance is a measure of the nature of the dependence between the RVs x and y:

$\sigma_{xy} = \overline{(x - \bar{x})(y - \bar{y})} = \overline{xy} - \bar{x}\,\bar{y}$

The correlation coefficient is the normalized covariance:

$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}, \qquad -1 \leq \rho_{xy} \leq 1$

Independent variables are uncorrelated; the converse is not necessarily true.

Linear Mean Square Estimation

When two random variables x and y are related (dependent), it is possible to estimate the value of y from knowledge of the value of x.

Linear estimate: $\hat{y} = ax$, with mean square error $\overline{\epsilon^2} = \overline{(y - \hat{y})^2}$

Minimum mean square error is one possible criterion for the estimation of y. The optimum estimate chooses a to make

$\frac{\partial \overline{\epsilon^2}}{\partial a} = 2a\,\overline{x^2} - 2\,\overline{xy} = 0 \;\Rightarrow\; a = \frac{\overline{xy}}{\overline{x^2}} = \frac{R_{xy}}{R_{xx}}$

Mean square error:

$\overline{\epsilon^2} = \overline{(y - ax)^2} = \overline{(y - ax)y} - a\,\overline{\epsilon x}$

$\overline{\epsilon x} = \overline{\left(y - \frac{R_{xy}}{R_{xx}}x\right)x} = 0$

$\overline{\epsilon^2} = \overline{(y - ax)y} = R_{yy} - a R_{xy}$
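A small simulation sketch of this estimator; the correlated data model below is just an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative correlated data: y depends linearly on x plus noise.
x = rng.normal(0.0, 1.0, 50_000)
y = 2.0 * x + rng.normal(0.0, 0.5, 50_000)

# Optimum linear coefficient a = Rxy / Rxx and the resulting MSE.
a = np.mean(x * y) / np.mean(x * x)
mse = np.mean(y * y) - a * np.mean(x * y)   # Ryy - a*Rxy

print(a, mse)   # a ~ 2.0, mse ~ 0.25
```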

Using n Random Variables for Estimation

Use n random variables x1, x2, …, xn to estimate a random variable x0:

$\hat{x}_0 = a_1 x_1 + a_2 x_2 + \dots + a_n x_n$

$\overline{\epsilon^2} = \overline{\left[x_0 - (a_1 x_1 + a_2 x_2 + \dots + a_n x_n)\right]^2}$

$\frac{\partial \overline{\epsilon^2}}{\partial a_i} = -2\,\overline{\left[x_0 - (a_1 x_1 + a_2 x_2 + \dots + a_n x_n)\right] x_i} = 0$

$R_{0i} = a_1 R_{i1} + a_2 R_{i2} + \dots + a_n R_{in}, \quad \text{where } R_{ij} = \overline{x_i x_j}$

$\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1n} \\ R_{21} & R_{22} & \cdots & R_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ R_{n1} & R_{n2} & \cdots & R_{nn} \end{bmatrix}^{-1} \begin{bmatrix} R_{01} \\ R_{02} \\ \vdots \\ R_{0n} \end{bmatrix}$

$\overline{\epsilon^2} = R_{00} - (a_1 R_{01} + a_2 R_{02} + \dots + a_n R_{0n})$

Example

In differential pulse code modulation (DPCM), instead of transmitting sample values directly, we estimate (predict) the value of each sample from the knowledge of the previous n samples. The estimation error ϵk, the difference between the actual and the estimated value of the kth sample mk, is quantized and transmitted. Because the estimation error ϵk is smaller than the sample value mk, for the same number of quantization levels the SNR is increased. The SNR improvement is equal to the ratio of the mean square value of the speech signal to the mean square value of the estimation error.

Find the optimum linear second-order predictor and the corresponding SNR improvement.
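A sketch of the second-order predictor using the normal equations from the previous slide; the autocorrelation values R0, R1, R2 below are purely hypothetical placeholders for a speech signal's correlations at lags 0, 1, and 2:

```python
import numpy as np

# Hypothetical autocorrelations of the speech samples at lags 0, 1, 2.
R0, R1, R2 = 1.0, 0.8, 0.6

# Normal equations: [R11 R12; R21 R22][a1; a2] = [R01; R02],
# where Rij depends only on the lag |i - j|.
R = np.array([[R0, R1],
              [R1, R0]])
r = np.array([R1, R2])
a1, a2 = np.linalg.solve(R, r)

# Prediction mean square error and SNR improvement (in dB).
mse = R0 - (a1 * R1 + a2 * R2)
print(a1, a2, 10 * np.log10(R0 / mse))
```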

Sum of Random Variables

$z = x + y$

How does the pdf of z relate to the pdfs of x and y?

$F_z(z) = P(\mathrm{z} \leq z) = P(\mathrm{x} + \mathrm{y} \leq z) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{z-x} p_{xy}(x, y)\,dy$

$p_z(z) = \frac{dF_z(z)}{dz} = \int_{-\infty}^{\infty} p_{xy}(x, z - x)\,dx$

If x and y are independent random variables:

$p_z(z) = \int_{-\infty}^{\infty} p_x(x)\,p_y(z - x)\,dx$

The PDF of z is the convolution of the PDFs of x and y.
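A numerical illustration of this convolution for two independent uniform RVs on (0, 1), whose sum has the triangular pdf on (0, 2):

```python
import numpy as np

dx = 0.001
x = np.arange(0.0, 1.0, dx)

# pdfs of two independent uniform(0, 1) random variables.
px = np.ones_like(x)
py = np.ones_like(x)

# pdf of z = x + y: convolution of the two pdfs.
pz = np.convolve(px, py) * dx
z = np.arange(len(pz)) * dx

print(pz[np.searchsorted(z, 1.0)])   # ~1.0: peak of the triangular pdf at z = 1
```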

Sum of Gaussian Random Variables

If x1 and x2 are independent Gaussian RVs and y = x1 + x2, then y is a Gaussian RV with

$\sigma_y^2 = \sigma_{x_1}^2 + \sigma_{x_2}^2$

If x1 and x2 are jointly Gaussian but not necessarily independent, then y = x1 + x2 is still Gaussian, with

$\sigma_y^2 = \sigma_{x_1}^2 + \sigma_{x_2}^2 + 2\sigma_{x_1 x_2}$

The sum of jointly distributed Gaussian random variables is a Gaussian random variable regardless of their relationship, such as independence.

Sum of Gaussian Random Variables

The fact that the sum of jointly distributed Gaussian random variables is also a Gaussian random variable has important practical applications.

For example, if {xk} is a sequence of jointly Gaussian signal samples passing through a discrete-time filter with impulse response {hi}, then the filter output y is also Gaussian:

$y = \sum_{i \geq 0} h_i\, x_{k-i}$

The Central Limit Theorem

The sum of a large number of independent RVs tends toward a Gaussian random variable, regardless of the probability densities of the variables being added.

The Central Limit Theorem (for the Sample Mean)

Let x1, x2, …, xn be independent random variables from a given distribution with mean µ and variance σ², where 0 < σ² < ∞. Then the sample mean

$\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i$

is approximately a Gaussian random variable with mean µ and variance σ²/n:

$\lim_{n \to \infty} P\!\left[\frac{\bar{x}_n - \mu}{\sigma/\sqrt{n}} \leq x\right] = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-v^2/2}\,dv$

$\lim_{n \to \infty} P\!\left[\frac{\bar{x}_n - \mu}{\sigma/\sqrt{n}} > x\right] = Q(x)$

Also, the sum x1 + x2 + … + xn is approximately a Gaussian random variable with mean nµ and variance nσ².

Example


Consider a communication system that transmits a data packet of 1024 bits. Each bit can be in error with probability $10^{-2}$. Find the (approximate) probability that more than 30 of the 1024 bits are in error.
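A sketch of the Gaussian (CLT) approximation: the number of bit errors has mean np and variance np(1 - p), so P(more than 30 errors) ≈ Q((30 - np)/√(np(1 - p))). The Q() helper from the earlier Gaussian example is reused here:

```python
from math import erfc, sqrt

def Q(x):
    # Standard Gaussian tail probability.
    return 0.5 * erfc(x / sqrt(2))

n, p = 1024, 1e-2
mean = n * p                     # 10.24
std = sqrt(n * p * (1 - p))      # ~3.18

print(Q((30 - mean) / std))      # ~3e-10 (Gaussian approximation)
```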