farag - random variables
TRANSCRIPT
7/31/2019 Farag - Random Variables
1/38
Farag: Chapter 4: Basics of Statistical Modeling, Preprint 2011 | 1
Contents
Chapter 4: Basics of Statistical Modeling ............................ 3
4.1 Introduction ..................................................... 3
4.2 Statistical Experiments .......................................... 4
4.3 Random Variables ................................................. 8
4.3.1 Basic Concepts ................................................. 8
4.3.2 Properties of the CDF and the PDF of a random variable ........ 13
4.3.3 The Conditional Distribution .................................. 15
4.3.4 Statistical Expectation ....................................... 17
4.3.5 Functions of Random Variable .................................. 22
4.4 Two Random Variables ............................................ 25
4.4.1 Statistical Expectation in Two Dimensions ..................... 29
4.4.2 Functions of Two Random Variables ............................. 31
4.4.3 Two Functions of Two Random Variables ......................... 33
Fig. 4.1: Illustration of the random variable as a mapping function from the sample space of the sigma algebra into Borel sets on the real line. ...... 8
Fig. 4.2: Illustration of the domain and range of a discrete-valued random variable. ...... 9
Fig. 4.3: The CDF of a mixed random variable. ...... 12
Fig. 4.4: Illustration of the measurability of functions of random variables. ...... 23
Fig. 4.5: Illustration of the concept of two-dimensional random vectors for two random variables on the same sample space. ...... 26
Fig. 4.6: Illustration of the region of integration for the marginal density. ...... 28
4.2 Statistical Experiments
We follow the tradition of describing a statistical experiment as a triple $E = \{\Omega, \mathcal{F}, P\}$.
Sample space $\Omega$: The sample space enumerates the elementary outcomes of an experiment and, by construction, various combinations of them as well. For example, if $\omega_1, \omega_2, \dots$ are the elementary outcomes of $E$ (i.e., its single-valued possibilities, or the entities of its alphabet), it is a natural requirement that combinations of these outcomes, which may also happen in $E$, should be included in $\Omega$. However, events may be constructed arbitrarily, and some may never happen! Hence, it is the tradition to list only the elementary outcomes (elements of the alphabet) in $\Omega$, and the events (all possible combinations of the elementary outcomes, or the dictionary of the events in the experiment) are given a special entity, denoted by a field or algebra $\mathcal{F}$.
Field (algebra) $\mathcal{F}$: As the events need to be scaled or measured, an enumeration procedure is needed to generate the dictionary of events, including those that may not happen; these will be given zero weight or measure. Therefore, a desirable characteristic of $\mathcal{F}$ is that it be closed under all countable (possibly infinite in number) combinations of elementary events, such as unions. Specifically, we require for events $A \in \mathcal{F}$ and $A_1, A_2, \dots \in \mathcal{F}$ that:

a) The empty set $\emptyset$ (or null event) and the sample space $\Omega$ (sure event) must be in $\mathcal{F}$.
b) $A \in \mathcal{F}$ implies that $A^c \in \mathcal{F}$ for all $A$.
c) If $A_1, A_2, \dots \in \mathcal{F}$ then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$.

The above requirements are all natural for the field $\mathcal{F}$ to be an inclusive space that spans (includes) all measurable events.
Probability measure P:
The probability measure $P$ is a function that assigns weights to the events in $\mathcal{F}$. It is the tradition to have a positive scale normalized to the unit interval. It is natural that events may be related (dependent) or unrelated (independent) with respect to each other. Events also may be joint (share some elementary outcomes) or disjoint (share no elementary outcomes) in their construction. Therefore, a measure $P$ satisfies the following requirements (axioms) [Ash]:

a) $0 \le P(A) \le 1$ for all events $A \in \mathcal{F}$.
b) The empty set (or null event) has zero weight and the sample space (sure event) has a weight of 1; that is: $P(\emptyset) = 0$ and $P(\Omega) = 1$.
c) If the events $A_1, A_2, \dots$ are mutually disjoint, then $P\!\left(\bigcup_i A_i\right) = \sum_i P(A_i)$.
As a consequence of axioms b and c, if $A^c$ is the complement event of $A$, then $P(A^c) = 1 - P(A)$.

Example 1: Consider a histology experiment conducted on cells from a tissue specimen taken during a procedure to determine the existence of cancer. In this experiment, tissue is treated and viewed under a microscope, with viewed cells labeled cancerous (C) or noncancerous (NC). As biopsies are performed only after reasonable doubts justify doing them, the weights of the C and NC events differ with the procedure. A possible statistical experiment may be described as follows: $\Omega = \{C, NC\}$, $\mathcal{F} = \{\emptyset, \{C\}, \{NC\}, \Omega\}$, and $P$ assigns weights with $P(\{C\}) + P(\{NC\}) = 1$.

Example 2: Consider the same experiment as in Example 1, in which viewing the cells under the microscope revealed a conclusion contradictory to the agent used to prepare the tissue. The histologist's chart showed unspecified cells (US) under these circumstances. That is, the sample space became $\Omega = \{C, NC, US\}$. As these events are disjoint, the sigma algebra would be simply the collection of five possibilities $\emptyset, \{C\}, \{NC\}, \{US\},$ and $\Omega$, and the probability measure will be such that $P(\{C\}) + P(\{NC\}) + P(\{US\}) = 1$.

Example 3: Consider the experiment of recording the life span of cells in a specimen under the effect of an agent. Suppose that the viewing domain of the microscope allows individual cell viewing and tracking, and that it is possible to tag each cell as it moves within the specimen until
its death. It is expected that a counting experiment will be a tedious one, especially with large degrees of motion of the cells; yet a histogram may be constructed indicating the number of cells that die over time.
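Such a counting experiment can be sketched in a few lines of code. The exponential lifespan law, its mean, and the bin width below are hypothetical choices made only for illustration, not part of the experiment described above:

```python
import random

def lifespan_histogram(n_cells=10000, mean_life=5.0, bin_width=1.0, n_bins=10, seed=42):
    """Simulate cell lifespans and count how many cells die within each time bin."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_cells):
        t = rng.expovariate(1.0 / mean_life)     # hypothetical exponential lifespan
        b = min(int(t / bin_width), n_bins - 1)  # the last bin collects the tail
        counts[b] += 1
    return counts

counts = lifespan_histogram()
```

Each entry of `counts` approximates the number of deaths in one time interval; under this model the histogram decays roughly exponentially.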
Events containing similar elementary outcomes are called joint, while those with no common elementary outcomes are called disjoint. Probability measures on disjoint events are additive. Events with no bearing on each other are called independent. Probability measures on independent events are multiplicative. As cause and effect is a natural rule of physics, it is also logical to ask about the likelihood of related events with respect to each other. The conditional probability measure addresses this issue, providing the probability of an event, given that a second event took place, as their joint probability normalized by the probability of the second event. We formally define these concepts as follows.
Definition 1: Two events $A$ and $B$ on the same sample space $\Omega$ are said to be disjoint if their intersection is empty; i.e., $A \cap B = \emptyset$.

Definition 2: Two events $A$ and $B$, not necessarily on the same sample space $\Omega$, are said to be independent if their joint probability measure is multiplicative; i.e., $P(A \cap B) = P(A)\,P(B)$.

Definition 3: The conditional probability measure of event $A$ given the occurrence of event $B$, where $A$ and $B$ are not necessarily on the same sample space $\Omega$, is denoted by $P(A \mid B)$ and is defined by:

$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)},$ provided that $P(B) \neq 0$.

The expression for the conditional probability in Definition 3 is usually called Bayes' Theorem. As a direct corollary to the above definitions, independent events do not affect the value of the conditional probability measure. Likewise, disjoint events have a zero-valued joint probability measure and, subsequently, a zero-valued conditional probability measure.
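Definitions 1-3 can be illustrated on a small finite experiment; the six-outcome sample space and the uniform measure below are hypothetical choices made only for this sketch:

```python
from fractions import Fraction

# A uniform measure on a hypothetical six-outcome sample space.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event as the fraction of outcomes it contains."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}     # "even outcome"
B = {1, 2, 3, 4}  # "at most four"
C = {5}           # disjoint from A

def cond(X, Y):
    """P(X | Y) = P(X ∩ Y) / P(Y), provided P(Y) != 0 (Definition 3)."""
    return P(X & Y) / P(Y)
```

Here $A$ and $C$ are disjoint (zero joint measure), while $A$ and $B$ happen to be independent, so conditioning on $B$ leaves the measure of $A$ unchanged.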
Example 4: Consider events $A$ and $B$ such that $A$ is a subset of $B$; symbolically, $A \subseteq B$. It follows that the conditional probability $P(B \mid A) = 1$ and that the joint probability measure $P(A \cap B) = P(A)$.

A stimulus may affect a set of events even though they may be disjoint. In this case, conditional probabilities enable measuring the overall effect of that stimulus. For example, suppose that an experiment produces three disjoint events $A_1, A_2, A_3$ such that $A_1 \cup A_2 \cup A_3 = \Omega$. We can measure the effect of a stimulus (i.e., another event) $B$ on the set of events by noting that $B = (B \cap A_1) \cup (B \cap A_2) \cup (B \cap A_3)$, and since the intersections are disjoint as well, the probability measure is additive with respect to the components; i.e.,

$P(B) = P(B \cap A_1) + P(B \cap A_2) + P(B \cap A_3),$

which can be written in terms of conditional probabilities as:

$P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) + P(B \mid A_3)P(A_3).$

This can be generalized to a larger number of events in what is known as the total probability theorem, which is stated below.
Theorem 1: Consider a set of events $A_1, A_2, \dots, A_n \in \mathcal{F}$ with $\bigcup_{i=1}^{n} A_i = \Omega$. Suppose the events are mutually disjoint; i.e., they partition the sample space. Consider an event $B \in \mathcal{F}$. We can show that

$P(B) = \sum_{i=1}^{n} P(B \mid A_i)\, P(A_i).$

The proof of the total probability theorem follows from the fact that the probability measure of disjoint events is additive [Ash].

In general, listing the events in $\Omega$ and $\mathcal{F}$ will not be a simple task, if not an impossible one, in real-world problems. Likewise, obtaining the corresponding functional form for the probability measure will be a rough task to accomplish. A logical task that statisticians have faced has been to search for automatic procedures to generate events and their measures for particular experiments. Indeed, a general approach has evolved to accomplish this task through the concept of random functions, commonly referred to as random variables. This type of function maps events on the sample space and the sigma algebra, which may be quite abstract, into another representation on the Cartesian domain, on which assigning a scale (measure or weight) becomes possible using the common tools of calculus. We examine this concept next as a prelude to describing multivariable experiments and complicated stochastic phenomena.
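The total probability theorem can be checked directly on a toy partition; the prior and conditional values below are hypothetical numbers chosen for illustration:

```python
from fractions import Fraction

# A hypothetical partition A1, A2, A3 of the sample space with prior measures,
# and the conditional measures of an event B given each partition element.
prior  = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # P(A_i), sums to 1
cond_B = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 1)]  # P(B | A_i)

# Total probability theorem: P(B) = sum_i P(B | A_i) * P(A_i)
P_B = sum(p * c for p, c in zip(prior, cond_B))
```

With these numbers, $P(B) = \tfrac{1}{8} + \tfrac{1}{6} + \tfrac{1}{6} = \tfrac{11}{24}$.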
4.3 Random Variables
4.3.1 Basic Concepts
A random variable $X$ is a mapping from the sigma algebra into the real line such that $\{X \le x\}$ is an event for every $x \in \mathbb{R}$. As illustrated in Fig. 1, the random variable is a mapping from events in $\mathcal{F}$ into other forms of events (sets) defined on the real line; these sets are called Borel sets. For a compact representation, we insist that the inverse mapping of the Borel sets yields events on $\mathcal{F}$; i.e., the set $X^{-1}(B) \in \mathcal{F}$ for all Borel sets $B$ [Peiper].

To avoid potential confusion about the difference between the sample space and the sigma algebra vis-à-vis the random variable, we use, as stated before, the alphabet of the experiment to represent $\Omega$ and its dictionary to represent $\mathcal{F}$; hence, $X$ is technically a mapping from $\mathcal{F}$ into $\mathbb{R}$. Of course, it is understood that events in $\mathcal{F}$ are formed of elementary events, i.e., must subsequently be contained in $\Omega$. Like any function, $X$ has a domain, which will be $\mathcal{F}$, and a range, which is the real line $\mathbb{R}$. Hence, a shorthand for this relationship would be $X: \mathcal{F} \to \mathbb{R}$. As domains and ranges may be continuous or discrete, we will be able to consider cases where the mapping is between continuous entities, in which case $X$ will be denoted a continuous random variable, and between discrete entities, in which case $X$ will be denoted a discrete random variable. As the mapping is between events, it is natural that the weights (measures) at the two ends of the mapping be the same. That is, in the illustration shown in Fig. 1, we must have $P(A) = P(X \in B)$, where $B$ is the Borel-set image of the event $A$.
Fig. 4.1: Illustration of the random variable as a mapping function from the sample space of the sigma algebra into Borel sets on the real line.
We shall provide a few illustrative examples to demonstrate the power of the random-variable concept in describing a statistical experiment.

Example 5: Consider the experiment in Example 1 above. Suppose we have a function $X(\cdot)$ which takes an arbitrary value of 1 for C and $-1$ for NC. It is evident that we can establish events $\{X \le x\}$. In particular, we have the following possibilities:

$\{X \le x\} = \emptyset$ for $x < -1$; $\{X \le x\} = \{NC\}$ for $-1 \le x < 1$; and $\{X \le x\} = \Omega$ for $x \ge 1$.

The above example illustrates a discrete-valued random variable, which has a finite range and domain. Fig. 4.2 illustrates this mapping.

Fig. 4.2: Illustration of the domain and range of a discrete-valued random variable.

Example 6: Suppose a statistical experiment with a sample space consisting of eight elementary outcomes. Consider a function defined on this sample space. We may illustrate that it is a discrete-valued random variable by examining whether $\{X \le x\}$ is an event for every $x$. It is evident that the corresponding sets are events in $\mathcal{F}$.
Hence, $X$ is a random variable.

The above illustrates the value of the random-variable concept as an event generator. Equally important is to see whether the random variable will also enable measurability of the generated events. In other words, can we evaluate $P(X \le x)$? The ability to evaluate this quantity achieves the measurability requirements for the events in $\mathcal{F}$, and thus completely describes the statistical experiment. Indeed, the quantity $P(X \le x)$ has a special name in mathematical statistics: it is known as the cumulative probability distribution function (CDF). From its definition, the event $\{X \le x\}$ shrinks as $x \to -\infty$ and expands as $x \to \infty$; therefore, the CDF is non-decreasing. It is also continuous from the right; that is:

$F_X(x) = \lim_{\epsilon \downarrow 0} F_X(x + \epsilon).$

As the CDF is a fundamental concept in the theory of statistics, we shall examine it in further detail. For ease of notation, we shall connote the CDF $P(X \le x)$ by the special symbol $F_X(x)$. To keep the notation consistent, we define the CDF formally as follows:

Definition 3: Given a random variable $X$ defined for a statistical experiment $E = \{\Omega, \mathcal{F}, P\}$, the cumulative probability distribution function (CDF) of $X$ is a function on the real line which measures the event $\{X \le x\}$. We connote this function as $F_X(x) = P(X \le x)$ for every Borel set $\{X \le x\}$, $x \in \mathbb{R}$.

The CDF, like any function, may be studied with the tools of calculus. Therefore, the description of statistical experiments, and indeed statistical theory itself, is transitioned from a seemingly guessing game into a well-founded mathematical theory through the concept of the random variable. The monotonicity, and hence the continuity behavior, of $F_X(x)$ is evident from the definition. We can easily infer the following properties of the CDF.

Fact: The CDF has the following properties: (i) $F_X(-\infty) = 0$ and $F_X(\infty) = 1$; (ii) $F_X(x)$ is differentiable over $\mathbb{R}$ (almost everywhere).
Definition 4: The probability density function (PDF) of a random variable $X$ is the derivative of the CDF; i.e., $f_X(x) = \dfrac{dF_X(x)}{dx}$. We shall see later on that both $F_X(x)$ and $f_X(x)$ enable various studies related to the experiment under consideration.

The differentiability of the CDF is a functional property due to its continuity, and it does not imply that the random variable itself (the function that maps events to Borel sets on the real line) is continuous or discrete. The distinction between continuous and discrete random variables is in the form of the Borel sets that are generated by the mapping. A discrete random variable means that its domain may take only finite but discrete values; i.e., the Borel sets are discrete, while a continuous random variable has a continuous domain; i.e., the Borel sets are continuous. As a consequence of these characteristics, we may be able to define the probability of a point event for discrete random variables; i.e., $P(X = x_i) > 0$. Point events for continuous random variables are empty sets; i.e., their measures will be zero. The implications of these characteristics and the differentiability of $F_X(x)$ will be such that the following distinctions are made between discrete and continuous random variables:

$F_X(x) = \begin{cases} \sum_{x_i \le x} P(X = x_i), & X \text{ discrete} \\ \int_{-\infty}^{x} f_X(u)\, du, & X \text{ continuous} \end{cases} \quad (1)$

For the integer-valued case, the set $\{x_i \le x\}$ will be integer valued and the upper limit of the summation in Eq. (1) would be easy to determine. If we use the generalized-function representation, we may express the PDF for discrete random variables in terms of the Dirac delta function; i.e., $f_X(x) = \sum_i P(X = x_i)\, \delta(x - x_i)$; hence, a unified expression for the CDF may be written for both continuous and discrete random variables in terms of an integral only; i.e., $F_X(x) = \int_{-\infty}^{x} f_X(u)\, du$. In practice, however, a clear distinction will be implicit in terms of the definition of the experiment and the values that events may take. Indeed, in principle, we
may define a random variable to describe events that have both continuous and discrete domains. We provide below some illustrative examples of the CDF and PDF for various random variables.
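The discrete branch of Eq. (1) is simply a running sum of point masses. A minimal sketch follows; the two point masses below are hypothetical, echoing the two-valued variable of Example 5:

```python
def discrete_cdf(masses, x):
    """CDF of a discrete random variable: F(x) = sum of P(X = xi) over xi <= x."""
    return sum(p for xi, p in masses.items() if xi <= x)

# Hypothetical point masses at -1 and 1 (cf. Example 5)
masses = {-1: 0.3, 1: 0.7}
```

The resulting CDF is a staircase: 0 below the smallest point, then jumping by each mass, and reaching 1 past the largest point.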
Example 7: Consider a random variable with the CDF shown in Fig. 4.3, which has a piecewise functional form with a jump. We note that the CDF in this example is not differentiable at the jump point, where the CDF has a discontinuity whose size equals the point probability mass at that location. We can address the behavior of the CDF at the jump by noting that the left and right limits differ by exactly that mass, which may be represented by a Dirac delta term in the PDF.

Fig. 4.3: The CDF of a mixed random variable.

Hence, functional analysis enables us to address the various scenarios that the CDF may have.
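The jump behavior can be made concrete with a small sketch; the particular CDF below (a jump of size 0.5 at the origin followed by a linear ramp) is a hypothetical stand-in for the one in Fig. 4.3:

```python
def mixed_cdf(x):
    """A hypothetical mixed CDF: jump of size 0.5 at x = 0, then a ramp to 1 at x = 1."""
    if x < 0.0:
        return 0.0
    if x < 1.0:
        return 0.5 + 0.5 * x
    return 1.0

# The size of the jump at 0 equals the point mass P(X = 0).
jump = mixed_cdf(0.0) - mixed_cdf(-1e-9)
```

The difference between the right and left limits at the discontinuity recovers the point probability mass.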
4.3.2 Properties of the CDF and the PDF of a random variable
As the CDF is a probability measure for the event $\{X \le x\}$, the positivity and the upper bound of the CDF are evident. Likewise, as the name implies, the CDF is non-decreasing and monotone; hence, the continuity from the right is evident. We can also infer the properties of the CDF for disjoint events as well as independent events using the basic definitions. We also note that the following identity follows straight from the definition of the CDF:

i) $P(a < X \le b) = F_X(b) - F_X(a)$.

The PDF, being a derivative of the CDF, possesses the positivity and summability (integrability) constraints. The PDF for a discrete random variable may be written as:

$f_X(x) = \sum_i P(X = x_i)\, \delta(x - x_i).$

The most important virtue of the CDF is its ability to provide a computational mechanism (or algorithm) to evaluate the probability measure, using the probability mass of the Borel sets corresponding to the events on the sigma algebra, through summation (discrete random variable) or integration (continuous random variable). Some examples will illustrate these concepts.
Example 8: Consider a statistical experiment defined within the boundaries illustrated in Figure 1. Suppose the mass function of the Borel sets has an exponential form: $f_X(x) = \lambda e^{-\lambda x}$, $x \ge 0$. We can evaluate the probability measure of an event by the mass of its equivalent Borel set. In particular, we can evaluate the probability of such events as:

$P(a < X \le b) = \int_a^b \lambda e^{-\lambda x}\, dx = e^{-\lambda a} - e^{-\lambda b}.$
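The computation in Example 8 amounts to a difference of CDF values; a sketch, with the exponential rate treated as a free parameter:

```python
import math

def expo_cdf(x, lam=1.0):
    """CDF of the exponential density f(x) = lam * exp(-lam * x), x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def prob_interval(a, b, lam=1.0):
    """P(a < X <= b) = F(b) - F(a): the mass of the Borel set (a, b]."""
    return expo_cdf(b, lam) - expo_cdf(a, lam)
```

For $\lambda = 1$, the mass of $(1, 2]$ is $e^{-1} - e^{-2}$, exactly as the closed form above predicts.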
Over the long history of statistics, various functional forms for $f_X(x)$ have resulted from specific practical experiments. Some of the most common PDFs are listed in Table 1.

Table 1: Common PDF functional forms.

Name        | Continuous PDF                                                                  | Discrete PDF
Uniform     | $f_X(x) = \frac{1}{b-a}$, $a \le x \le b$                                       | $P(X = x_i) = \frac{1}{n}$, $i = 1, \dots, n$
Exponential | $f_X(x) = \lambda e^{-\lambda x}$, $x \ge 0$                                    | $P(X = k) = (1 - e^{-\lambda})\, e^{-\lambda k}$, $k = 0, 1, \dots$ (discrete-valued)
Gaussian    | $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2 / 2\sigma^2}$             | N/A
Cauchy      | $f_X(x) = \frac{\alpha/\pi}{x^2 + \alpha^2}$                                    | N/A
Laplace     | $f_X(x) = \frac{\lambda}{2} e^{-\lambda |x|}$                                   | N/A
Rayleigh    | $f_X(x) = \frac{x}{\sigma^2} e^{-x^2 / 2\sigma^2}$, $x \ge 0$                   | N/A
Maxwell     | $f_X(x) = \sqrt{\frac{2}{\pi}}\, \frac{x^2}{\sigma^3} e^{-x^2/2\sigma^2}$, $x \ge 0$ | N/A
Gamma       | $f_X(x) = \frac{\beta^\alpha x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)}$, $x \ge 0$ | N/A
Beta        | $f_X(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}$, $0 \le x \le 1$ | N/A
Binomial    | N/A | $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$, $k = 0, \dots, n$; $n$: independent trials
Poisson     | N/A | $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, $k = 0, 1, \dots$

The parameters in the PDF forms in Table 1 represent the modes as well as resolution measures of the functions and will be addressed later on.
It is important to remember the context of the definitions of the CDF and the PDF and the
brilliance of the statisticians, physicists and mathematicians who, over the years, developed these
tools that enable proper description and characterization of statistical experiments. The random
variable concept has been the catalyst for mapping the statistical experiment from a guessing
game into a functional representation that is governed by the rules of calculus.
4.3.3 The Conditional Distribution
The CDF and the PDF may be defined for conditional events. For example, we may study the behavior of an event $A$ defined such that $A = \{X \le x\}$ with respect to another event $B$, which may or may not be defined on the same sample space $\Omega$. We can consider the conditional CDF as a measure of the event $\{X \le x\} \cap B$, in the same way we defined the conditional probability before:

$F_X(x \mid B) = P(X \le x \mid B) = \dfrac{P(\{X \le x\} \cap B)}{P(B)}, \quad P(B) \neq 0.$

We point out that $\{X \le x\}$ means that $-\infty < X \le x$, and the event $B$ may or may not be a subset of the same sample space $\Omega$. We maintain that the conditional CDF possesses all the properties of the CDF as discussed before. In particular, it is straightforward to justify the
following properties: $F_X(-\infty \mid B) = 0$, $F_X(\infty \mid B) = 1$, and $P(a < X \le b \mid B) = F_X(b \mid B) - F_X(a \mid B)$ for $a < b$.

Indeed, the total probability theorem that we studied before may be expressed as follows: consider a partition $A_1, A_2, \dots, A_n$ of a sample space with known a priori measures $P(A_i)$. Any event may be defined in terms of the partition, and the CDF for the random variable $X$ will be defined as follows:

$F_X(x) = \sum_{i=1}^{n} F_X(x \mid A_i)\, P(A_i).$

In the same fashion as we studied the probability density function (PDF) before, we can define the conditional PDF as well; that is, for continuous random variables we have:

$f_X(x \mid B) = \dfrac{d F_X(x \mid B)}{dx}.$

Example 9: Consider two events $A$ and $B$ defined in terms of the random variable $X$. Applying the definition of the conditional CDF, we can compute $F_X(x \mid B)$; by taking derivatives, we obtain the conditional PDF $f_X(x \mid B)$.

Example 10: Consider the events $A$ and $B$. We can show that the conditional CDF takes a piecewise form.
The conditional PDF follows by differentiation.

The conditional distribution is an important mechanism to measure the effect of a stimulus on a certain behavior, or the degree of dependency of events on each other. Later on, we will generalize the concept of the random variable to describe experiments that create an indexed set of random variables, where the index is defined on the set of real numbers or on a graph; we will exploit the dependency of the random variables to perform clustering of classes into meaningful descriptions, and we will be able to study various control mechanisms of experiments that can only be studied over a long period of observations.
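As a concrete sketch of the conditional CDF, take the conditioning event $B = \{X \le a\}$; the definition then gives $F_X(x \mid B) = F_X(\min(x, a)) / F_X(a)$. The exponential CDF below is an arbitrary illustrative choice, not an example from the text:

```python
import math

def F(x, lam=1.0):
    """CDF of an exponential random variable (an arbitrary illustrative choice)."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def F_cond(x, a, lam=1.0):
    """Conditional CDF given B = {X <= a}: P(X <= x | X <= a) = F(min(x, a)) / F(a)."""
    return F(min(x, a), lam) / F(a, lam)
```

Note that conditioning on $\{X \le a\}$ concentrates all the mass below $a$, so $F_X(x \mid B) \ge F_X(x)$ there and $F_X(a \mid B) = 1$.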
4.3.4 Statistical Expectation
The CDF and the PDF in Table 1 are described in terms of parameters that quantify the basic
behavior of these functions, such as their mean value and the dispersion (variance) around that
mean value. Over the years of statistical analysis, statisticians have developed metrics that can be
computed from the CDF or the PDF which provide important clues about the behavior of a
statistical experiment. In fact, some of these metrics may provide an approximation or upper
bounds of the probabilities that a certain event may take without actual calculation of the CDF
and the PDF; thus a prediction may be possible for the behavior of events based on numerical
calculations. Among these metrics is the statistical expectation which will be formally defined
below.
Definition 5: Given a random variable $X$ with PDF $f_X(x)$, its expected value is defined by

$E[X] = \begin{cases} \int_{-\infty}^{\infty} x\, f_X(x)\, dx, & X \text{ continuous} \\ \sum_i x_i\, P(X = x_i), & X \text{ discrete.} \end{cases}$
The expectation as defined above for a single random variable is a numerical indication of how symmetric the PDF is around the origin. Due to the positivity constraint of the PDF, a zero expected value connotes symmetry, while a positive value indicates that the PDF is tilted towards the positive (right-hand) side of the origin, and a negative value indicates that the PDF is tilted to the left of the origin. Hence, in reality the quantity $E[X]$ provides the mean value of the PDF, or its locus of symmetry on the $x$-axis. Therefore, $E[X] = \mu$ is the mean value or the average of the random variable $X$.

In the strict mathematical sense, we can define the effect of a general function of the random variable on the PDF. For example, we may define the expected value of a function $g(X)$ as follows:

Definition 6: Given a random variable $X$ with PDF $f_X(x)$, suppose we define a mapping $Y = g(X)$, where $g: \mathbb{R} \to \mathbb{R}$; i.e., the domain and range of $g$ are similar to those of $X$. The expected value of $g(X)$ is defined by

$E[g(X)] = \begin{cases} \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx, & X \text{ continuous} \\ \sum_i g(x_i)\, P(X = x_i), & X \text{ discrete.} \end{cases}$
In this case, the mean is a special case of $E[g(X)]$ with $g(x) = x$. The expectation of $g(X)$ measures the average of the function. Indeed, we shall study the fact that a function of a random variable is also a measurable function, which defines a random variable as well. Hence, we may calculate $E[g(X)]$ as above, using the PDF of the random variable $X$, or first calculate the PDF of $Y = g(X)$ and then calculate $E[Y]$ as follows:

$E[Y] = \begin{cases} \int_{-\infty}^{\infty} y\, f_Y(y)\, dy, & Y \text{ continuous} \\ \sum_j y_j\, P(Y = y_j), & Y \text{ discrete.} \end{cases}$
The above result is very fascinating, even though we have not formally derived it, and indeed it has enormous practical implications in statistical analysis; this is due to the fact that calculating the PDF $f_Y(y)$ is not a simple matter, even for simple transformations. Therefore, the PDF of a random variable $X$ serves a great deal for functions of $X$ as well. We focus on a few definitions which are special cases of $E[g(X)]$ and provide some specific clues about the PDF of $X$.

Definition 7: Given a random variable $X$ with PDF $f_X(x)$, the variance of $X$ is defined as:

$\sigma^2 = E[(X - \mu)^2] = \begin{cases} \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\, dx, & X \text{ continuous} \\ \sum_i (x_i - \mu)^2 P(X = x_i), & X \text{ discrete.} \end{cases}$
The variance measures the dispersion (width or concentration) of the PDF around the mean value. It serves to estimate the domain of the values that the random variable $X$ may take. In fact, an upper bound has been derived, through the Chebyshev inequality, which estimates the upper bound of the probability of a normalized event.

Fact (the Chebyshev inequality [Papoulis]): Consider a random variable $X$ with PDF $f_X(x)$ having a mean value $\mu$ and variance $\sigma^2$. Define the normalized event $\{|X - \mu| \ge k\sigma\}$. We can show that:

$P(|X - \mu| \ge k\sigma) \le \dfrac{1}{k^2}.$

The proof of the Chebyshev inequality follows directly from the definitions and the positivity of the density function. Its importance is in providing a measure for the concentration of the values of the random variable around its mean value.
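The Chebyshev bound can be checked empirically; the Gaussian test distribution below is an arbitrary choice, and the empirical tail fraction should fall at or below $1/k^2$:

```python
import random

def chebyshev_check(k=2.0, n=100000, seed=1):
    """Empirical P(|X - mu| >= k * sigma) for a standard Gaussian sample
    (an arbitrary test distribution); Chebyshev bounds this by 1 / k**2."""
    rng = random.Random(seed)
    tail = sum(1 for _ in range(n) if abs(rng.gauss(0.0, 1.0)) >= k)
    return tail / n

p_tail = chebyshev_check()
```

For a Gaussian the true two-sided tail at $k = 2$ is far smaller than the bound $1/4$, which illustrates that Chebyshev is distribution-free but loose.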
Definition 8: Given a random variable $X$ with PDF $f_X(x)$, the $n$-th moment of $X$ around the origin is defined as:

$m_n = E[X^n] = \begin{cases} \int_{-\infty}^{\infty} x^n f_X(x)\, dx, & X \text{ continuous} \\ \sum_i x_i^n\, P(X = x_i), & X \text{ discrete.} \end{cases}$
The $n$-th central moment $\mu_n = E[(X - \mu)^n]$ is obtained in a similar fashion by the proper substitution in the above equation. We now provide a few examples to illustrate how the expectations may be calculated for some of the PDFs in Table 1.
Example 11: Evaluate the mean and the variance of the Poisson distribution. This distribution is that of a counting random variable which estimates the instances of occurrence of a particular event, within a certain period, given its frequency of occurrence.

$E[X] = \sum_{k=0}^{\infty} k\, \frac{\lambda^k e^{-\lambda}}{k!} = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda.$

Similarly,

$E[X(X-1)] = \sum_{k=0}^{\infty} k(k-1)\, \frac{\lambda^k e^{-\lambda}}{k!} = \lambda^2.$

On the other hand, we note that

$\sigma^2 = E[X^2] - (E[X])^2 = E[X(X-1)] + E[X] - \lambda^2.$

Hence, it is immediate that $\sigma^2 = \lambda$ for the Poisson distribution.

Example 12: Evaluate the mean and the variance of the Cauchy distribution.
The mean value is zero, since $f_X(x)$ is an even symmetric function. The difficulty arrives with the variance, since the second-order expectation $E[X^2]$ is infinite. Hence, the variance is undefined. Yet, the parameter $\alpha$ measures the relationship between the maximum height and the span (width) of the PDF.

The knowledge of the PDF (or the CDF) enables computing the expectations. It is a natural question to ask whether the knowledge of the expectations enables the determination of the functional form of the PDF. Unfortunately, it is not possible to infer the PDF from the expectations (moments), except in the case of the Gaussian PDF, which is explicitly defined in terms of its mean and variance. Yet, statisticians have studied mechanisms to generate the moments automatically through the so-called moment generating function, which is defined below.
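Before turning to the moment generating function, the Poisson result of Example 11 (mean = variance = $\lambda$) can be confirmed by simulation; the sampler uses Knuth's product-of-uniforms method, an implementation choice not taken from the text:

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's method: multiply uniforms until the product drops below exp(-lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def poisson_moments(lam=4.0, n=100000, seed=7):
    """Sample mean and variance of n Poisson(lam) draws."""
    rng = random.Random(seed)
    xs = [poisson_sample(lam, rng) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var

mean, var = poisson_moments()
```

Both sample statistics should land close to $\lambda = 4$, in agreement with the derivation above.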
Definition 9: Given a random variable $X$ with PDF $f_X(x)$, the moment generating function of $X$ is defined as

$M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx.$

The moments may be generated from $M_X(t)$ as follows:

$m_n = E[X^n] = \left.\dfrac{d^n M_X(t)}{dt^n}\right|_{t=0}.$

Example 13: Evaluate the moment generating function for the standard uniform PDF defined over the interval $(0, 1)$:

$M_X(t) = \int_0^1 e^{tx}\, dx = \dfrac{e^t - 1}{t}.$

It is evident that the mean value, for example, may be obtained from the first derivative of $M_X(t)$ by taking the limit as $t$ approaches zero (note the need for l'Hôpital's rule).
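Example 13 can be verified numerically: the MGF of the standard uniform is $(e^t - 1)/t$, and the mean $E[X] = 1/2$ emerges as the first derivative at $t = 0$, here approximated by a central difference:

```python
import math

def mgf_uniform(t):
    """Moment generating function of the standard uniform on (0, 1): (e^t - 1) / t."""
    if abs(t) < 1e-12:
        return 1.0  # the limit as t -> 0
    return (math.exp(t) - 1.0) / t

# The mean is the first derivative of the MGF at t = 0 (central difference).
h = 1e-5
mean_estimate = (mgf_uniform(h) - mgf_uniform(-h)) / (2 * h)
```

The central difference sidesteps the removable singularity at $t = 0$ that otherwise calls for l'Hôpital's rule.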
One form of expectation that mimics the moment generating function, yet provides a greater functional ability to analyze, and indeed construct, the PDF of a random variable, is known as the characteristic function, defined as follows:

Definition 10: Given a random variable $X$ with PDF $f_X(x)$, the characteristic function of $X$ is defined as

$\Phi_X(\omega) = E[e^{j\omega X}] = \int_{-\infty}^{\infty} e^{j\omega x} f_X(x)\, dx.$

$\Phi_X(\omega)$ is the conjugate of the Fourier transform of $f_X(x)$, which is a unique mapping of the function; thus a direct link is established between the PDF and the characteristic function, which is very useful in a number of applications.
Example 14: Evaluate the characteristic function for the standard uniform PDF defined over the interval $(0, 1)$:

$\Phi_X(\omega) = \int_0^1 e^{j\omega x}\, dx = \dfrac{e^{j\omega} - 1}{j\omega}.$
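The closed form in Example 14 can be checked against the defining integral; the midpoint rule below is just a numerical sketch:

```python
import cmath

def charfn_uniform(w):
    """Characteristic function of the standard uniform on (0, 1): E[exp(j*w*X)]."""
    if abs(w) < 1e-12:
        return 1.0 + 0.0j
    return (cmath.exp(1j * w) - 1.0) / (1j * w)

def charfn_numeric(w, n=2000):
    """Midpoint-rule approximation of the defining integral over (0, 1)."""
    h = 1.0 / n
    return sum(cmath.exp(1j * w * (k + 0.5) * h) * h for k in range(n))
```

As for any characteristic function, $\Phi_X(0) = 1$ and $|\Phi_X(\omega)| \le 1$.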
4.3.5 Functions of Random Variable
It is natural to ask how a transformation of a random variable (which is itself a function) should be considered, in the context of the statistical experiment in which the random variable was introduced, in order to characterize it in terms of the theory of calculus instead of enumeration and guessing. The key issue is the measurability of functions of random variables; i.e., whether the mapping is actually constrained by the original domain (and, of course, the range) of the original random variable. A random variable $X$ is a mapping from the sigma algebra into the real line such that $\{X \le x\}$ is an event, and we also insisted that the inverse mapping is an event as well; i.e., $X^{-1}(B) \in \mathcal{F}$ for every Borel set $B$. Now let us define a function $Y = g(X)$. It is evident that the transformation
will generate events on the real line that must be Borel sets. Likewise, the inverse mapping will generate events that must be in $\mathcal{F}$ and $\Omega$, by construction. Hence, the measurability of $g(X)$ is quite evident and inherent. Fig. 4.4 illustrates this concept, where the mappings of $X$ and $g$ onto the real line produce Borel sets, which may be remapped back to events on the sigma algebra.

Fig. 4.4: Illustration of the measurability of functions of random variables.

Theorem [Piper]: If $X$ is a random variable and $g$ is a Borel function, then $Y = g(X)$ is a random variable measurable by $X$; i.e., the $\sigma$-algebra generated by $Y$ is a subset of the $\sigma$-algebra generated by $X$.

As we have done before with random variables, we characterize $Y = g(X)$ by its cumulative probability distribution function or by its probability density function. Depending on the functional form of $g$, we may obtain the CDF $F_Y(y)$ in terms of $F_X(x)$. Table 2 lists a few examples of the CDFs and PDFs of functions of random variables [Papoulis].
Table 2: The CDF and PDF of $Y = g(X)$ for various functional forms of $g$.

1. $g$ monotone increasing: $F_Y(y) = F_X(g^{-1}(y))$; $f_Y(y) = \dfrac{f_X(g^{-1}(y))}{g'(g^{-1}(y))}$.
2. $g$ monotone decreasing: $F_Y(y) = 1 - F_X(g^{-1}(y))$; $f_Y(y) = \dfrac{f_X(g^{-1}(y))}{|g'(g^{-1}(y))|}$.
3. $g$ continuous with a countable number of real roots $x_n$ of the equation $y = g(x)$: $f_Y(y) = \sum_n \dfrac{f_X(x_n)}{|g'(x_n)|}$; the CDF follows by integrating $f_Y$.
Derivations of the entries in Table 2 are straightforward and follow from the definition of the CDF, which should be obtained first; the derivative operator is then applied to obtain the PDF.
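The monotone-increasing rule in Table 2 can be checked on a concrete case (our choice, not the text's): with $X \sim U(0,1)$ and $Y = g(X) = -\ln(1 - X)$, the inverse is $g^{-1}(y) = 1 - e^{-y}$, so $F_Y(y) = F_X(g^{-1}(y)) = 1 - e^{-y}$ and $f_Y(y) = e^{-y}$, i.e., $Y$ is exponential with rate 1.

```python
import math

# Table 2, monotone-increasing entry, applied to X ~ U(0,1), Y = -ln(1 - X).
# F_X(u) = u on [0, 1], g^{-1}(y) = 1 - exp(-y).  Function names are ours.

def cdf_Y(y):
    """F_Y(y) = F_X(g^{-1}(y)) = 1 - exp(-y)."""
    return 1.0 - math.exp(-y)

def pdf_Y(y):
    """f_Y(y) = f_X(g^{-1}(y)) * |d g^{-1}/dy| = 1 * exp(-y)."""
    return math.exp(-y)

# The PDF must integrate to 1 (integrate numerically out to y = 40) ...
total = sum(pdf_Y(k * 1e-3) * 1e-3 for k in range(40000))
assert abs(total - 1.0) < 1e-2
# ... and must be the derivative of the CDF (central difference at y = 1).
h = 1e-6
assert abs((cdf_Y(1.0 + h) - cdf_Y(1.0 - h)) / (2 * h) - pdf_Y(1.0)) < 1e-6
print("Table 2 monotone rule verified")
```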
Functions of random variables are an important concept in mathematical statistics that enables statistical inference about one random variable (or, in general, a random process, as will be studied later) in terms of another. Indeed, simulations of random variables from different distributions may be obtained by simulating a random variable from a standard uniform distribution $U(0, 1)$; this form has been studied extensively, and various random-number generators exist to generate samples from it. We will study simulation of random variables after we briefly cover the case of random experiments that may be described by two random variables, a step toward generalization to sequences of unstructured random
variables, which will be followed by random variables defined on a graph or having a particular meaningful index. As our intention is to study random processes and random fields that have been useful for biomedical image analysis, we will not dwell on the details of random variables and statistical inference beyond the basic background needed for our subsequent coverage.
4.4 Two Random Variables
Events in random experiments may take various forms and may be better understood in terms of higher-dimensional mappings instead of one-dimensional ones. In this case the measurability of the events would be in terms of the mass of events (Borel sets) on multidimensional spaces. Consider two random variables $X$ and $Y$; the events $\{X \le x\}$ and $\{Y \le y\}$ may be paired with respect to the two parameters $x$ and $y$ such that a new event $\{X \le x, Y \le y\}$ may be defined on the space $\mathbb{R}^2$, as shown in Fig. 4.5. We may express the pair with respect to an outcome $\omega$ as $(X(\omega), Y(\omega))$, or as a vector $\mathbf{X} = [X \; Y]^T$. We will be interested mainly in random variables defined on the same sample space; hence, for simplicity, we will write $X$ and $Y$.
Fig. 4.5: Illustration of the concept of two-dimensional random vectors for two random variables on the same sample space. For measurability, the sets involved are Borel sets.
In direct analogy to the single-random-variable case, we may determine the probability mass distribution on the Borel sets defined on the plane $\mathbb{R}^2$ by the assignment
$P_{XY}(B) = P(\{\omega : (X(\omega), Y(\omega)) \in B\})$;
hence, when dealing with a pair of random variables $(X, Y)$, we have three induced probability measures: $P_{XY}$, $P_X$, and $P_Y$. The mass distribution on $\mathbb{R}^2$ represented by $P_{XY}$ consists of the unit mass distributed over the plane. Similarly, $P_X$ represents a unit mass distributed along the $x$-axis, and $P_Y$ represents a unit mass distributed along the $y$-axis. The marginal measures are related to $P_{XY}$ as follows:
$P_X(A) = P_{XY}(A \times \mathbb{R}), \qquad P_Y(B) = P_{XY}(\mathbb{R} \times B).$
Now if $A = \mathbb{R}$, i.e., the entire real line, then $P_X(\mathbb{R}) = P_{XY}(\mathbb{R}^2)$, so that $P_X(\mathbb{R}) = 1$. Similarly, if $B = \mathbb{R}$, then $P_Y(\mathbb{R}) = P_{XY}(\mathbb{R}^2)$, so that $P_Y(\mathbb{R}) = 1$. We formally define these quantities below.
Definition 11: The probability measure $P_{XY}$ defined on the Borel sets in the plane is called the joint probability measure induced by the joint mapping $(X, Y)$. The probability mass distribution is called the joint distribution. The probability measures $P_X$ and $P_Y$ are called the marginal probability measures induced by the random variables $X$ and $Y$, respectively. The corresponding probability mass distributions are called the marginal probability distributions.
Definition 12: The joint probability distribution function, or the joint cumulative distribution function, for random variables $X$ and $Y$, denoted by $F_{XY}(x, y)$, is defined by
$F_{XY}(x, y) = P(X \le x, Y \le y)$
for all values $(x, y) \in \mathbb{R}^2$. The special cases $F_X(x) = F_{XY}(x, \infty)$ and $F_Y(y) = F_{XY}(\infty, y)$ are called the marginal distribution functions for $X$ and $Y$, respectively. We follow the same development as for the single-random-variable case and define the probability density function for two random variables.
Fact (Feller, Vol. 1): If the joint probability measure $P_{XY}$ induced by $(X, Y)$ is absolutely continuous, a function $f_{XY}(x, y)$ exists such that
$F_{XY}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u, v)\, dv\, du;$
$f_{XY}$ is called the joint probability density function of $(X, Y)$. Conversely, given $F_{XY}$ and $f_{XY}$ that satisfy the previous properties, there exist two random variables defined on some sample space with joint probability distribution and density functions $F_{XY}$ and $f_{XY}$, respectively. We may list the main properties of $f_{XY}$ in the same manner as we did with single random variables; specifically, we note that:
i. $f_{XY}(x, y) \ge 0$;
ii. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx\, dy = 1$;
iii. $P(x_1 < X \le x_2,\; y_1 < Y \le y_2) = \int_{x_1}^{x_2} \int_{y_1}^{y_2} f_{XY}(x, y)\, dy\, dx$;
iv. $f_{XY}(x, y) = \partial^2 F_{XY}(x, y) / \partial x\, \partial y$.
Example 15: Consider two jointly continuous and positive random variables $X$ and $Y$ with joint density function $f_{XY}(x, y)$. Find the marginal densities and the joint CDF.
Solution: The required quantities may be obtained by integrating $f_{XY}$ over the specific region; see Fig. 4.6.
Fig. 4.6: Illustration of the region of integration for the marginal density $f_X(x)$.
The cumulative distribution function will be computed from the basic definition:
which, given $F_{XY}(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f_{XY}(u, v)\, dv\, du$, leads to the required distribution.
We may address the issue of dependence of random variables here due to its importance. We maintain, from Definition 2 and Example 9, that events are independent if their joint probabilities factor; that is, two random variables $X$ and $Y$ are independent if
$F_{XY}(x, y) = F_X(x)\, F_Y(y)$, or equivalently $f_{XY}(x, y) = f_X(x)\, f_Y(y)$.
Note that in the above example, $f_{XY}$ has a separable functional form, but the two random variables are not independent due to the restrictions on the support. The independence concept will come in handy in various applications, including simulations of random variables.
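The point that a separable functional form does not by itself imply independence can be illustrated numerically. The density below, $f(x, y) = 8xy$ on $0 < x < y < 1$, is our hypothetical stand-in (not the density of Example 15): its formula factors as a function of $x$ times a function of $y$, yet the triangular support makes $X$ and $Y$ dependent.

```python
# Hypothetical density with separable form but triangular support:
# f(x, y) = 8xy on 0 < x < y < 1.  Exact marginals: f_X(x) = 4x(1 - x^2),
# f_Y(y) = 4y^3.  The joint does not equal the product of the marginals.

def f_joint(x, y):
    return 8.0 * x * y if 0.0 < x < y < 1.0 else 0.0

def f_X(x):
    """Integral of 8xy over y in (x, 1)."""
    return 4.0 * x * (1.0 - x * x)

def f_Y(y):
    """Integral of 8xy over x in (0, y)."""
    return 4.0 * y ** 3

# Total joint mass is 1 (midpoint rule on a 400 x 400 grid; small error
# comes from cells straddling the diagonal of the support) ...
n = 400
h = 1.0 / n
mass = sum(f_joint((i + 0.5) * h, (j + 0.5) * h)
           for i in range(n) for j in range(n)) * h * h
assert abs(mass - 1.0) < 0.01
# ... but the joint does NOT factor into the product of the marginals:
x0, y0 = 0.5, 0.8
assert abs(f_joint(x0, y0) - f_X(x0) * f_Y(y0)) > 0.1   # 3.2 vs ~3.072
print("separable form, dependent support: verified")
```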
4.4.1 Statistical Expectation in Two Dimensions
Expectations in two dimensions follow in the same manner as the one-dimensional case (see Definitions 5 and 6). In particular, the expected value of a function $g(X, Y)$ of the random variables is defined as follows.
Definition 13: Given a pair of random variables $(X, Y)$, suppose we define a mapping $g: \mathbb{R}^2 \to \mathbb{R}$; i.e., the domain and range of $g$ are as in the one-dimensional case. The expected value of $g(X, Y)$ is defined by
$E[g(X, Y)] = \begin{cases} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)\, f_{XY}(x, y)\, dx\, dy, & \text{continuous case} \\ \sum_i \sum_j g(x_i, y_j)\, p_{XY}(x_i, y_j), & \text{discrete case} \end{cases}$
where $f_{XY}$ is the joint probability density function for continuous random variables and $p_{XY}$ is the joint probability mass function for discrete random variables.
Definition 14: Given a pair of random variables $(X, Y)$, the covariance function is defined as:
$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
where $\mu_X$ and $\mu_Y$ are the marginal means as defined in Definition 5. That is, for any random variable $X$, we have:
$\mu_X = E[X] = \begin{cases} \int_{-\infty}^{\infty} x\, f_X(x)\, dx, & \text{continuous case} \\ \sum_i x_i\, p_X(x_i), & \text{discrete case} \end{cases}$
where $f_X$ is the probability density function for a continuous random variable and $p_X$ is the probability mass function for a discrete random variable $X$. A special case of Definition 14 that does not include the marginal means is called the correlation; i.e., $R_{XY} = E[XY]$. We note that all the properties of the expectation operator in higher dimensions are similar to those in one dimension.
In two dimensions (multi-dimensions, in general), the random variables may possess various degrees of relatedness, which may span the entire scale from fully related to completely unrelated, or independent. We define the relatedness in terms of the correlation coefficient; i.e.,
$\rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}, \qquad -1 \le \rho_{XY} \le 1,$
where $\sigma_X$ is the standard deviation of random variable $X$. If the covariance is zero, then the two random variables are said to be uncorrelated. If the two random variables are independent, then their covariance becomes zero, and therefore, by definition, the two random variables become uncorrelated. We note that the converse is not true, except in the case of Gaussian random variables, which will be considered later.
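The classic counterexample to the converse can be checked numerically (the example $X$ uniform on $(-1, 1)$, $Y = X^2$ is ours, not the text's): $\mathrm{Cov}(X, Y) = E[X^3] - E[X]E[X^2] = 0$ by symmetry, yet $Y$ is a deterministic function of $X$.

```python
# Uncorrelated does not imply independent: X ~ U(-1, 1), Y = X^2.
# All expectations are computed deterministically on a midpoint grid.

n = 200001
xs = [-1.0 + 2.0 * (k + 0.5) / n for k in range(n)]   # midpoint grid on (-1, 1)

EX  = sum(xs) / n                          # E[X]  = 0 by symmetry
EY  = sum(x * x for x in xs) / n           # E[X^2] = 1/3
EXY = sum(x ** 3 for x in xs) / n          # E[X^3] = 0 by symmetry
cov = EXY - EX * EY

assert abs(cov) < 1e-9                     # uncorrelated
assert abs(EY - 1.0 / 3.0) < 1e-6
# Dependence: the joint event {X <= -1/2, Y <= 1/4} has probability ~0,
# while the product of the marginal probabilities is 0.25 * 0.5 = 0.125.
p_joint = sum(1 for x in xs if x <= -0.5 and x * x <= 0.25) / n
p_x = sum(1 for x in xs if x <= -0.5) / n
p_y = sum(1 for x in xs if x * x <= 0.25) / n
assert abs(p_joint - p_x * p_y) > 0.1      # joint != product of marginals
print("uncorrelated but dependent: verified")
```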
4.4.2 Functions of Two Random Variables
It is natural to study mappings (transformations) of two random variables as we did with single random variables. Here we consider the transformation $Z = \mathbf{g}(X, Y)$, where $\mathbf{g}$ is a general function. We use the boldface letter $\mathbf{g}$ to represent the functional form, which may be a vector. A given realization (a certain value, or instant) of $Z$ will be represented by the lowercase letter $z$. The key issue is the measurability of $Z$, which may be understood in a similar fashion as in the one-dimensional case; see Fig. 4.4. We maintain that under normal conditions $Z$ is measurable and, therefore, we may define a complete functional description for it in terms of the probability distribution function $F_Z(z)$. We consider a few cases below.
Example 16: Consider two jointly continuous random variables $X$ and $Y$ with probability density function $f_{XY}(x, y)$. Define the summation function $Z = X + Y$. We maintain that $Z$ is a measurable random variable whose CDF may be calculated as follows:
$F_Z(z) = P(X + Y \le z) = \int_{-\infty}^{\infty} \int_{-\infty}^{z - x} f_{XY}(x, y)\, dy\, dx$
and the density function would be evaluated as follows:
$f_Z(z) = \dfrac{dF_Z(z)}{dz} = \dfrac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{z - x} f_{XY}(x, y)\, dy\, dx.$
The inner differentiation may be evaluated using the Leibnitz rule; hence,
$f_Z(z) = \int_{-\infty}^{\infty} f_{XY}(x, z - x)\, dx.$
As a special case, if $X$ and $Y$ are independent, then the joint density becomes separable, and $f_Z(z)$ will have the following form:
$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx,$
which is the convolution integral. For this case, the CDF would be $F_Z(z) = \int_{-\infty}^{z} f_Z(t)\, dt$. For illustration, let us consider the simple case where both $X$ and $Y$ are independent and have the standard uniform distribution. Let $Z = X + Y$. It is straightforward to show that
$f_Z(z) = \begin{cases} z, & 0 \le z \le 1 \\ 2 - z, & 1 < z \le 2 \\ 0, & \text{otherwise.} \end{cases}$
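The triangular form of this density can be confirmed by evaluating the convolution integral numerically (a minimal sketch; helper names are ours):

```python
# Convolution check for Z = X + Y with X, Y independent U(0, 1):
# f_Z(z) = z on [0, 1], 2 - z on (1, 2], 0 elsewhere (triangular density).

def f_U(t):
    """Standard uniform density."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

def f_Z_conv(z, n=20000):
    """Midpoint-rule evaluation of the convolution integral over x in [0, 1]."""
    h = 1.0 / n
    return sum(f_U((k + 0.5) * h) * f_U(z - (k + 0.5) * h) for k in range(n)) * h

def f_Z_exact(z):
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0

for z in (0.25, 0.5, 1.0, 1.5, 1.75):
    assert abs(f_Z_conv(z) - f_Z_exact(z)) < 1e-3
print("triangular density verified")
```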
Table 3 lists a few examples of the distributions of functions of two random variables; all can be derived from the basic definitions. These functions map two random variables into a single random variable; i.e., there is a reduction in dimensionality involved in the transformation. We will consider the case where the functions of random variables are vectors next.
Table 3: Some one-dimensional functions of two random variables and the corresponding CDF and PDF.
4.4.3 Two Functions of Two Random Variables
As a step toward a generalized representation of functions of random variables, let us consider vector representations of the random variables $X$ and $Y$; i.e., $\mathbf{X} = [X \; Y]^T$, and a functional form $\mathbf{Z} = \mathbf{g}(\mathbf{X})$. The treatment for this vector representation will be quite straightforward in the case where $\mathbf{g}$ is a linear mapping of $\mathbf{X}$; i.e., $\mathbf{Z} = A\mathbf{X}$ for some constant matrix $A$. If the transformation is invertible, $A^{-1}$ exists; hence, $\mathbf{X} = A^{-1}\mathbf{Z}$. The
Jacobian of the transformation is defined as the matrix of partial derivatives
$J = \dfrac{\partial(z_1, z_2)}{\partial(x_1, x_2)} = \begin{bmatrix} \partial z_1/\partial x_1 & \partial z_1/\partial x_2 \\ \partial z_2/\partial x_1 & \partial z_2/\partial x_2 \end{bmatrix}.$
Then the joint probability density function of $\mathbf{Z}$ may be written in terms of the joint density of the random variables $(X, Y)$ as
$f_{Z_1 Z_2}(z_1, z_2) = \dfrac{f_{XY}(x_1, x_2)}{|J|},$
evaluated at the inverse mapping, where $|\cdot|$ stands for the determinant.
Example 17: Consider two jointly continuous random variables $X$ and $Y$ with probability density function $f_{XY}(x, y)$. Consider the polar coordinates $r = \sqrt{x^2 + y^2}$, $\theta = \tan^{-1}(y/x)$, so that $x = r\cos\theta$ and $y = r\sin\theta$. We may define the Jacobian of the polar transformation as follows:
$J = \dfrac{\partial(r, \theta)}{\partial(x, y)} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta/r & \cos\theta/r \end{bmatrix}, \qquad |J| = \dfrac{1}{r}.$
Hence,
$f_{R\Theta}(r, \theta) = \dfrac{f_{XY}(r\cos\theta, r\sin\theta)}{|J|} = r\, f_{XY}(r\cos\theta, r\sin\theta).$
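The polar-coordinate result can be sanity-checked on a concrete joint density. Assuming two independent standard Gaussians (our choice of $f_{XY}$, not the text's), marginalizing the angle out of $f_{R\Theta}(r, \theta) = r\, f_{XY}(r\cos\theta, r\sin\theta)$ must produce the Rayleigh density $f_R(r) = r\, e^{-r^2/2}$.

```python
import math

# Jacobian formula applied to two independent N(0, 1) variables:
# f_{R,Theta}(r, t) = r * f_{XY}(r cos t, r sin t); integrating over the
# angle gives the Rayleigh density f_R(r) = r * exp(-r^2 / 2).

def f_xy(x, y):
    """Joint density of two independent standard Gaussians."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def f_r_theta(r, t):
    """Transformed density in polar coordinates; |J|^{-1} = r."""
    return r * f_xy(r * math.cos(t), r * math.sin(t))

def f_r_numeric(r, n=2000):
    """Marginalize the angle over [0, 2*pi) with a midpoint sum."""
    h = 2.0 * math.pi / n
    return sum(f_r_theta(r, (k + 0.5) * h) for k in range(n)) * h

for r in (0.5, 1.0, 2.0):
    rayleigh = r * math.exp(-r * r / 2.0)
    assert abs(f_r_numeric(r) - rayleigh) < 1e-9
print("polar transformation verified")
```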
4.5 Simulation of Random Variables
In this section we discuss how to simulate one- and two-dimensional random variables of general distributions, starting from the standard uniform distribution. This will bring about the linkage between a distribution or density function and deterministic functional forms, which represent instants, or realizations, of the random variables. Generating such instants may be used in a reverse fashion to estimate the density function using histograms. Simulation is an art and a well-studied subject in statistical design; stochastic systems are studied based on simulations. We only scratch the surface here to give some feeling for the outputs of random experiments in the form of realizations of the random variables that map the sample space into Borel sets on the real line. The fundamental step in simulation is the simple observation that, since the distribution function is monotone and non-decreasing, an inverse may be possible; hence
a mapping from the uniform distribution to any other form is possible. We illustrate this as follows. Let $X$ and $U$ be two random variables, not necessarily on the same sample space or having the same form of distribution. Suppose the two random variables are related to each other through the monotone transformation $X = g(U)$. The distribution function of $X$ with respect to the distribution of $U$ is as follows:
$F_X(x) = P(X \le x) = P(g(U) \le x) = P(U \le g^{-1}(x)) = g^{-1}(x),$
if $U$ is uniformly distributed on $[0, 1]$. Hence, we may take $g^{-1} = F_X$; i.e., $X = F_X^{-1}(U)$. Therefore, if we generate a realization $u$ from $U$, a realization of $X$ is obtained by the inverse mapping $x = F_X^{-1}(u)$.
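The inverse-mapping recipe can be sketched for a concrete target distribution (the exponential is our choice): with $F_X(x) = 1 - e^{-\lambda x}$, the inverse is $F_X^{-1}(u) = -\ln(1 - u)/\lambda$, so feeding uniform realizations through $F_X^{-1}$ yields exponential realizations.

```python
import math
import random

# Inverse-transform sampling for the exponential distribution:
# F(x) = 1 - exp(-lambda * x)  =>  F^{-1}(u) = -ln(1 - u) / lambda.

def sample_exponential(lam, n, rng):
    """Map n uniform realizations through the inverse CDF."""
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

rng = random.Random(0)                 # fixed seed for reproducibility
lam = 2.0
xs = sample_exponential(lam, 200000, rng)

mean = sum(xs) / len(xs)
assert abs(mean - 1.0 / lam) < 0.01    # E[X] = 1/lambda = 0.5
# Empirical CDF at x = 0.5 should be close to F(0.5) = 1 - e^{-1}.
emp = sum(1 for x in xs if x <= 0.5) / len(xs)
assert abs(emp - (1.0 - math.exp(-1.0))) < 0.01
print("inverse-transform sampling verified")
```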
4.6 Summary
This chapter introduced the basic concepts of statistical modeling, which serve as building blocks for the material in the book. We defined the concept of a random variable from an operational point of view and illustrated its significance in describing statistical experiments. Characteristics of a single random variable have been studied, including the probability distribution and density functions and the concept of statistical expectation. Extensions to functions of a single random variable and the joint distribution of two random variables were considered. The chapter concluded with the basics of simulation of one- and two-dimensional random variables.
4.7 Computer Laboratory
A random variable with a standard uniform distribution $U(0, 1)$ can be easily converted to any other distribution, such as the normal distribution. The methodology of generating random numbers has several forms; however, the one used in the great majority of cases is the linear congruential generator (LCG), whose recursive formula is given by (A4.1), where $x_n$ is a sequence of integer values ranging from 0 to $m - 1$, starting from a seed point $x_0$:
$x_{n+1} = (a x_n + c) \bmod m \qquad \text{(A4.1)}$
where $m$ is the modulus, $a$ is the multiplier, $c$ is the increment, and all are non-negative values. In order to obtain numbers in the interval $(0, 1)$, the generated integer sequence should be divided by $m$ to obtain $u_n = x_n / m$. There are many variations of LCGs; the form used in this project is the multiplicative LCG, where the addition of $c$ is not needed, so the recursive formula of the generator is given by
$x_{n+1} = (a x_n) \bmod m. \qquad \text{(A4.2)}$
We may convert the generated uniform streams to follow a Gaussian distribution of zero mean and unit variance using the Box-Muller approach, described as follows:
If $u_1$ and $u_2$ are independent random variates (realizations from a random variable) from $U(0, 1)$, then $Z_1$ and $Z_2$ are independent standard normal deviates (realizations from $N(0, 1)$) obtained as follows:
$Z_1 = \sqrt{-2 \ln u_1}\, \cos(2\pi u_2) \qquad \text{(A4.3a)}$
$Z_2 = \sqrt{-2 \ln u_1}\, \sin(2\pi u_2) \qquad \text{(A4.3b)}$
The approach is to run the random generator to get a very large number of samples of a uniform random variable, then extract the even-ordered terms to be $u_1$ and the odd-ordered terms to be $u_2$, in order to guarantee their independence.
Task 1:
(a) Generate three streams (1D, 2D and 3D) of random numbers with 100,000 samples; you can use the Matlab function for random number generation.
(b) Visualize the generated samples. Hint: you can use a scatter plot.
(c) Compute the histogram of the three streams, and then normalize them to become a probability density function (pdf).
(d) Visualize the pdfs of the three streams. Are the samples uniformly distributed? Do the pdfs represent a standard uniform distribution? Comment.
Task 2: Use the streams generated in Task 1 to do the following:
(a) Convert the generated streams to a standard normal distribution.
(b) Visualize the data samples. Are they normally distributed with mean 0 and unit variance?
(c) Compute the normalized histograms (pdfs) of the data samples from (a) and visualize the distribution; comment.
Task 3: Use the streams generated in Task 2 to do the following:
(a) Convert the 1D normal variate to a normal distribution with mean -2 and standard deviation 3.
(b) Convert the 2D normal variate to a normal distribution with mean vector $M = [2 \;\; 1]^T$ and covariance matrix
[9 4
 4 4].
(c) Convert the 3D normal variate to a normal distribution with mean vector $M = [5 \;\; 5 \;\; 6]^T$ and covariance matrix
[5 2 1
 2 5 0
 1 0 4].
(d) For (a), (b) and (c), estimate the mean and covariance matrix and compare the estimated values with the ground-truth ones; comment.
4.8 Exercises
1. Given a probability space $(\Omega, \mathcal{F}, P)$, let $A$ and $B$ be events. Prove/disprove the following: (a) (b) (c)
2. Given a sample space $\Omega$:
(a) Construct the smallest $\sigma$-algebra that contains two given sets.
(b) Construct the smallest $\sigma$-algebra that contains two other given sets.
(c) What members of these $\sigma$-algebras make the given sets measurable?
3. Let $\mu$ be a nonnegative, finitely additive set function on the field $\mathcal{F}$. Show that
4. Let $A_1, \ldots, A_n$ be arbitrary subsets of a set $\Omega$. Describe explicitly the smallest $\sigma$-field containing $A_1, \ldots, A_n$. How many sets are there in it? (I.e., give an upper bound that is attainable.)
5. Derive the entries in Table 2.
6. Derive the entries in Table 3.
4.9 References
1. Robert Ash, Real Analysis and Probability, Academic Press, New York, 1972.
2. Athanasios Papoulis, Probability, Random Variables and Stochastic Processes, 3rd Edition, McGraw-Hill, New York, 1991.
3. Paul E. Pfeiffer, Concepts of Probability Theory, 2nd Edition, Dover, New York, 1978.
4. Wilbur Davenport, Random Processes, McGraw-Hill, New York, 1970.
5. William Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, Wiley, New York, 1968.
6. Reuven Rubinstein, Simulation and the Monte Carlo Method, Wiley, New York, 1981.