LECT_2. Source Coding



    Graduate Program

    Department of Electrical and Computer Engineering

    Chapter 2: Source Coding


    Overview

    Source coding and mathematical models

    Logarithmic measure of information

    Average mutual information



    Source Coding

    Information sources may be analog or discrete (digital)

    Analog sources: audio or video. Discrete sources: computers, storage devices such as magnetic or optical devices

    Whatever the nature of the information source, digital communication systems are designed to transmit information in digital form

    Thus the output of sources must be converted into a format that can be transmitted digitally



    Source Coding

    This conversion is generally performed by the source encoder, whose output may generally be assumed to be a sequence of binary digits

    Encoding is based on mathematical models of information sources and on a quantitative measure of the information emitted by a source

    We shall first develop mathematical models for information sources



    Mathematical Models for Information Sources

    Any information source produces an output that is random and which can be characterized in statistical terms

    A simple discrete source emits a sequence of letters from a finite alphabet

    Example: a binary source produces a binary sequence such as 1011000101100; alphabet: {0, 1}

    Generally, a discrete information source with an alphabet of L possible letters produces a sequence of letters selected from this alphabet

    Assume that each letter of the alphabet has a probability of occurrence $p_k$ such that

    $$p_k = P\{X = x_k\}, \quad 1 \le k \le L, \qquad \text{and} \qquad \sum_{k=1}^{L} p_k = 1$$



    Two Source Models

    1. Discrete sources

    Discrete memoryless source (DMS): output sequences from the source are statistically independent

    The current output does not depend on any of the past outputs

    Stationary source: the discrete source outputs are statistically dependent but statistically stationary

    That is, two sequences of length n, $(a_1, a_2, \ldots, a_n)$ and $(a_{1+m}, a_{2+m}, \ldots, a_{n+m})$, have joint probabilities that are identical for all $n \ge 1$ and for all shifts m

    2. An analog source has an output waveform x(t) that is a sample function of a stochastic process X(t), where we assume X(t) is a stationary process with autocorrelation $R_{xx}(\tau)$ and power spectral density $\Phi_{xx}(f)$



    Mathematical Models

    When X(t) is a bandlimited stochastic process such that $\Phi_{xx}(f) = 0$ for $|f| \ge W$, then by the sampling theorem

    $$X(t) = \sum_{n=-\infty}^{\infty} X\!\left(\frac{n}{2W}\right) \frac{\sin\!\big[2\pi W\,(t - n/2W)\big]}{2\pi W\,(t - n/2W)}$$

    where $\{X(n/2W)\}$ denote the samples of the process taken at the sampling (Nyquist) rate of $f_s = 2W$ samples/s

    Thus, applying the sampling theorem, we may convert the output of an analog source into an equivalent discrete-time source
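    To make the discretization concrete, here is a minimal Python sketch (not from the slides; the example signal, the bandwidth W, and the sample-index range are assumptions for illustration) that samples a bandlimited waveform at the Nyquist rate 2W and reconstructs it with the interpolation formula above. The finite block of samples truncates the infinite sum, so a small residual error remains.

```python
# Minimal sketch: sample a bandlimited signal at the Nyquist rate 2W and
# reconstruct it with the sinc-interpolation formula above (assumed example signal).
import numpy as np

W = 4.0                      # assumed bandwidth in Hz
fs = 2 * W                   # Nyquist rate: 2W samples per second
n = np.arange(-50, 51)       # finite block of sample indices (truncates the sum)
t_n = n / fs                 # sampling instants n/2W

def x(t):
    # example bandlimited signal: tones at 1 Hz and 3 Hz, both below W
    return np.cos(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 3.0 * t)

def reconstruct(t):
    # X(t) = sum_n X(n/2W) * sin(2*pi*W*(t - n/2W)) / (2*pi*W*(t - n/2W));
    # np.sinc(u) = sin(pi*u)/(pi*u), so the kernel is np.sinc(2W*(t - n/2W))
    return sum(x(tn) * np.sinc(fs * (t - tn)) for tn in t_n)

t = np.linspace(-1.0, 1.0, 201)
print("max reconstruction error:", np.max(np.abs(reconstruct(t) - x(t))))
```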


    Mathematical Models

    The output is statistically characterized by the joint pdf $p(x_1, x_2, \ldots, x_m)$ for $m \ge 1$, where $x_n = X(n/2W)$, $1 \le n \le m$, are the random variables corresponding to the samples of X(t)

    Note that the samples $\{X(n/2W)\}$ are in general continuous and cannot be represented digitally without loss of precision

    Quantization may be used so that each sample takes a discrete value, but it introduces distortion (more on this later)




    Logarithmic Measure of Information

    Consider two discrete random variables X and Y such that X = {x_1, x_2, ..., x_n} and Y = {y_1, y_2, ..., y_m}

    Suppose we observe Y = y_j and wish to determine, quantitatively, the amount of information Y = y_j provides about the event X = x_i, i = 1, 2, ..., n

    Note that if X and Y are statistically independent, Y = y_j provides no information about the occurrence of X = x_i

    If they are fully dependent, Y = y_j determines the occurrence of X = x_i; the information content is then simply that provided by X = x_i



    Logarithmic Measure of Information

    A suitable measure that satisfies these conditions is the logarithm of the ratio of the conditional probability $P\{X = x_i \mid Y = y_j\} = P(x_i \mid y_j)$ and $P\{X = x_i\} = p(x_i)$

    The information content provided by the occurrence of Y = y_j about the event X = x_i is defined as

    $$I(x_i; y_j) = \log \frac{P(x_i \mid y_j)}{P(x_i)}$$

    $I(x_i; y_j)$ is called the mutual information between $x_i$ and $y_j$

    The unit of the information measure is the nat if the natural logarithm is used and the bit if base 2 is used

    Note that $\ln a = \ln 2 \cdot \log_2 a = 0.69315 \log_2 a$
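    As a quick numerical illustration (the joint probabilities below are hypothetical, not from the slides), the following sketch evaluates I(x_i; y_j) = log2[P(x_i|y_j)/P(x_i)] for individual event pairs; the result can be positive or negative depending on whether observing y_j raises or lowers the probability of x_i.

```python
# Minimal sketch: mutual information between single events, in bits.
import numpy as np

# Assumed joint pmf P(X = x_i, Y = y_j); rows index x_i, columns index y_j.
P_xy = np.array([[0.30, 0.10],
                 [0.10, 0.50]])
P_x = P_xy.sum(axis=1)                     # marginal P(x_i)
P_y = P_xy.sum(axis=0)                     # marginal P(y_j)

def event_mutual_info(i, j):
    """I(x_i; y_j) = log2( P(x_i | y_j) / P(x_i) ) in bits."""
    p_x_given_y = P_xy[i, j] / P_y[j]
    return np.log2(p_x_given_y / P_x[i])

print(event_mutual_info(0, 0))             # positive: y_0 makes x_0 more likely
print(event_mutual_info(0, 1))             # negative: y_1 makes x_0 less likely
```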


    Logarithmic Measure of Information

    If X and Y are independent, $I(x_i; y_j) = 0$

    If the occurrence of Y = y_j uniquely determines the occurrence of X = x_i, then $I(x_i; y_j) = \log(1/p(x_i))$, i.e., $I(x_i; y_j) = -\log p(x_i) = I(x_i)$, the self-information of the event X = x_i

    Note that a high-probability event conveys less information than a low-probability event

    For a single event x where p(x) = 1, I(x) = 0: the occurrence of a sure event conveys no information



    Logarithmic Measure of Information

    Example

    1. A binary source emits either 0 or 1 every $\tau_s$ seconds with equal probability. The information content of each output is then given by

    $$I(x_i) = -\log_2 p = -\log_2 \tfrac{1}{2} = 1 \text{ bit}, \quad x_i = 0, 1$$

    2. Suppose successive outputs are statistically independent (DMS) and consider a block of binary digits from the source in a time interval $k\tau_s$

    The number of possible k-bit blocks is $2^k = M$, each equally probable with probability $1/M = 2^{-k}$, so

    $$I(x_i) = -\log_2 2^{-k} = k \text{ bits in } k\tau_s \text{ seconds}$$

    (Note the additive property)
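    A short sketch of the two calculations above (p = 1/2 for a single output, p = 2^-k for an equiprobable k-bit block; the value k = 8 is just an example):

```python
# Minimal sketch: self-information I(x) = -log2 p(x), in bits.
import math

def self_information_bits(p):
    """Self-information of an event that occurs with probability p."""
    return -math.log2(p)

print(self_information_bits(0.5))        # one equiprobable binary output -> 1.0 bit
k = 8                                    # example block length
print(self_information_bits(2.0 ** -k))  # one of 2^k equiprobable blocks -> 8.0 bits
```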



    Logarithmic Measure of Information

    Now consider the following relationships:

    $$I(x_i; y_j) = \log \frac{p(x_i \mid y_j)}{p(x_i)} = \log \frac{p(x_i \mid y_j)\, p(y_j)}{p(x_i)\, p(y_j)} = \log \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)}$$

    and since

    $$p(y_j \mid x_i) = \frac{p(x_i, y_j)}{p(x_i)}$$

    it follows that

    $$I(x_i; y_j) = \log \frac{p(y_j \mid x_i)}{p(y_j)} = I(y_j; x_i)$$

    The information provided by y_j about x_i is therefore the same as that provided by x_i about y_j


    Logarithmic Measure of Information

    Using the same procedure as before, we can also define the conditional self-information as

    $$I(x_i \mid y_j) = \log_2 \frac{1}{p(x_i \mid y_j)} = -\log_2 p(x_i \mid y_j)$$

    Now consider the relationship

    $$I(x_i; y_j) = \log \frac{p(x_i \mid y_j)}{p(x_i)} = \log \frac{1}{p(x_i)} - \log \frac{1}{p(x_i \mid y_j)}$$

    which leads to

    $$I(x_i; y_j) = I(x_i) - I(x_i \mid y_j)$$

    Note that $I(x_i \mid y_j)$ is the self-information about X = x_i after having observed the event Y = y_j
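    The identity I(x_i; y_j) = I(x_i) - I(x_i | y_j) can be checked numerically; the joint probabilities below are the same hypothetical values used in the earlier sketch and are not from the slides.

```python
# Minimal sketch: verify I(x_i; y_j) = I(x_i) - I(x_i | y_j) for one event pair.
import numpy as np

P_xy = np.array([[0.30, 0.10],
                 [0.10, 0.50]])                  # assumed joint pmf
P_x, P_y = P_xy.sum(axis=1), P_xy.sum(axis=0)

i, j = 0, 0
I_x = -np.log2(P_x[i])                           # self-information I(x_i)
I_x_given_y = -np.log2(P_xy[i, j] / P_y[j])      # conditional self-information I(x_i | y_j)
I_xy = np.log2(P_xy[i, j] / (P_x[i] * P_y[j]))   # mutual information I(x_i; y_j)

assert np.isclose(I_xy, I_x - I_x_given_y)
print(I_x, I_x_given_y, I_xy)
```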


    Logarithmic Measure of Information

    Further, note that

    $$I(x_i) \ge 0 \quad \text{and} \quad I(x_i \mid y_j) \ge 0$$

    and thus

    $$I(x_i; y_j) > 0 \ \text{when} \ I(x_i) > I(x_i \mid y_j), \qquad I(x_i; y_j) < 0 \ \text{when} \ I(x_i) < I(x_i \mid y_j)$$

    indicating that the mutual information between two events can be either positive or negative



    Average Mutual Information

    The average mutual information between X and Y is given by

    $$I(X; Y) = \sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j)\, I(x_i; y_j) = \sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j) \log_2 \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)} \ge 0$$

    Similarly, the average self-information is given by

    $$H(X) = \sum_{i} p(x_i)\, I(x_i) = -\sum_{i} p(x_i) \log_2 p(x_i)$$

    where X represents the alphabet of possible output letters from a source

    H(X) represents the average self-information per source letter and is called the entropy of the source
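    A minimal sketch of both definitions, again using a hypothetical joint pmf (not from the slides): it computes the source entropy H(X) and the average mutual information I(X;Y), which is non-negative.

```python
# Minimal sketch: entropy H(X) and average mutual information I(X;Y), in bits.
import numpy as np

P_xy = np.array([[0.30, 0.10],
                 [0.10, 0.50]])                  # assumed joint pmf
P_x, P_y = P_xy.sum(axis=1), P_xy.sum(axis=0)    # marginals

def entropy(p):
    """H = -sum p log2 p (zero-probability letters contribute nothing)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

I_XY = np.sum(P_xy * np.log2(P_xy / np.outer(P_x, P_y)))
print("H(X) =", entropy(P_x), "bits")
print("I(X;Y) =", I_XY, "bits (always >= 0)")
```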


    Average Mutual Information

    If $p(x_i) = 1/n$ for all i, then the entropy of the source becomes

    $$H(X) = -\sum_{i=1}^{n} \frac{1}{n} \log_2 \frac{1}{n} = \log_2 n$$

    In general, $H(X) \le \log_2 n$ for any given set of source letter probabilities

    Thus, the entropy of a discrete source is maximum when the output letters are all equally probable


    Average Mutual Information

    Consider a binary source where the letters {0, 1} are independent with probabilities $p(x_0 = 0) = q$ and $p(x_1 = 1) = 1 - q$

    The entropy of the source is given by

    $$H(X) = -q \log_2 q - (1 - q) \log_2 (1 - q) \equiv H(q)$$

    whose plot as a function of q is shown below

    [Figure: binary entropy function H(q) versus q]
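    A small sketch of the binary entropy function (the sample values of q are arbitrary): H(q) reaches its maximum of 1 bit at q = 1/2 and vanishes at q = 0 and q = 1.

```python
# Minimal sketch: the binary entropy function H(q) = -q log2 q - (1-q) log2(1-q).
import math

def binary_entropy(q):
    """H(q) in bits, with H(0) = H(1) = 0 by convention (0 log 0 = 0)."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

for q in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"H({q:.1f}) = {binary_entropy(q):.4f} bits")   # maximum of 1 bit at q = 0.5
```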


    Average Mutual Information

    In a similar manner as above, we can define the conditional self-information, or conditional entropy, as follows:

    $$H(X \mid Y) = \sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j) \log_2 \frac{1}{p(x_i \mid y_j)}$$

    The conditional entropy is the information (or uncertainty) in X after Y is observed

    From the definition of average mutual information one can show that

    $$I(X; Y) = \sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j)\big[\log_2 p(x_i \mid y_j) - \log_2 p(x_i)\big] = H(X) - H(X \mid Y)$$
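    The relation I(X;Y) = H(X) - H(X|Y) can be verified numerically with the same hypothetical joint pmf used in the earlier sketches:

```python
# Minimal sketch: check I(X;Y) = H(X) - H(X|Y) for an assumed joint pmf.
import numpy as np

P_xy = np.array([[0.30, 0.10],
                 [0.10, 0.50]])                       # assumed joint pmf
P_x, P_y = P_xy.sum(axis=1), P_xy.sum(axis=0)

H_X = -np.sum(P_x * np.log2(P_x))
H_X_given_Y = -np.sum(P_xy * np.log2(P_xy / P_y))     # -sum p(x,y) log2 p(x|y)
I_XY = np.sum(P_xy * np.log2(P_xy / np.outer(P_x, P_y)))

assert np.isclose(I_XY, H_X - H_X_given_Y)            # equivocation never exceeds H(X)
print(H_X, H_X_given_Y, I_XY)
```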


    Average Mutual Information

    Thus $I(X; Y) = H(X) - H(X \mid Y)$, and $H(X) \ge H(X \mid Y)$, with equality when X and Y are independent

    $H(X \mid Y)$ is called the equivocation; it is interpreted as the amount of average uncertainty remaining in X after observation of Y

    H(X) is the average uncertainty prior to observation

    I(X;Y) is the average information provided about the set X by the observation of the set Y



    Average Mutual Information

    The above results can be generalized to more than two random variables

    Suppose we have a block of k random variables $X_1, X_2, \ldots, X_k$ with joint probability $P(x_1, x_2, \ldots, x_k)$; the entropy of the block is then given by

    $$H(\mathbf{X}) = -\sum_{j_1=1}^{n_1}\sum_{j_2=1}^{n_2}\cdots\sum_{j_k=1}^{n_k} P(x_{j_1}, x_{j_2}, \ldots, x_{j_k}) \log_2 P(x_{j_1}, x_{j_2}, \ldots, x_{j_k})$$
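    A sketch of the block-entropy formula for an assumed memoryless source (the alphabet probabilities and block length are chosen for illustration); because the letters of a DMS are independent, the block entropy comes out to k times the per-letter entropy.

```python
# Minimal sketch: joint entropy of a block of k letters from an assumed DMS.
import itertools
import numpy as np

p = np.array([0.5, 0.25, 0.25])     # assumed per-letter pmf
k = 3                               # assumed block length

H_letter = -np.sum(p * np.log2(p))

H_block = 0.0
for idx in itertools.product(range(len(p)), repeat=k):
    P_block = np.prod(p[list(idx)])           # independence: product of marginals
    H_block -= P_block * np.log2(P_block)

print(H_block, k * H_letter)                  # equal for a memoryless source
```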


    Continuous Random Variables - Information Measure

    If X and Y are continuous random variables with joint pdf f(x, y) and marginal pdfs f(x) and f(y), the average mutual information between X and Y is defined as

    $$I(X; Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x)\, f(y \mid x) \log \frac{f(y \mid x)\, f(x)}{f(x)\, f(y)}\, dx\, dy$$

    The concept of self-information does not carry over exactly to continuous random variables, since these would require an infinite number of binary digits to represent exactly, which would make their entropies infinite


    Continuous Random Variables

    We can, however, define the differential entropy of a continuous random variable as

    $$H(X) = -\int_{-\infty}^{\infty} f(x) \log f(x)\, dx$$

    and the average conditional entropy as

    $$H(X \mid Y) = -\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y) \log f(x \mid y)\, dx\, dy$$

    The average mutual information is then given by $I(X; Y) = H(X) - H(X \mid Y)$ or $I(X; Y) = H(Y) - H(Y \mid X)$
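    As an illustration (the Gaussian pdf and its standard deviation are assumptions, not part of the slides), the differential entropy integral can be evaluated numerically and compared with the known closed form for a Gaussian, (1/2) log2(2*pi*e*sigma^2):

```python
# Minimal sketch: differential entropy H(X) = -∫ f(x) log2 f(x) dx for a Gaussian pdf.
import numpy as np

sigma = 2.0                                          # assumed standard deviation
x = np.linspace(-10 * sigma, 10 * sigma, 200001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

h_numeric = -np.sum(f * np.log2(f)) * dx             # simple Riemann-sum integration
h_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)
print(h_numeric, h_closed)                           # agree to several decimals
```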


    Continuous Random Variables

    If X is discrete and Y is continuous, the density of Y is expressed as

    $$f(y) = \sum_{i=1}^{n} f(y \mid x_i)\, p(x_i)$$

    The mutual information about X = x_i provided by the occurrence of the event Y = y is given by

    $$I(x_i; y) = \log \frac{f(y \mid x_i)\, p(x_i)}{f(y)\, p(x_i)} = \log \frac{f(y \mid x_i)}{f(y)}$$

    The average mutual information between X and Y is then

    $$I(X; Y) = \sum_{i=1}^{n} \int_{-\infty}^{\infty} f(y \mid x_i)\, p(x_i) \log \frac{f(y \mid x_i)}{f(y)}\, dy$$
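    A final sketch of the mixed discrete/continuous case (the binary antipodal input and the Gaussian noise model are assumptions for illustration, not from the slides): the integral for I(X;Y) is evaluated numerically using the mixture density f(y) defined above.

```python
# Minimal sketch: I(X;Y) for an assumed equiprobable binary X in {-1, +1} observed
# in Gaussian noise, Y = X + N, evaluated by numerical integration over y.
import numpy as np

sigma = 1.0                                  # assumed noise standard deviation
p_x = {+1: 0.5, -1: 0.5}                     # assumed source probabilities
y = np.linspace(-10.0, 10.0, 20001)
dy = y[1] - y[0]

def f_y_given_x(y, x):
    # conditional pdf of Y given X = x (Gaussian centered at x)
    return np.exp(-(y - x) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

f_y = sum(f_y_given_x(y, x) * p for x, p in p_x.items())      # mixture density f(y)

I_XY = sum(np.sum(f_y_given_x(y, x) * p * np.log2(f_y_given_x(y, x) / f_y)) * dy
           for x, p in p_x.items())
print("I(X;Y) =", I_XY, "bits (at most 1 bit for a binary input)")
```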