mit6_441s10_lec01


    LECTURE 1

    Introduction

2 Handouts

Lecture outline:
- Goals and mechanics of the class
- Notation
- Entropy: definitions and properties
- Mutual information: definitions and properties

Reading: Ch. 1, Sects. 2.1-2.5.


Goals

Our goals in this class are to establish an understanding of the intrinsic properties of transmission of information and the relation between coding and the fundamental limits of information transmission in the context of communications.

Our class is not a comprehensive introduction to the field of information theory and will not touch in a significant manner on such important topics as data compression and complexity, which belong in a source-coding class.

Notation

- random variable (r.v.): X
- sample value of a random variable: x
- set of possible sample values x of the r.v. X: \mathcal{X}
- probability mass function (PMF) of a discrete r.v. X: P_X(x)
- probability density function (pdf) of a continuous r.v. X: p_X(x)


Entropy

Entropy is a measure of the average uncertainty associated with a random variable. The entropy of a discrete r.v. X is

    H(X) = -\sum_{x \in \mathcal{X}} P_X(x) \log_2 P_X(x)

Entropy is always non-negative.

Joint entropy: the entropy of two discrete r.v.s X, Y with joint PMF P_{X,Y}(x,y) is

    H(X,Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2 P_{X,Y}(x,y)

Conditional entropy: the expected value of entropies calculated according to conditional distributions, H(Y|X) = E_Z[H(Y|X = Z)] for a r.v. Z independent of X and identically distributed with X. Intuitively, this is the average of the entropy of Y given X over all possible values of X.
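A minimal numerical sketch of these definitions, assuming a small illustrative joint PMF (the PMF, variable names, and helper function below are not from the lecture):

```python
import numpy as np

def entropy(pmf):
    """H = -sum p log2 p, with the convention that 0 log 0 = 0."""
    p = np.asarray(pmf, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical joint PMF P_{X,Y}(x, y): rows index x, columns index y.
P_XY = np.array([[0.25, 0.25],
                 [0.40, 0.10]])

P_X = P_XY.sum(axis=1)                       # marginal PMF of X
H_X = entropy(P_X)                           # H(X)
H_XY = entropy(P_XY)                         # joint entropy H(X, Y)

# H(Y|X) = sum_x P_X(x) H(Y | X = x): average of the row entropies.
H_Y_given_X = sum(px * entropy(row / px) for px, row in zip(P_X, P_XY) if px > 0)

print(f"H(X)   = {H_X:.4f} bits")            # 1.0000
print(f"H(X,Y) = {H_XY:.4f} bits")           # 1.8610
print(f"H(Y|X) = {H_Y_given_X:.4f} bits")    # 0.8610
```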


Conditional entropy: chain rule

    H(Y|X) = E_Z[H(Y|X = Z)]
           = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P_X(x) P_{Y|X}(y|x) \log_2[P_{Y|X}(y|x)]
           = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2[P_{Y|X}(y|x)]

Compare with the joint entropy:

    H(X,Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2[P_{X,Y}(x,y)]
           = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2[P_{Y|X}(y|x) P_X(x)]
           = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2[P_{Y|X}(y|x)] - \sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log_2[P_X(x)]
           = H(Y|X) + H(X)

This is the chain rule for entropy: H(X_1, ..., X_n) = \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1}).

Question: is H(Y|X) = H(X|Y)?
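A short sketch checking the chain rule H(X,Y) = H(Y|X) + H(X) numerically and answering the question for one example (the joint PMF is an assumed illustration, not from the lecture); in general H(Y|X) and H(X|Y) differ:

```python
import numpy as np

def entropy(pmf):
    p = np.asarray(pmf, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def cond_entropy(P_joint):
    """H(column variable | row variable): average the entropy of each conditional row."""
    P_row = P_joint.sum(axis=1)
    return sum(pr * entropy(row / pr) for pr, row in zip(P_row, P_joint) if pr > 0)

P_XY = np.array([[0.25, 0.25],      # hypothetical joint PMF, rows = x, columns = y
                 [0.40, 0.10]])

H_X  = entropy(P_XY.sum(axis=1))
H_XY = entropy(P_XY)
H_Y_given_X = cond_entropy(P_XY)    # H(Y|X)
H_X_given_Y = cond_entropy(P_XY.T)  # H(X|Y)

assert np.isclose(H_XY, H_Y_given_X + H_X)   # chain rule: H(X,Y) = H(Y|X) + H(X)
print(f"H(Y|X) = {H_Y_given_X:.4f}, H(X|Y) = {H_X_given_Y:.4f}")  # 0.8610 vs 0.9269: not equal in general
```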


Relative entropy

Relative entropy is a measure of the distance between two distributions, also known as the Kullback-Leibler distance between PMFs P_X(x) and P_Y(y).

Definition:

    D(P_X || P_Y) = \sum_{x \in \mathcal{X}} P_X(x) \log \frac{P_X(x)}{P_Y(x)}

In effect we are considering the log to be a r.v. of which we take the mean (note that we use the conventions 0 \log 0 = 0 and p \log(p/0) = \infty for p > 0).
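A small sketch of this definition (the two PMFs below are arbitrary illustrations, not from the lecture), also showing that the divergence is not symmetric:

```python
import numpy as np

def kl_divergence(p, q):
    """D(p || q) = sum_x p(x) log2(p(x)/q(x)); terms with p(x) = 0 contribute 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

P_X = [0.5, 0.25, 0.25]     # hypothetical PMF
P_Y = [1/3, 1/3, 1/3]       # uniform PMF on the same alphabet

print(f"D(P_X || P_Y) = {kl_divergence(P_X, P_Y):.4f} bits")   # 0.0850
print(f"D(P_Y || P_X) = {kl_divergence(P_Y, P_X):.4f} bits")   # 0.0817 -- not symmetric
```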


Mutual information

Mutual information: let X, Y be r.v.s with joint PMF P_{X,Y}(x,y) and marginal PMFs P_X(x) and P_Y(y).

Definition:

    I(X;Y) = \sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P_{X,Y}(x,y) \log \frac{P_{X,Y}(x,y)}{P_X(x) P_Y(y)}
           = D(P_{X,Y}(x,y) || P_X(x) P_Y(y))

Intuitively, it is a measure of how dependent the r.v.s are.

Useful expressions for mutual information:

    I(X;Y) = H(X) + H(Y) - H(X,Y)
           = H(Y) - H(Y|X)
           = H(X) - H(X|Y)
           = I(Y;X)

Question: what is I(X;X)?
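A sketch (again with an assumed joint PMF) computing I(X;Y) both from the KL-divergence definition and from H(X) + H(Y) - H(X,Y), and checking the answer to the question above, I(X;X) = H(X):

```python
import numpy as np

def entropy(pmf):
    p = np.asarray(pmf, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(P_joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint PMF with rows = x, columns = y."""
    return entropy(P_joint.sum(axis=1)) + entropy(P_joint.sum(axis=0)) - entropy(P_joint)

P_XY = np.array([[0.25, 0.25],      # hypothetical joint PMF
                 [0.40, 0.10]])
P_X, P_Y = P_XY.sum(axis=1), P_XY.sum(axis=0)

# The KL-divergence form of the definition: D(P_{X,Y} || P_X P_Y).
I_kl = sum(P_XY[i, j] * np.log2(P_XY[i, j] / (P_X[i] * P_Y[j]))
           for i in range(P_XY.shape[0]) for j in range(P_XY.shape[1]) if P_XY[i, j] > 0)

assert np.isclose(I_kl, mutual_information(P_XY))

# I(X;X): the joint PMF of (X, X) is diagonal, and I(X;X) = H(X).
P_XX = np.diag(P_X)
assert np.isclose(mutual_information(P_XX), entropy(P_X))
```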


Mutual information: chain rule

Conditional mutual information: I(X;Y|Z) = H(X|Z) - H(X|Y,Z).

    I(X_1, ..., X_n; Y) = H(X_1, ..., X_n) - H(X_1, ..., X_n | Y)
                        = \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1}) - \sum_{i=1}^{n} H(X_i | X_1, ..., X_{i-1}, Y)
                        = \sum_{i=1}^{n} I(X_i; Y | X_1, ..., X_{i-1})

Look at 3 r.v.s: I(X_1, X_2; Y) = I(X_1; Y) + I(X_2; Y | X_1), where I(X_2; Y | X_1) is the extra information about Y given by X_2 but not given by X_1.
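A sketch verifying the three-variable case I(X_1, X_2; Y) = I(X_1; Y) + I(X_2; Y | X_1) on a randomly generated joint PMF (the construction below is illustrative, not from the lecture), with every term expanded into joint entropies:

```python
import numpy as np

def entropy(pmf):
    p = np.asarray(pmf, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
P = rng.random((2, 2, 2))           # axes: 0 = X1, 1 = X2, 2 = Y
P /= P.sum()                        # normalize into a joint PMF P_{X1,X2,Y}

def H(*axes):
    """Joint entropy of the variables on the listed axes (0 = X1, 1 = X2, 2 = Y)."""
    drop = tuple(a for a in range(3) if a not in axes)
    return entropy(P.sum(axis=drop) if drop else P)

I_X1X2_Y        = H(0, 1) + H(2) - H(0, 1, 2)             # I(X1,X2;Y)
I_X1_Y          = H(0) + H(2) - H(0, 2)                   # I(X1;Y)
I_X2_Y_given_X1 = H(0, 1) + H(0, 2) - H(0, 1, 2) - H(0)   # I(X2;Y|X1) = H(X2|X1) - H(X2|X1,Y)

# Chain rule for mutual information:
assert np.isclose(I_X1X2_Y, I_X1_Y + I_X2_Y_given_X1)
```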


MIT OpenCourseWare
http://ocw.mit.edu

    6.441 Information Theory

    Spring 2010

    For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
