Week 03 - Information Sources and Source Coding

Upload: fahadshamshad

Post on 03-Jun-2018


  • 8/12/2019 Week 03-Informtion Sources and Source Coding

    1/25

    Dr. M. Arif Wahla

    EE Dept

    [email protected]

    Military College of Signals, National University of Sciences & Technology (NUST), Pakistan

    Class webpage: http://learn.mcs.edu.pk/courses/

    Arithmetic Coding

    Lempel-Ziv Coding

  • Lecture #3

    Arithmetic Coding

    Lempel-Ziv Coding

  • Arithmetic Coding

    Huffman Coding: The Retired Champion

    For large $n$ (the number of source symbols coded together as one block), the implementation of Huffman coding can easily become unwieldy or unduly restrictive. The problems include:

    The size of the Huffman code table is $q^n$, representing an exponential increase in memory and computational requirements.

    The code table needs to be transmitted to the receiver.

    The source statistics are assumed stationary. If there are changes, an adaptive scheme is required which re-estimates the probabilities and recalculates the Huffman code table.

    The solution to these problems is arithmetic coding.

    Fall 2011

  • Arithmetic Coding

    Consider the $N$-length source message $s_{i_1}, s_{i_2}, \ldots, s_{i_N}$, where $\{s_i : i = 1, 2, \ldots, q\}$ are the source symbols and $s_{i_j}$ indicates that the $j$-th character in the message is the source symbol $s_i$. Arithmetic coding assumes that the probabilities of the source symbols are available.

    The goal of arithmetic coding is to assign a unique interval along the unit number line (or probability line) $[0, 1]$, of length equal to the probability of the given source message, with its position on the number line given by the cumulative probability of the given source message.

  • Arithmetic Coding

    Arithmetic coding completely bypasses the idea of replacing an input symbol with a specific code.

    Instead, it takes a stream of input symbols and replaces it with a single floating-point output number.

    The longer (and more complex) the message, the more bits are needed in the output number.

    It was not until recently that practical methods were found to implement this on computers with fixed-size registers.
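    As a rough illustration of why longer messages need more output bits: an interval whose width equals the message probability $P$ always contains a binary fraction of at most $\lceil -\log_2 P \rceil + 1$ bits. The Python sketch below is only an illustration; the function name `bits_needed` and the sample probabilities are assumptions made here, not part of the lecture.

```python
import math

def bits_needed(message_probability):
    """Bits sufficient to name a binary fraction inside an interval of this width."""
    return math.ceil(-math.log2(message_probability)) + 1

# Hypothetical message probabilities: each extra symbol shrinks the interval,
# so the number of bits needed for the output number grows.
for p in (0.5, 0.15, 0.0075):
    print(p, bits_needed(p))
```

    The count grows with the message length because every additional symbol multiplies the interval width by that symbol's probability.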

  • Arithmetic Coding

    Example

    $A = [a_0, a_1, a_2, a_3]$, $P_A(A) = \{0.5, 0.3, 0.15, 0.05\}$

    Symbol codewords:

    $p_0 = 0.5$, $C(a_0) = 0$
    $p_1 = 0.3$, $C(a_1) = 10$
    $p_2 = 0.15$, $C(a_2) = 110$
    $p_3 = 0.05$, $C(a_3) = 111$

    Symbol intervals:

    $p_0 = 0.5$, Interval $I_0 = [0, 0.5]$
    $p_1 = 0.3$, Interval $I_1 = [0.5, 0.8]$
    $p_2 = 0.15$, Interval $I_2 = [0.8, 0.95]$
    $p_3 = 0.05$, Interval $I_3 = [0.95, 1]$

    Suppose each $a_i \in A$ is encoded into a real number $\beta_i$ lying in the interval between 0 and 1.

    Suppose we encode the sequence $s_0 s_1 \ldots$ by adding the scaled versions of $\beta_i$ as

    $b = \beta_0 + \Delta_1 \beta_1 + \Delta_2 \beta_2 + \ldots$

    where $\beta_i$ is the code number corresponding to $s_i$ and $\Delta_i$ is a monotonically decreasing scale also lying in the interval between 0 and 1.

    If we pick $\Delta_i$ in such a way that it is possible to later decompose $b$ back into the original sequence, the code can be decoded.
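    The intervals in the example are just the running sums of the symbol probabilities. A minimal Python sketch of that construction is given below; the helper name `build_intervals` is an assumption made for illustration, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import accumulate

def build_intervals(probs):
    """Return [(low, high), ...] so that interval i has width probs[i]."""
    probs = [Fraction(p) for p in probs]
    highs = list(accumulate(probs))          # cumulative probabilities
    lows = [Fraction(0)] + highs[:-1]        # each interval starts where the last ended
    return list(zip(lows, highs))

# Probabilities of a_0 .. a_3 from the example.
intervals = build_intervals(["0.5", "0.3", "0.15", "0.05"])
for i, (low, high) in enumerate(intervals):
    print(f"I_{i} = [{float(low)}, {float(high)}]")
```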

  • Arithmetic Coding Process

    Construct the code interval to represent a block of symbols as

    $I_b = [L, H]$, $L \le b \le H$

    Any convenient $b$ within this range is a suitable codeword representing the entire block of symbols.

  • Arithmetic Coding Algorithm

    Assume each $a_i \in A$ has been assigned an interval $I_i = [S_{l_i}, S_{h_i}]$

    initialize $j = 0$, $L_j = 0$ and $H_j = 1$

    REPEAT

        $\Delta = H_j - L_j$

        read next $a_i$

        $L_{j+1} = L_j + \Delta \cdot S_{l_i}$

        $H_{j+1} = L_j + \Delta \cdot S_{h_i}$

        $j = j + 1$

    UNTIL all $a_i$ have been encoded
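    The encoding loop above can be sketched in a few lines of Python. This is an illustrative sketch, not the lecture's own code: the names `INTERVALS` and `arithmetic_encode` are assumptions, symbols are passed as their indices $i$, and exact fractions stand in for the fixed-precision arithmetic a practical coder would use.

```python
from fractions import Fraction

# Intervals I_i = [S_l, S_h] for a_0 .. a_3 (values from the lecture example).
INTERVALS = [
    (Fraction("0"), Fraction("0.5")),
    (Fraction("0.5"), Fraction("0.8")),
    (Fraction("0.8"), Fraction("0.95")),
    (Fraction("0.95"), Fraction("1")),
]

def arithmetic_encode(symbols, intervals=INTERVALS):
    """Return the final (L, H) interval for a sequence of symbol indices."""
    low, high = Fraction(0), Fraction(1)
    for i in symbols:
        delta = high - low                 # current interval width
        s_low, s_high = intervals[i]
        high = low + delta * s_high        # computed first: uses the old low
        low = low + delta * s_low
    return low, high

# Sequence a1 a0 a0 a3 a2 from the lecture example.
low, high = arithmetic_encode([1, 0, 0, 3, 2])
print(float(low), float(high))
```

    For the sequence $a_1 a_0 a_0 a_3 a_2$ the returned interval has the exact bounds 0.57425 and 0.5748125; any value inside it can serve as the codeword.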

  • Example 1.6.2:

    Encode the sequence $a_1 a_0 a_0 a_3 a_2$

    $p_0 = 0.5$, Interval $I_0 = [0, 0.5]$
    $p_1 = 0.3$, Interval $I_1 = [0.5, 0.8]$
    $p_2 = 0.15$, Interval $I_2 = [0.8, 0.95]$
    $p_3 = 0.05$, Interval $I_3 = [0.95, 1]$

    j    a_i    L_j        H_j      Δ          L_{j+1}    H_{j+1}
    ______________________________________________________________
    0    a_1    0          1        1          0.5        0.8
    1    a_0    0.5        0.8      0.3        0.5        0.65
    2    a_0    0.5        0.65     0.15       0.5        0.575
    3    a_3    0.5        0.575    0.075      0.57125    0.575
    4    a_2    0.57125    0.575    0.00375    0.57425    0.5748125

    Any $b$ within the final interval will suffice for a codeword; one choice is $b = 0.5748125$.



  • Decoding Arithmetic Codes - Algorithm

    Given the code value $b$, the decoding procedure is as follows:

    initialize $L = 0$, $H = 1$ and $\Delta = H - L$

    REPEAT

        find $i$ such that $(b - L) / \Delta \in I_i$

        OUTPUT symbol $a_i$

        $H = L + \Delta \cdot S_{h_i}$

        $L = L + \Delta \cdot S_{l_i}$

        $\Delta = H - L$

    UNTIL last symbol has been decoded

    Tricks: Use a special stop symbol for sequences of variable length.

    Pay attention to precision in calculations.
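    A minimal Python sketch of the decoding loop follows. It makes three assumptions that the slides do not state: the number of symbols is known in advance (rather than signalled by a stop symbol), the interval test is half-open ($S_l \le x < S_h$) so that a boundary value cannot match two symbols, and the codeword is chosen strictly inside the final interval. The names `INTERVALS` and `arithmetic_decode` are illustrative only.

```python
from fractions import Fraction

# Intervals I_i for a_0 .. a_3 (values from the lecture example).
INTERVALS = [
    (Fraction("0"), Fraction("0.5")),
    (Fraction("0.5"), Fraction("0.8")),
    (Fraction("0.8"), Fraction("0.95")),
    (Fraction("0.95"), Fraction("1")),
]

def arithmetic_decode(b, n_symbols, intervals=INTERVALS):
    """Recover n_symbols symbol indices from the code value b."""
    low, high = Fraction(0), Fraction(1)
    decoded = []
    for _ in range(n_symbols):
        delta = high - low
        x = (b - low) / delta              # position of b inside [low, high]
        # Assumption: half-open test, so each x matches exactly one interval.
        i = next(k for k, (s_l, s_h) in enumerate(intervals) if s_l <= x < s_h)
        decoded.append(i)
        high = low + delta * intervals[i][1]   # uses the old low
        low = low + delta * intervals[i][0]
    return decoded

# 0.5745 lies strictly inside the final interval [0.57425, 0.5748125].
print(arithmetic_decode(Fraction("0.5745"), 5))
```

    Exact fractions are used here so that the precision issue mentioned under "Tricks" does not affect the sketch; a fixed-register implementation needs explicit rescaling instead.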

  • Example 1.6.3

    Decode $b = 0.5748125$

    $p_0 = 0.5$, Interval $I_0 = [0, 0.5]$
    $p_1 = 0.3$, Interval $I_1 = [0.5, 0.8]$
    $p_2 = 0.15$, Interval $I_2 = [0.8, 0.95]$
    $p_3 = 0.05$, Interval $I_3 = [0.95, 1]$

    Solution

    At each step, find $I_i$ such that $(b - L) / \Delta \in I_i$.

    L          H        Δ          I_i    next H       next L     next Δ       a_i
    ________________________________________________________________________________
    0          1        1          I_1    0.8          0.5        0.3          a_1
    0.5        0.8      0.3        I_0    0.65         0.5        0.15         a_0
    0.5        0.65     0.15       I_0    0.575        0.5        0.075        a_0
    0.5        0.575    0.075      I_3    0.575        0.57125    0.00375      a_3
    0.57125    0.575    0.00375    I_2    0.5748125    0.57425    0.0005625    a_2

  • Dictionary Codes and Lempel-Ziv Coding

    Huffman codes require knowledge of the probability distribution of the source symbols, which may not always be available.

    Dictionary codes dynamically construct their own coding/decoding table on the fly by looking at the present data stream. The probability distribution does not need to be known.

    Strings are coded instead of symbols.

    These codes are only efficient for long files.

    LZ codes belong to a practical class of dictionary codes.

  • Lempel-Ziv Coding

    Lempel-Ziv codes suffer no significant decoding delay at the receiver.

    Prior knowledge of the decoding table is not required. The required information is transmitted within the message.

    Huffman codes assign variable-length codewords to fixed-size symbols, whereas LZ codes encode variable-length strings with fixed-size codewords.

    In this sense, LZ coding is a mirror image of Huffman coding.

    The LZ algorithm which we will consider in our course is a slight modification of the original LZW algorithm.

  • Initializing LZ Algorithm

    To define the structure of the dictionary:

    Each entry $(n, a_i)$ in the dictionary is given an address $m$.

    $a_i$ is a symbol drawn from the source alphabet $A$ and $n$ is a pointer to another location in the dictionary.

    $n$ is represented by a fixed-length word of $b$ bits.

    The dictionary contains a total number of entries less than or equal to $2^b$.

    The algorithm is initialized by constructing the first $M+1$ entries.

    The entry at address 0 is a null symbol. It is used to let the decoder know the end of a string.

    Pointer $n = 0$ for the first $M+1$ entries; it points to the null entry at address 0.

    $m = M + 1$ points to the next blank location in the dictionary.

    Address m    Dictionary entry
                 n    a_i
    0            0    null
    1            0    a_0
    2            0    a_1
    ...
    m            0    a_{m-1}
    ...
    M            0    a_{M-1}

  • LZ Algorithm

    Initialize pointer $n = 0$ and $m = M + 1$

    1. Fetch next source symbol $a_i$, where $i = 0, 1, 2, \ldots, M-1$.

    2. If the ordered pair $(n, a_i)$ is already in the dictionary then

           next $n$ = dictionary address of entry $(n, a_i)$;

       else

           transmit $n$

           create new dictionary entry $(n, a_i)$ at dictionary address $m$

           $m = m + 1$

           $n$ = dictionary address of entry $(0, a_i)$;

    3. Return to step 1.
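    The steps above can be sketched in Python as follows. This is a hedged illustration rather than the course's reference implementation: the function name `lz_encode`, the use of a Python dict to look up entry addresses, and the final flush of the pending pointer at end of input are assumptions, and the $2^b$ limit on dictionary size is ignored.

```python
def lz_encode(source, alphabet):
    """Encode `source` with the linked-list LZ dictionary scheme.

    Returns (transmitted pointers, dictionary as a list of (n, symbol) entries),
    where the list index of each entry is its address m.
    """
    # Address 0 holds the null entry; addresses 1..M hold the root symbols.
    dictionary = [(0, None)] + [(0, a) for a in alphabet]
    address = {entry: m for m, entry in enumerate(dictionary)}
    n = 0
    transmitted = []
    for a in source:
        if (n, a) in address:
            n = address[(n, a)]                  # keep extending the current string
        else:
            transmitted.append(n)                # transmit pointer of longest match
            address[(n, a)] = len(dictionary)    # new entry goes to address m
            dictionary.append((n, a))
            n = address[(0, a)]                  # restart from the root entry of a
    transmitted.append(n)  # assumption: flush the pending pointer at end of input
    return transmitted, dictionary

# Binary source with alphabet {0, 1}; the sequence is the Source column
# of the lecture's worked encoding example.
codes, table = lz_encode("11000101100101", "01")
print(codes)
```

    Apart from the final flushed pointer, the transmitted pointers and the dictionary entries at addresses 3 to 10 match the worked example in the lecture.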


  • Lempel-Ziv Encoding

    Source sequence: 1 1 0 0 0 1 0 1 1 0 0 1 0 1

    Present n    Source a_i    Present m    Transmit n    Next n    Dic entry (n, a_i)
    0            1             3            -             2         -
    2            1             3            2             2         2, 1
    2            0             4            2             1         2, 0
    1            0             5            1             1         1, 0
    1            0             6            -             5         -
    5            1             6            5             2         5, 1
    2            0             7            -             4         -
    4            1             7            4             2         4, 1
    2            1             8            -             3         -
    3            0             8            3             1         3, 0
    1            0             9            -             5         -
    5            1             9            -             6         -
    6            0             9            6             1         6, 0
    1            1             10           1             2         1, 1

    Resulting dictionary:

    Address m    Dic entry
                 n    a_i
    0            0    null
    1            0    0
    2            0    1
    3            2    1
    4            2    0
    5            1    0
    6            5    1
    7            4    1
    8            3    0
    9            6    0
    10           1    1

  • Lempel-Ziv Decoding

    The decoder must construct the same dictionary as the encoder.

    We know that the encoder does not transmit as many codewords as it has source symbols.


  • Lempel-Ziv Decoding

    The decoding operation goes as follows:

    Reception of any codeword means that a new dictionary entry must be constructed.

    Pointer $n$ for this new entry is the same as the received codeword.

    Source symbol $a_i$ for this entry is not yet known, because it is the root symbol of the next string (not yet transmitted). Such an entry is called a partial entry at address $m$.

    The received codeword also fills in the missing symbol $a_i$ of the previous partial entry at address $m-1$: it is the root symbol of the string that $n$ points to.

  • Lempel-Ziv Decoding

    The received codeword $n$ also decodes the source string associated with it.

    The root symbol is the first symbol of the string; it belongs to the entry having pointer 0.

    The last symbol of the string is the symbol belonging to the entry at the address pointed to by the pointer of the last updated entry.

    $m = m + 1$ should be updated just after completing the entry at address $m$.

    If pointer $n$ points to an entry having pointer $n = 0$, then the decoded string is the single symbol of that entry.

    If pointer $n$ points to an entry having a nonzero pointer, then this nonzero pointer will further connect us to another address. This continues until we reach a zero pointer.
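    The decoding rules above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the names `lz_decode`, `root_of` and `string_of` are invented here, the codeword list includes a final pointer flushed by the encoder at end of input, and the dictionary is a plain list indexed by address $m$. Note that a received pointer may refer to the entry that is still partial; its root symbol can still be found because following the pointers never needs the missing symbol.

```python
def lz_decode(codes, alphabet):
    """Rebuild the dictionary from received pointers and return the decoded symbols."""
    # Same initial dictionary as the encoder: null entry plus one root per symbol.
    dictionary = [(0, None)] + [(0, a) for a in alphabet]

    def root_of(n):
        """First symbol of the string at address n (follow pointers to a root entry)."""
        while dictionary[n][0] != 0:
            n = dictionary[n][0]
        return dictionary[n][1]

    def string_of(n):
        """All symbols of the string at address n, in source order."""
        symbols = []
        while n != 0:
            n, a = dictionary[n]       # step back along the pointer chain
            symbols.append(a)
        return symbols[::-1]           # chain is read last symbol first

    output = []
    partial = None                     # address of the entry whose symbol is unknown
    for n in codes:
        if partial is not None:
            # The root of the newly received string completes the previous entry.
            dictionary[partial] = (dictionary[partial][0], root_of(n))
        output.extend(string_of(n))
        partial = len(dictionary)
        dictionary.append((n, None))   # new partial entry (n, ?)
    return output, dictionary

# Pointers from the lecture's encoding example, plus an assumed final flush (2).
decoded, table = lz_decode([2, 2, 1, 5, 4, 3, 6, 1, 2], "01")
print("".join(decoded))
```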

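  • Lempel-Ziv Decoding

    Initial dictionary at the decoder:

    Address m    Dictionary entry
                 n    a_i
    0            0    null
    1            0    0
    2            0    1

    Decoding the received codewords 2, 2, 1, 5, 4, 3, ...:

    Received n    Partial entry (m: n, a_i)    Entry completed    Decoded string
    2             3: 2, ?                      -                  1
    2             4: 2, ?                      3: 2, 1            1
    1             5: 1, ?                      4: 2, 0            0
    5             6: 5, ?                      5: 1, 0            00
    4             7: 4, ?                      6: 5, 1            10
    3             8: 3, ?                      7: 4, 1            11

    Decoded message so far: 1 1 0 00 10 11

    Continuing in the same way, the decoder rebuilds the encoder's dictionary (addresses 0 to 10).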

  • Huffman Coding Efficiency

    Huffman coding requires knowledge of the a priori symbol probabilities $p_i$; otherwise we have to determine them through estimation, such as $\hat{p}_i = p_i + e_i$, where $e_i$ is the estimation error.

    For a source with an alphabet of $M$ symbols, the average number of bits per codeword for the two cases will be

    $L = \sum_{i=1}^{M} p_i l_i$ and $\hat{L} = \sum_{i=1}^{M} \hat{p}_i \hat{l}_i$

    $\Delta L = \hat{L} - L = \sum_{i=1}^{M} p_i (\hat{l}_i - l_i) + \sum_{i=1}^{M} e_i \hat{l}_i$

    Practically $\hat{l}_i \approx l_i$, so

    $\Delta L \approx \sum_{i=1}^{M} e_i l_i$

    Letting the mean squared error of the estimates $e_i$ be given and using a Lagrange multiplier, $\Delta L$ can be related to the codeword lengths $l_i$ and $M$.