1 X-Media Audio, Image and Video Coding Laurent Duval (IFP Energies nouvelles) Eric Debes (Thalès) Philippe Morosini (Supélec) Supports Moodle/NTNoe http://www.laurent-duval.eu/lcd-lecture-supelec-xmedia.html


Post on 30-Jul-2020


  • 1

    X-Media: Audio, Image and Video Coding

    Laurent Duval (IFP Energies nouvelles), Eric Debes (Thalès),

    Philippe Morosini (Supélec)

    Course materials: Moodle/NTNoe

    http://www.laurent-duval.eu/lcd-lecture-supelec-xmedia.html

  • 2

    General information

  • 3 Contents

    • Introduction to X-Media coding: generic principles

    • Audio: data recording/sampling, physiology

    • Data and audio coding: LZ*, MPEG Layer 3

    • Image coding: JPEG vs. JPEG 2000

    • Video coding: MPEG formats, H.264

    • Bonuses, exercises

  • 4 Initial motivation

  • 5 Initial motivation

    • Coding/compression as a DSP engineer discipline: information reduction, standards & adaptation, complexity, integrity & security issues, interaction with the SP pipeline

    • Composite domain, yet ubiquitous: sampling (Nyquist-Shannon, filter banks), statistical SP (KLT, decorrelation), transforms (Fourier, wavelets), classification (quantization, K-means), functional spaces (bases & frames), information theory (entropy), error measurements, modelling

  • 6 Initial motivation

    Digital Image Processing (overview diagram)

    • Digital image characteristics: spatial (gray-level histogram), spectral (DFT, DCT)
    • Pre-processing: enhancement (point processing, masking, filtering), restoration (degradation models, inverse filtering, Wiener filtering)
    • Compression: information theory; lossless (LZW, gif); lossy (transform-based, jpeg)
    • Segmentation: edge detection
    • Description: shape descriptors, texture, morphology

  • 7 Initial motivation

    • Exemplar: underlines the importance of specific steps (related to other lectures) to make things work
      - which algorithm for which task?
      - process text? sound? images? video?
      - "the toolbox quote" (Juran)

    • Evolving: steady evolution of standards and tools, overview of future directions (and tools)

    • Central to the SP task
      - what is really important in my data?
      - (stored) info. overflow + degradations (decon.)

  • 8

    Principles

  • 9 Principles

    • What is data (image) compression?
      - Data compression is the art and science of representing information in a compact form.
      - Data is a sequence of symbols taken from a discrete alphabet.
      - Text: sequence of characters/bytes (0.5 D)
      - Sound: collection of arrays of values representing intensities (1 D to 1.5 D)
      - Still image data: collection of arrays (one for each color plane) of values representing the intensity (color) of the point at the corresponding spatial location (pixel) (2 D or 2.5 D)
      - Video: sequence of still images (3 D)
      - Next: 3D-TV (audio, video, ambiance, smell?)

  • 10

    Flurry of formats ?

  • 11 Examples of data compression extensions

    • Some (standard) extensions: GIF, RAR, ZIP, BZ2, MP3, MPEG4, AVC, PNG, J2K, BH, AVI, R(A)M, LHA, OGG, AAC, HE-AAC, MPC, OGM, APE, TIFF, JP2, JPEG-XR (WMP/HD Photo), WAV, PAK, FLV, FLAC, PDF, MAT, BMP, Z, GZ, MJPEG, 7z, TTA, DjVu, WebP

    http://en.wikipedia.org/wiki/List_of_archive_formats

    • Targeted for what kind of data?

  • 12

    Why?

  • 13 Why do we need image compression?

    Still image
    • One A4 page at 600 dpi is > 100 MB.
    • One color image from a digital camera generates 10-30 MB.
    • A scanned 3”×7” photograph at 300 dpi is 30 MB.

    HDTV video
    • 720×1280 pixels/frame × 60 frames/s × 24 bits/pixel ≈ 1.3 Gb/s
    • HDTV bandwidth: 20 Mb/s
    • Objective: about 70× reduction
    • Equivalent: about 0.35 bits/pixel

    • Infotrends (2008/01): total number of digital pictures since the beginning ~ 180 billion; should grow to 347 billion in 2012.
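The HDTV and A4 figures can be checked with a few lines of arithmetic. This is only a sketch: the 24 bits/pixel colour depth and the A4 size in inches (8.27 × 11.69) are assumptions that the slide does not state.

```python
def raw_bits(width_px, height_px, bits_per_pixel):
    """Uncompressed size of one frame, in bits."""
    return width_px * height_px * bits_per_pixel

# A4 page scanned at 600 dpi, assuming 24-bit colour
a4_w, a4_h = round(8.27 * 600), round(11.69 * 600)
a4_bytes = raw_bits(a4_w, a4_h, 24) / 8

# HDTV: 1280x720 frames, 60 frames/s, assuming 24 bits/pixel
hdtv_bps = raw_bits(1280, 720, 24) * 60
channel_bps = 20e6                          # HDTV bandwidth from the slide
reduction = hdtv_bps / channel_bps          # required compression factor
bits_per_pixel = channel_bps / (1280 * 720 * 60)

print(f"A4 page: {a4_bytes / 1e6:.0f} MB")
print(f"HDTV raw rate: {hdtv_bps / 1e9:.2f} Gb/s, reduction x{reduction:.0f}, "
      f"{bits_per_pixel:.2f} bits/pixel")
```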

  • 14 Why do we need image compression?

    1) Storage
    2) Transmission (cable, satellite, wifi)
    3) Data access

    1990-2000: disk capacities 100 MB → 20 GB (200 times!) (3-4 TB), but seek time 15 ms → 10 ms (3-4 ms) and transfer rate 1 MB/s → 2 MB/s (much better for SSD).

    Compression improves overall response time in some applications.

  • 15 Sources of images

    • Image scanner
    • Digital camera
    • Video camera
    • Ultrasound (US), computed tomography (CT), magnetic resonance imaging (MRI), digital X-ray (XR), infrared
    • Remote sensing, seismics, satellite, radar, SAR

  • 16 Data types: why do we need specific algorithms?

    Compression families: image compression, universal compression

    Data types: video images, gray-scale images, true colour images, binary images, colour palette images, textual data

  • 17 Binary image: 1 bit/pel

  • 18 Grayscale image: 8 bits/pel

    Intensity = 0-255

  • 19 Parameters of digital images

    Quantization: 6 bits (64 gray levels), 4 bits (16 gray levels), 2 bits (4 gray levels)

    Resolution: 384×256, 192×128, 96×64, 48×32

  • 20 True color image: 3×8 bits/pel

  • 21 Goals of compression

    • Balance redundancy and irrelevancy
      - sources of redundancy: temporal, spatial, color, other?
      - sources of irrelevancy: perceptually unimportant information
      - issues: redundancy/irrelevancy (examples?), lossless/lossy choices

  • 22 Lossy vs. lossless compression

    Lossless compression: reversible, information preserving (text compression algorithms, binary images, palette images)

    Lossy compression: irreversible (grayscale, color, video)

    Near-lossless compression: medical imaging, remote sensing.

    1) Why do we need lossy compression?
    2) When can we use lossy compression?

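  • 23 Rate measures

    Bitrate (bits/pel), with C the size of the compressed file and N the number of pixels in the image:

    $$\text{bitrate} = \frac{C}{N} = \frac{\text{size of the compressed file}}{\text{pixels in the image}}$$

    Compression ratio, with k the number of bits per pixel of the original image:

    $$\frac{\text{size of the original file}}{\text{size of the compressed file}} = \frac{N \cdot k}{C}$$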
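A minimal sketch of the two rate measures. The function names and the 32 KiB file size are illustrative assumptions; sizes are passed in bytes and converted to bits inside.

```python
def bitrate(compressed_bytes, n_pixels):
    """Bits per pixel: C / N, with C expressed in bits."""
    return 8 * compressed_bytes / n_pixels

def compression_ratio(n_pixels, bits_per_pixel, compressed_bytes):
    """Original size over compressed size: N * k / C."""
    return n_pixels * bits_per_pixel / (8 * compressed_bytes)

n = 512 * 512          # 512 x 512 image, 8 bits/pel
c = 32 * 1024          # hypothetical compressed size: 32 KiB
print(bitrate(c, n), compression_ratio(n, 8, c))
```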

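  • 24 Distortion measures

    Mean absolute error (MAE):

    $$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |y_i - x_i|$$

    Mean square error (MSE):

    $$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (y_i - x_i)^2$$

    Signal-to-noise ratio (SNR), in decibels, with σ² the variance of the signal:

    $$\mathrm{SNR} = 10 \log_{10}\left[\sigma^2 / \mathrm{MSE}\right]$$

    Peak signal-to-noise ratio (PSNR), in decibels:

    $$\mathrm{PSNR} = 10 \log_{10}\left[A^2 / \mathrm{MSE}\right]$$

    A is the peak amplitude of the signal: A = 2^8 - 1 = 255 for an 8-bit signal.

    Other measures: l_p norms, SSIM, MOS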
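The distortion measures translate directly into code. The sketch below uses plain Python lists; the sample values are made up for illustration, and σ² is taken as the variance of the original signal x (an assumption about the slide's notation).

```python
import math

def mae(x, y):
    return sum(abs(b - a) for a, b in zip(x, y)) / len(x)

def mse(x, y):
    return sum((b - a) ** 2 for a, b in zip(x, y)) / len(x)

def snr_db(x, y):
    mean = sum(x) / len(x)
    variance = sum((a - mean) ** 2 for a in x) / len(x)   # sigma^2 of the original
    return 10 * math.log10(variance / mse(x, y))

def psnr_db(x, y, peak=255):
    return 10 * math.log10(peak ** 2 / mse(x, y))

x = [0, 50, 100, 150, 200, 255]        # original samples (illustrative)
y = [1, 49, 101, 149, 201, 254]        # decoded samples, each off by one
print(mae(x, y), mse(x, y), round(psnr_db(x, y), 2))
```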

  • 25 Other issues

    • Coder and decoder computational complexity
    • Memory requirements
    • Fixed rate or variable rate
    • Error resilience
    • Symmetric or asymmetric algorithms
    • Decompress at multiple resolutions
    • Decompress at various bit rates
    • Standard or proprietary

    Reduce redundancy/irrelevancy at "each" step, considering performance and quality

  • 26 What is an image?

  • 27 What is an image?

  • 28 What is an image?

  • 29 Ultimate storage

    • Image compression: why?
      - storage, transmission, database indexing
      - processing (denoising, scaling, rotation)

    • How come?
      - a 512 × 512 pix, 8-bit image → 2,097,152 bits
      - far less than 10^100 atoms in the entire Universe
      - 10^15 directions, magn., depth and exposure params
      - Tautavel Man takes 1000 pix/s (since 450,000 BC)
      - a collection of 1.42·10^176 pix

    • Typical compressed image size?
      - number of bits needed: 17; 586; 10,253; 12,087,300; more?

  • 30 What is a compression system?

    Compression model: f(x,y) → Transform → Quantize → Encode
    • Source
    • Channel

  • 31 What is a compression system?

    Adaptive compression (block diagram), from data to bitstream: pre-processing, blocking, transform, processing, classifier, rate allocator, reduction, ordering, entropy coding, quality; embedded coding.

  • 32 Compression scheme (1)

    Modeling: data = model (parameters) + error

  • 33Compression scheme (2)

    reduction

    xO = 1 3 7 2 -5 -1 0 0 0

    xR = 0 2 6 2 -6 0 0 0 0

    xC = 0 1 3 1 -3 4

    coding

    transform

  • 34 Preprocessing

    [Adaptive compression block diagram]

    Pre-processing stage: image analysis, filtering/enhancement, RGB to YUV, extension

  • 35 Preprocessing

    • Image analysis

    • Filtering/enhancement

    • RGB to YUV

    • Image extension

  • 36

    Red Green Blue

    RGB color space

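  • 37 RGB → YUV

    R, G, B: red, green, blue
    Y: the luminance
    U, V: the chrominance components

    Most of the information is concentrated in the Y component, while the information content of U and V is lower.

    $$Y = 0.3\,R + 0.6\,G + 0.1\,B, \qquad U = B - Y, \qquad V = R - Y$$

    $$\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.16875 & -0.33126 & 0.5 \\ 0.5 & -0.41869 & -0.08131 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$

    $$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.0 & 0 & 1.402 \\ 1.0 & -0.34413 & -0.71414 \\ 1.0 & 1.772 & 0 \end{bmatrix} \begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix}$$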
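The forward and inverse matrices can be tried on a pixel as follows. This is a sketch using the coefficients from the slide; the function names are illustrative, and because the coefficients are rounded to five decimals the round trip is only approximately exact.

```python
FORWARD = [(0.299, 0.587, 0.114),
           (-0.16875, -0.33126, 0.5),
           (0.5, -0.41869, -0.08131)]
INVERSE = [(1.0, 0.0, 1.402),
           (1.0, -0.34413, -0.71414),
           (1.0, 1.772, 0.0)]

def apply_matrix(matrix, vector):
    """3x3 matrix times 3-vector."""
    return tuple(sum(a * b for a, b in zip(row, vector)) for row in matrix)

def rgb_to_ycbcr(rgb):
    return apply_matrix(FORWARD, rgb)

def ycbcr_to_rgb(ycc):
    return apply_matrix(INVERSE, ycc)

gray = rgb_to_ycbcr((128, 128, 128))      # a gray pixel has (almost) no chrominance
back = ycbcr_to_rgb(rgb_to_ycbcr((200, 100, 50)))
print(gray, back)
```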

  • 38

    Y U V

    YUV color space

  • 39 Blocking

    [Adaptive compression block diagram]

    Blocking stage: JPEG blocking, irregular tiling, segmentation

  • 40 Blocking

    • Goals:

    • To exploit image non-stationarities

    • To reduce the computational cost

    • To exploit inter-block dependencies (2D-3D)

    • To select objects of interest (moving)

  • 41 1D Signal Blocking

    • How do we perform decomposition?
      - block by block
      - with overlap

  • 42 2D Image Blocking

    JPEG blocking, segmentation, irregular tiling

  • 43 Transform Coding

    [Adaptive compression block diagram]

    Transform stage: Karhunen-Loève; Fourier, DCT; Walsh, Hartley; wavelet (packets)

  • 44 Transform Coding

    • Goals
      - Efficient representation I(u,v) of an image i(x,y)
      - Data decorrelation (KLT optimality?)

    • Properties
      - Linear transforms (matrix op.)
      - Orthogonality
      - Fast algorithms for compression/decompression
      - Laplace-Gauss distribution (var. length coding)

  • 45 Karhunen-Loève (Hotelling) Transform

    • x: N×1 vector
    • m_x = E{x}: mean vector
    • C_x = E{(x − m_x)(x − m_x)^T}: covariance matrix
    • λ_i, i = 1, 2, ..., N: eigenvalues of C_x
    • e_i: eigenvectors corresponding to λ_i
    • A: matrix whose rows are the eigenvectors e_i

    Hotelling transform of x:

    $$y = A\,(x - m_x)$$
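The definitions above can be exercised numerically. The following is a sketch assuming NumPy is available; the expectation E{·} is replaced by a sample average, and the synthetic two-component data are made up for illustration.

```python
import numpy as np

def klt(samples):
    """samples: array of shape (n_samples, N). Returns A, m_x, y and the eigenvalues."""
    m_x = samples.mean(axis=0)                      # mean vector
    centered = samples - m_x
    c_x = centered.T @ centered / len(samples)      # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(c_x)          # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]               # largest eigenvalue first
    A = eigvecs[:, order].T                         # rows are the eigenvectors e_i
    y = centered @ A.T                              # y = A (x - m_x) for every sample
    return A, m_x, y, eigvals[order]

rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 1))                   # shared component: strong correlation
samples = np.hstack([base + 0.1 * rng.normal(size=(1000, 1)),
                     base + 0.1 * rng.normal(size=(1000, 1))])
A, m_x, y, lam = klt(samples)
c_y = y.T @ y / len(y)                              # covariance after the transform
print(np.round(c_y, 6))                             # diagonal: components are decorrelated
```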

  • 46 Singular Value Decomposition

  • 47 Transform Coding

    • An optimality result:
      - Simplest stationary source model AR(1): DCT ~ KLT

    • Troubles:
      - KLT calculations + overhead
      - After a transformation, correlation can be made very small, but coefficients are far from being independent!
      - Transform affects coding

  • 48 Transform Coding

    • Choices
      - Good decorrelation properties
      - Low complexity, HW implementation (DCT, Fourier)
      - Side effects (extension, zero or linear phase)

    • A great deal of nice transforms
      - Walsh-Hadamard system (with fast transforms)
      - Wavelet (Haar 1910, Mallat, Daubechies 1988)
      - Lapped transforms (Malvar, Meyer)

  • 49 Transform Coding

  • 50 Transform Coding: performance

  • 51 Performance measure: coding gain

  • 52 DCT and size

  • 53 Transform coding: optimization

  • 54 2D Hadamard-16

  • 55 2D DCT-16

  • 56 Standard transforms

    • Limitations
      - fixed vector size
      - constrained shape (orthogonality)
      - data-driven adaptation
      - robustness to shifts and rotations
      - higher dimension generalizations
      - analogy with vision aspects
      - needed for compression: inverses/redundancy/sparsity
      - pre- and post-processing
      - coder complexity

  • 57 Transform coding

    • A common waveform

    • A common representation

  • 58 Transform coding

    • A less common waveform

    • A less common representation

  • 59 Transform coding

    • A not so common waveform

    • A not so common representation

  • 60 Novel transforms

    [Figure: FB-II and wavelet decompositions, from low frequencies to high frequencies]

  • 61 Transform Coding

  • 62 Transform Coding

  • 63 Processing

    [Adaptive compression block diagram]

    Processing stage: time/frequency filtering, image analysis, texture analysis

  • 64 Classifier

    [Adaptive compression block diagram]

    Classifier stage: spectrum allocation, texture extraction, segmentation

  • 65 Texture Synthesis

  • 66 Bit Reduction

    [Adaptive compression block diagram]

    Reduction stage: thresholding, subsampling, scalar quantization, vector quantization, iterations (fractals)

  • 67 Bit Reduction

    • The lossy stage
      - Human eyes see a limited range of tones/frequencies
      - Real-world pictures are already imperfect

    • Methods
      - Scalar quantization: uniform, logarithmic, optimal (Lloyd-Max algorithm)
      - Vector quantization: Voronoï diagrams, nearest neighbour

    • Adaptivity
      - Pre-stored tables
      - Training-set-based tables
      - On-the-fly quantization estimation

  • 68 Quantization

    [Figure panels: 8 bits; last 5 bits; last 4 bits]

  • 69 Quantization

    [Figure panels: last 3 bits; last 2 bits; last bit]

  • 70 Quantization

    [Figure labels: 8, 3, 1, 4, 2]

  • 71 Quantization

  • 72 Quantization (non-uniform)

  • 73 Quantization (adaptive)

  • 74 Quantization (vector)

    • Outline for images

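  • 75 Quantization (vector)

    [Diagram: each image vector X_j is compared with the codebook and replaced by the index k of the closest matching code vector.]

    Image vectors: X_j

    Codebook code vectors: V_i, 1 ≤ i ≤ L (V_1, V_2, ..., V_k, ..., V_L)

    Mean square error (MSE) (Euclidean distance):

    $$d(X_j, V_k) = \frac{1}{n}\sum_{i=1}^{n} \left(X_j(i) - V_k(i)\right)^2$$

    Weighted MSE:

    $$d(X_j, V_k) = \frac{1}{n}\sum_{i=1}^{n} W_i \left(X_j(i) - V_k(i)\right)^2$$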
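A sketch of the nearest-codevector search with the MSE distance. The 4-sample vectors and the toy codebook are invented for illustration; a real codebook would be trained (for instance with the LBG algorithm).

```python
def mse_distance(x, v):
    """d(X, V) = (1/n) * sum_i (X(i) - V(i))^2."""
    return sum((xi - vi) ** 2 for xi, vi in zip(x, v)) / len(x)

def weighted_mse_distance(x, v, w):
    """Same distance with a weight W_i on each component."""
    return sum(wi * (xi - vi) ** 2 for xi, vi, wi in zip(x, v, w)) / len(x)

def encode(vectors, codebook):
    """Replace every vector by the index of its closest code vector."""
    return [min(range(len(codebook)), key=lambda k: mse_distance(x, codebook[k]))
            for x in vectors]

def decode(indices, codebook):
    return [codebook[k] for k in indices]

# Hypothetical codebook: flat dark, flat mid-gray, flat bright, vertical edge
codebook = [(0, 0, 0, 0), (128, 128, 128, 128), (255, 255, 255, 255), (0, 255, 0, 255)]
blocks = [(10, 5, 0, 12), (250, 240, 255, 251), (20, 230, 10, 240), (120, 130, 125, 140)]
indices = encode(blocks, codebook)
print(indices)
```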

  • 76 Quantization (vector)

    • LBG algorithm

  • 77 Quantization (vector)

    • Simple codebooks (parrots)

    • Codebook for a specific feature, e.g. edges, smooth areas, etc.

    • Codebooks could be of different sizes

  • 78 Ordering

    [Adaptive compression block diagram]

    Ordering stage: raster scan; zig-zag, Hilbert scan; zerotrees; object-based coding

  • 79 Ordering

  • 80 Tree-coding ordering

  • 81 Ordering

  • 82 Wavelet/Blocking equivalence

  • 83 Entropy Coding

    [Adaptive compression block diagram]

    Entropy coding stage: RLE, Huffman; arithmetic, LZ*; form dictionaries

  • 84 Entropy Coding

    After reduction:

    • Lower coefficient variance
    • A lot of coefficients zeroed out

  • 85 Arithmetic Coding

    Model for the message ANTECEDENT (#: symbol count; Huff.: Huffman code length in bits; Low/Up: bounds of the arithmetic coding interval):

    Symbol  #  Huff.  Low  Up
    A       1  4      0    0.1
    C       1  5      0.1  0.2
    D       1  5      0.2  0.3
    E       3  1      0.3  0.6
    N       2  2      0.6  0.8
    T       2  3      0.8  1

    Interval narrowing for ANTECEDENT:

    Incoming  Low           Up
    Start     0             1
    A         0             0.1
    N         0.06          0.08
    T         0.076         0.08
    E         0.0772        0.0784
    C         0.07732       0.07744
    E         0.077356      0.077392
    D         0.0773632     0.0773668
    E         0.07736428    0.07736536
    N         0.077364928   0.077365144
    T         0.0773651008  0.077365144
    Codeword  0.07736511

    Final interval width 4.32·10^-8, i.e. ≈ 24.46 bits, against 27 bits with Huffman coding.

    AAAAAAAAA END: 7 against 11 bits
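The interval narrowing for ANTECEDENT can be reproduced exactly with rational arithmetic. This sketch only computes the interval; it does not emit a bitstream, and a practical coder would use finite-precision integer arithmetic with renormalization instead.

```python
import math
from fractions import Fraction

# Cumulative intervals [low, up) taken from the symbol table
MODEL = {s: (Fraction(lo), Fraction(up)) for s, lo, up in [
    ("A", "0", "0.1"), ("C", "0.1", "0.2"), ("D", "0.2", "0.3"),
    ("E", "0.3", "0.6"), ("N", "0.6", "0.8"), ("T", "0.8", "1")]}

def narrow(message, model):
    """Return the final [low, high) interval and the step-by-step trace."""
    low, high = Fraction(0), Fraction(1)
    trace = []
    for symbol in message:
        width = high - low
        s_low, s_high = model[symbol]
        low, high = low + width * s_low, low + width * s_high
        trace.append((symbol, low, high))
    return low, high, trace

low, high, trace = narrow("ANTECEDENT", MODEL)
bits = -math.log2(float(high - low))       # ideal code length for this message
for symbol, lo, hi in trace:
    print(symbol, float(lo), float(hi))
print("width", float(high - low), "bits", round(bits, 2))
```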

  • 86 Rate allocation

    [Adaptive compression block diagram]

    Rate allocator: exact bit-rate, progressive coding, distortion matching

  • 87 Quality measure

    [Adaptive compression block diagram]

    Quality: objective measures (SNR), subjective measures, HVS model

  • 88 Embedded coding

    [Adaptive compression block diagram, embedded coding]

  • 89 Embedded quantization

    Bit plane   Coefficient bits
    Sign        s s s s s s s s
    4 (MSB)     1 1 1 0 0 0 0
    3           x x x 1 1 0 0
    2           x x x x x 0 0
    1           x x x x x 1 1
    0 (LSB)     x x x x x x x

  • 90 Ordering

  • 91 Wavelet/Blocking equivalence

  • 92

    JPEG

  • 93 JPEG Principles

    Encoder: 8×8 DCT → Quantizer → Coefficients-to-symbols map → Entropy coder

  • 94 JPEG Principles

    Input image, size = 512 × 512 × 8 bits

  • 95 JPEG Principles

  • 96

    69 71 74 76 89 106 111 122

    59 70 61 61 68 76 88 94

    82 70 77 67 65 63 57 70

    97 99 87 83 72 72 68 63

    91 105 90 95 85 84 79 75

    92 110 101 106 100 94 87 93

    89 113 115 124 113 105 100 110

    104 110 124 125 107 95 117 116

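    $$X[u,v] = \frac{C[u]\,C[v]}{4} \sum_{m=0}^{7}\sum_{n=0}^{7} x[m,n]\, \cos\frac{(2m+1)u\pi}{16}\, \cos\frac{(2n+1)v\pi}{16}$$

    $$C[u] = \begin{cases} 1/\sqrt{2}, & u = 0 \\ 1, & 1 \le u \le 7 \end{cases}$$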

    717.6 0.2 0.4 -19.8 -2.1 -6.2 -5.7 -7.6

    -99.0 -35.8 27.4 19.4 -2.6 -3.8 9.0 2.7

    51.8 -60.8 3.9 -11.8 1.9 4.1 1.0 6.4

    30.0 -25.1 -6.7 6.2 -4.4 -10.7 -4.2 -8.0

    22.6 2.7 4.9 3.4 -3.6 8.7 -2.7 0.9

    15.6 4.9 -7.0 1.1 2.3 -2.2 6.6 -1.7

    0.0 5.9 2.3 0.5 5.8 3.1 8.0 4.8

    -0.7 -2.3 -5.2 -1.0 3.6 -0.5 5.1 -0.1

    Step 1: DCT

    JPEG Principles
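The 8×8 DCT formula can be evaluated directly on the pixel block of this slide. The sketch below is a plain quadruple loop, which is far slower than the fast algorithms used in practice; u is taken as the row frequency and v as the column frequency, and no level shift is applied, which matches the DC value 717.6 in the coefficient table (the block sums to 5741, and 5741/8 ≈ 717.6).

```python
import math

BLOCK = [
    [69, 71, 74, 76, 89, 106, 111, 122],
    [59, 70, 61, 61, 68, 76, 88, 94],
    [82, 70, 77, 67, 65, 63, 57, 70],
    [97, 99, 87, 83, 72, 72, 68, 63],
    [91, 105, 90, 95, 85, 84, 79, 75],
    [92, 110, 101, 106, 100, 94, 87, 93],
    [89, 113, 115, 124, 113, 105, 100, 110],
    [104, 110, 124, 125, 107, 95, 117, 116],
]

def c(u):
    return 1 / math.sqrt(2) if u == 0 else 1.0

def dct8x8(x):
    """Direct evaluation of the 2D DCT of an 8x8 block."""
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for m in range(8):
                for n in range(8):
                    total += (x[m][n]
                              * math.cos((2 * m + 1) * u * math.pi / 16)
                              * math.cos((2 * n + 1) * v * math.pi / 16))
            out[u][v] = c(u) * c(v) / 4 * total
    return out

X = dct8x8(BLOCK)
energy_in = sum(p * p for row in BLOCK for p in row)
energy_out = sum(q * q for row in X for q in row)   # equal: the transform is orthonormal
print(round(X[0][0], 1))
```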

  • 97JPEG Principles (Transform : DCT8)

  • 98JPEG Principles

    717.6 0.2 0.4 -19.8 -2.1 -6.2 -5.7 -7.6

    -99.0 -35.8 27.4 19.4 -2.6 -3.8 9.0 2.7

    51.8 -60.8 3.9 -11.8 1.9 4.1 1.0 6.4

    30.0 -25.1 -6.7 6.2 -4.4 -10.7 -4.2 -8.0

    22.6 2.7 4.9 3.4 -3.6 8.7 -2.7 0.9

    15.6 4.9 -7.0 1.1 2.3 -2.2 6.6 -1.7

    0.0 5.9 2.3 0.5 5.8 3.1 8.0 4.8

    -0.7 -2.3 -5.2 -1.0 3.6 -0.5 5.1 -0.1

    45 0 0 -1 0 0 0 0

    -8 -3 2 1 0 0 0 0

    4 -5 0 0 0 0 0 0

    2 -1 0 0 0 0 0 0

    1 0 0 0 0 0 0 0

    1 0 0 0 0 0 0 0

    0 0 0 0 0 0 0 0

    0 0 0 0 0 0 0 0

    16 11 10 16 24 40 51 61

    12 12 14 19 26 58 60 55

    14 13 16 24 40 57 69 56

    14 17 22 29 51 87 80 62

    18 22 37 56 68 109 103 77

    24 35 55 64 81 104 113 92

    49 64 78 87 103 121 120 101

    72 92 95 98 112 100 103 99

    ÷ Q

    Step 2: Quantization
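Step 2 is an element-wise division by Q followed by rounding to the nearest integer. A sketch using the coefficient and quantization tables of this slide (Python's round() sends exact halves to the nearest even integer, but no exact half occurs in this block):

```python
X = [
    [717.6, 0.2, 0.4, -19.8, -2.1, -6.2, -5.7, -7.6],
    [-99.0, -35.8, 27.4, 19.4, -2.6, -3.8, 9.0, 2.7],
    [51.8, -60.8, 3.9, -11.8, 1.9, 4.1, 1.0, 6.4],
    [30.0, -25.1, -6.7, 6.2, -4.4, -10.7, -4.2, -8.0],
    [22.6, 2.7, 4.9, 3.4, -3.6, 8.7, -2.7, 0.9],
    [15.6, 4.9, -7.0, 1.1, 2.3, -2.2, 6.6, -1.7],
    [0.0, 5.9, 2.3, 0.5, 5.8, 3.1, 8.0, 4.8],
    [-0.7, -2.3, -5.2, -1.0, 3.6, -0.5, 5.1, -0.1],
]
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, table):
    """Divide each coefficient by its quantization step and round."""
    return [[round(x / q) for x, q in zip(x_row, q_row)]
            for x_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply the integer levels back by the steps."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

quantized = quantize(X, Q)
for row in quantized:
    print(row)
```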

  • 99

    45 0 0 -1 0 0 0 0

    -8 -3 2 1 0 0 0 0

    4 -5 0 0 0 0 0 0

    2 -1 0 0 0 0 0 0

    1 0 0 0 0 0 0 0

    1 0 0 0 0 0 0 0

    0 0 0 0 0 0 0 0

    0 0 0 0 0 0 0 0

    Step 3: Coefficient-to-symbol mapping

    Input: zigzag scan procedure

    Result = 45, 0, -8, 4, -3, 0, -1, 2, -5, 2, 1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 1, then 0 repeated 43 times

    Symbols are defined as [run of zeros, nonzero terminating value].

    Step 4: Entropy coding
    • Symbols are encoded using mostly Huffman coding.
    • Huffman coding is a method of variable-length coding in which shorter codewords are assigned to the more frequently occurring symbols.

    JPEG Principles
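The zigzag scan and the [run of zeros, nonzero value] mapping can be sketched as follows. This is a simplification: the "EOB" marker for the trailing zeros is an illustrative choice, and the real JPEG symbol format (run/size pairs, amplitude bits, ZRL codes, separate DC coding) is not reproduced.

```python
QUANTIZED = [
    [45, 0, 0, -1, 0, 0, 0, 0],
    [-8, -3, 2, 1, 0, 0, 0, 0],
    [4, -5, 0, 0, 0, 0, 0, 0],
    [2, -1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

def zigzag_indices(n=8):
    """(row, col) pairs ordered along anti-diagonals, alternating direction."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda ij: (ij[0] + ij[1],
                                  ij[0] if (ij[0] + ij[1]) % 2 else -ij[0]))

def zigzag(block):
    return [block[i][j] for i, j in zigzag_indices(len(block))]

def run_length_symbols(ac):
    """Map AC coefficients to (run of zeros, nonzero value) pairs plus end-of-block."""
    symbols, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    if run:
        symbols.append("EOB")       # all remaining coefficients are zero
    return symbols

scan = zigzag(QUANTIZED)
symbols = run_length_symbols(scan[1:])   # the DC coefficient scan[0] is handled separately
print(scan[:21], scan[21:].count(0))
print(symbols)
```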

  • 100 Classical orthogonal transforms

    [f]: an N×N image

    $$[F] = V\,[f]\,H^{*}, \qquad V^{-1} = V^{*}, \qquad H^{-1} = H^{*}$$

    $$[f] = V^{*}\,[F]\,H$$

    $$\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} |f(m,n)|^2 = \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} |F(u,v)|^2$$

  • 101 One JPEG block

  • 102 JPEG compression on several blocks

  • 103 Blocking Artifacts

    [Figure: blocking artifact (DCT)]

  • 104 Examples using different quality factors to scale matrix Q

    • Original image: 263224 bytes
    • 20922 bytes: 13× compression
    • 18032 bytes: 15× compression
    • 15159 bytes: 17× compression
    • 11956 bytes: 22× compression
    • 5728 bytes: 46× compression

  • 105 JPEG and DCT: motion

    Another DCT application: video compression

    [Block diagram with the following elements: input frame, prediction, prediction error, 8×8 DCT, quantizer, entropy coder, inverse quantizer, 8×8 IDCT, delayed frame memory, motion estimation, motion compensation, motion vectors, entropy coder. Switch: intraframe = open, interframe = closed.]

  • 106 JPEG and DCT: motion

  • 107 Classical orthogonal transforms

  • 108 Principles of overlapping

    [Figure: original image (8 × 8 pixels, 32 × 32 superblock) and transformed image (8 × 8 coefficients)]

  • 109 Lapped transforms

  • 110 Blocking and Ringing Artifacts (Nagai 2001)

    [Figure: ringing artifact (wavelet, GenLOT)]

  • 111 How to reduce the ringing?

    [Figure: input, quantization error and output for the basis of GenLOT and the basis of GULLOT; high-frequency components marked]

  • 112 GULLOT

    1-D base example

  • 113 Results

    Coded Yogi image at 0.3bps: GULB (26.29 dB), GenLOT (26.06 dB)

  • 114 Wavelet/Blocking equivalence

  • 115

    Principles of wavelet image codingSPIHT/JPEG-2000

  • 116 SPIHT/JPEG 2000 principles

    1-level wavelet decomposition (2D DWT)

    [Diagram: the input image goes through row-wise operations (filters H1, low pass, and H2, high pass, each followed by decimation by 2), then column-wise operations (again H1 and H2, each followed by decimation by 2), producing the LL, HL, LH and HH components.]

    Filter H_i:

    $$y[n] = \sum_{k=0}^{L} x[n-k]\,h_i[k]$$

    Decimator (↓2): keep one out of two pixels.
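One decomposition level can be sketched with the filter-then-decimate structure described above. The Haar filters used for H1 and H2 are an assumption chosen for brevity (the slide does not name the filters), and the sub-band naming convention (first letter for the row-wise filter, second for the column-wise filter) is also an assumption.

```python
import math

A = 1 / math.sqrt(2)
H1 = [A, A]       # low-pass (Haar, assumed)
H2 = [A, -A]      # high-pass (Haar, assumed)

def filter_decimate(x, h):
    """y[n] = sum_k x[n-k] h[k], keeping one output sample out of two (odd n)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(1, len(x), 2)]

def filter_columns(block, h):
    cols = [filter_decimate(list(col), h) for col in zip(*block)]
    return [list(row) for row in zip(*cols)]

def dwt2_level(image):
    """One level of separable 2D DWT: rows first, then columns."""
    low_rows = [filter_decimate(row, H1) for row in image]
    high_rows = [filter_decimate(row, H2) for row in image]
    return {"LL": filter_columns(low_rows, H1), "LH": filter_columns(low_rows, H2),
            "HL": filter_columns(high_rows, H1), "HH": filter_columns(high_rows, H2)}

flat = dwt2_level([[10] * 4 for _ in range(4)])          # constant image
ramp_image = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
ramp = dwt2_level(ramp_image)
energy = sum(v * v for band in ramp.values() for row in band for v in row)
print(flat["LL"], flat["HH"], round(energy, 6))
```

For a constant image all the detail sub-bands vanish and only LL carries information, which is what makes the low-pass branch worth decomposing again at the next level.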

  • 117 SPIHT/JPEG 2000 principles

    Multi-level wavelet decomposition

    After one 2D-DWT: LL, HL1, LH1, HH1

    After a second 2D-DWT applied to LL: LL, HL2, LH2, HH2, HL1, LH1, HH1

  • 118

    Bitplanes and Self-Similarity Across Scales

    SPIHT/JPEG 2000 principles

  • 119 SPIHT/JPEG 2000 principles

    Spatial orientation trees

    Some definitions:
    • O(i,j): set of coordinates of all offspring of node (i,j).
    • D(i,j): set of coordinates of all descendants of node (i,j).
    • H: set of coordinates of all spatial orientation tree roots.
    • L(i,j): D(i,j) - O(i,j).

  • 120 SPIHT/JPEG 2000 principles

    Coding algorithm (SPIHT)

    • Three lists are defined:
      1. LIS: List of Insignificant Sets
         - Type A: entries are elements D(i,j)
         - Type B: entries are elements L(i,j)
      2. LIP: List of Insignificant Pixels
      3. LSP: List of Significant Pixels

    • Significance test:

    $$S_n(T) = \begin{cases} 1, & \max_{(i,j) \in T} |c_{i,j}| \ge 2^n \\ 0, & \text{otherwise} \end{cases}$$

    • Key ideas:
      - Ordered bit plane transmission.
      - Multi-pass zero-tree coding.
      - Exploitation of self-similarity across scales of the 2-D DWT.
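The significance test is a one-liner once the coefficients are in an array. In this sketch the 8×8 array is the coefficient matrix of the SPIHT example as read back from the garbled extraction, so treat its layout (row, column indexing) as an assumption; the function and variable names are illustrative.

```python
import math

COEFFS = [
    [63, -34, 49, 10, 7, 13, -12, 7],
    [-31, 23, 14, -13, 3, 4, 6, -1],
    [15, 14, 3, -12, 5, -7, 3, 9],
    [-9, -7, -14, 8, 4, -2, 3, 2],
    [-5, 9, -1, 47, 4, 6, -2, 2],
    [3, 0, -3, 2, 3, -2, 0, 4],
    [2, -3, 6, -4, 3, 6, 3, 6],
    [5, 11, 5, 6, 0, 3, -4, 4],
]

def significance(coeffs, coords, n):
    """S_n(T): 1 if any coefficient in the set T reaches 2^n in magnitude, else 0."""
    return 1 if max(abs(coeffs[i][j]) for i, j in coords) >= 2 ** n else 0

everything = [(i, j) for i in range(8) for j in range(8)]
largest = max(abs(COEFFS[i][j]) for i, j in everything)
n_max = int(math.floor(math.log2(largest)))     # first bit plane to transmit
significant = [(i, j) for i, j in everything if significance(COEFFS, [(i, j)], n_max)]
print(n_max, significant)
```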

  • 121 SPIHT – Example

    Wavelet coefficients:

     63 -34  49  10   7  13 -12   7
    -31  23  14 -13   3   4   6  -1
     15  14   3 -12   5  -7   3   9
     -9  -7 -14   8   4  -2   3   2
     -5   9  -1  47   4   6  -2   2
      3   0  -3   2   3  -2   0   4
      2  -3   6  -4   3   6   3   6
      5  11   5   6   0   3  -4   4

    480

    0-7(1,3)

    0-9(0,3)

    002(3,5)

    00-3(2,5)

    480

    147(3,4)

    00-1(2,4)

    114(1,2)

    015(0,2)

    1-31(0,1)

    0-34(1,0)

    023(1,1)

    00-7(1,3)

    00-9(0,3)

    0014(1,2)

    0015(0,2)

    1-31(0,1)

    00-13(3,1)

    0014(2,1)

    0010(3,0)

    480

    149(2,0)

    1-34(1,0)

    0023(1,1)

    00-31(0,1)

    -481

    1-34(1,0)

    163(0,0)

    [Table columns: coefficient coordinates, coefficient value, binary symbols, reconstruction value]

  • 122 Some Examples

    [Figure panels: original image; 96× compression; 48× compression; 22× compression; 16× compression]

  • 123

    JPEG-2000

  • 124 Introduction

    • Image compression has to:
      - Reduce storage and bandwidth requirements
      - Allow different extraction modes

    • JPEG-2000 provides:
      - Low bit-rate compression performance
      - Progressive transmission by quality, resolution, component, or spatial locality
      - Lossy and lossless compression
      - Random access to the bitstream
      - Region of Interest coding
      - Robustness to bit errors

  • 125 Codec Structure

  • 126 Source Image Model

    • One or several components in the image
    • Components can be at different resolutions (different sizes)

  • 127 Intercomponent Transform

    • Reduces the correlation between components
    • Maps image data from RGB to YCrCb
    • Advantages:
      - Improves coding efficiency
      - Allows visually relevant quantization
    • Two transforms:
      - Irreversible color transform (ICT)
      - Reversible color transform (RCT)

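  • 128 Reversible Component Transformation

    • Used for lossless or lossy coding

    • Advantages:
      - Reasonable color space
      - Ability to have lossless compression

    $$\begin{bmatrix} Y_r \\ V_r \\ U_r \end{bmatrix} = \begin{bmatrix} \left\lfloor \frac{R + 2G + B}{4} \right\rfloor \\ R - G \\ B - G \end{bmatrix}, \qquad \begin{bmatrix} G \\ R \\ B \end{bmatrix} = \begin{bmatrix} Y_r - \left\lfloor \frac{U_r + V_r}{4} \right\rfloor \\ V_r + G \\ U_r + G \end{bmatrix}$$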
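The reversibility of the RCT can be checked with integer arithmetic. A minimal sketch, with illustrative function names; Python's `//` is a floor division, which matches the floor brackets in the formulas even for negative U + V.

```python
def rct_forward(r, g, b):
    """Reversible color transform: integers in, integers out."""
    y = (r + 2 * g + b) // 4
    u = b - g
    v = r - g
    return y, u, v

def rct_inverse(y, u, v):
    g = y - (u + v) // 4
    r = v + g
    b = u + g
    return r, g, b

# Exhaustive check on a coarse grid of 8-bit values
values = range(0, 256, 15)
lossless = all(rct_inverse(*rct_forward(r, g, b)) == (r, g, b)
               for r in values for g in values for b in values)
print(rct_forward(200, 100, 50), lossless)
```

The round trip is exact because R + 2G + B = 4G + U + V, so the floor applied in the forward transform is undone by the same floor in the inverse.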

  • 129 Wavelet Transform (6)

    An example:

  • 130 Wavelet Transform (7)

    • 5/3 transform: reversible
      - Integer-to-integer transform
      - Can be used for both lossless and lossy coding

    • 9/7 transform: nonreversible
      - Real-to-real transform
      - Can only be used for lossy coding
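The integer-to-integer property of the 5/3 transform comes from its lifting implementation. Below is a 1-D sketch of the usual two lifting steps (predict, then update) with symmetric extension at the borders; it assumes an even-length integer signal, and the function names are illustrative.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform on an even-length integer list."""
    n, half = len(x), len(x) // 2
    d = []                                   # high-pass (detail) coefficients
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]   # symmetric extension
        d.append(x[2 * i + 1] - (x[2 * i] + right) // 2)
    s = []                                   # low-pass (smooth) coefficients
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]                  # symmetric extension
        s.append(x[2 * i] + (d_prev + d[i] + 2) // 4)
    return s, d

def inverse_53(s, d):
    """Undo the lifting steps in reverse order; exact for integers."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (d_prev + d[i] + 2) // 4
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (x[2 * i] + right) // 2
    return x

signal = [10, 12, 14, 13, 9, 8, 8, 7]
s, d = forward_53(signal)
print(s, d, inverse_53(s, d) == signal)
```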

  • 131 Quantization

    • A uniform scalar quantization with a dead-zone about the origin

    A zero output may be produced for larger input values, to avoid recording noise.
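A dead-zone quantizer can be sketched as sign(y)·floor(|y|/Δ), so that the zero bin is twice as wide as the others. The reconstruction offset of 0.5 (mid-point of the bin) is an assumption; decoders are free to choose another value.

```python
import math

def deadzone_quantize(y, step):
    """Uniform scalar quantizer with a dead zone of width 2*step around zero."""
    return int(math.copysign(math.floor(abs(y) / step), y))

def deadzone_dequantize(q, step, offset=0.5):
    """Reconstruct inside the bin; offset=0.5 (assumed) picks the mid-point."""
    if q == 0:
        return 0.0
    return math.copysign((abs(q) + offset) * step, q)

samples = [0.9, -0.9, 1.0, 2.7, -2.7]
levels = [deadzone_quantize(y, 1.0) for y in samples]
print(levels, [deadzone_dequantize(q, 1.0) for q in levels])
```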

  • 132 Progression

    • Different orderings of the packets in the code stream
    • 4 types of progression:
      - Resolution
      - Quality
      - Spatial location
      - Component
    • The progression type can be changed during coding

  • 133 Progression (2)

    • Progression by resolution

  • 134 Progression (3)

    • Progression by quality

  • 135 Region of Interest

    • Coding different regions of the image with different quality

    • Used when certain parts of the image are of higher importance

    • ROI coding:
      - General scaling-based method
      - MAXSHIFT method

  • 136 General Scaling-Based

    • Idea: scale (shift) coefficients such that the bits associated with the ROI are in higher bit-planes

    • Some bits of the ROI might be encoded together with non-ROI bits

  • 137 General Scaling-Based (2)

    • Steps:
      - Wavelet transform
      - ROI mask is derived
      - Quantization
      - Non-ROI coefficients are downscaled
      - Entropy coding

    • Scaling value and ROI coordinates are included

  • 138 MAXSHIFT Method

    • Scaling value S is chosen such that the minimum ROI coefficient is larger than the maximum non-ROI coefficient

    • Advantages:
      - Allows arbitrarily shaped ROIs
      - No ROI mask is needed

  • 139 Scalability

    • The ability to code more than one quality and/or resolution simultaneously

    • Two important types:
      - SNR scalability
      - Spatial or resolution scalability

    • Advantages:
      - No need to know the target bit rate/resolution
      - No need for multiple compressions
      - Resilience to transmission errors

  • 140 SNR Scalability

    • The bit stream can be decompressed at different quality levels (SNR)

    Decompressed image “bike” at (a) 0.125 b/p, (b) 0.25 b/p, (c) 0.5 b/p

  • 141 Spatial Scalability

    • The bit stream can be decompressed at different resolution levels

  • 142 Scalability (2)

    • Combination of spatial and SNR scalability
    • Changing the progression type

  • 143 JPEG-2000 vs. JPEG

    Compression at 0.25 b/p by means of (a) JPEG, (b) JPEG-2000

  • 144 JPEG-2000 vs. JPEG

    Compression at 0.2 b/p by means of (a) JPEG, (b) JPEG-2000

  • 145 Comparison of JPEG 2000 with JPEG

    • Much smaller files
    • Much better quality

    Figure: 0.08 bpp J2K image (8 KB); 0.1563 bpp JPEG image (16 KB)

  • 146 Illustration

    • Region of Interest (ROI) encoding

    Figure: raw image; 0.07 bpp J2K image with ROI; 0.07 bpp J2K image without ROI

  • 147 References

    • M.D. Adams, “The JPEG-2000 Still Image Compression Standard”, ISO/IEC JTC1/SC29/WG1 (ITU-T SG8), 2001

    • D. Taubman, E. Ordentlich, I. Ueno, “Embedded Block Coding”, Proc. Int. Conf. on Image Processing (ICIP 2000), Vol. II, pp. 33-36, 2000

    • M.W. Marcellin, M.J. Gormish, A. Bilgin, M.P. Boliek, “An Overview of JPEG-2000”, Proc. of IEEE Data Compression Conference, pp. 523-541, 2000

    • A. Skodras, C. Christopoulos, T. Ebrahimi, “The JPEG 2000 Still Image Compression Standard”, IEEE Signal Processing Magazine, pp. 36-60, September 2001

    • D.S. Taubman and M.W. Marcellin, “JPEG2000: Image Compression Fundamentals, Standards, and Practice”, Kluwer Academic Publishers, 2001

    • G. Cena, P. Montuschi, L. Ciminiera, A. Sanna, “A Q-Coder Algorithm with Carry Free Addition”, Proc. 13th IEEE Symposium on Computer Arithmetic, pp. 282-290, July 1997

    • S.Y. Choo, G. Chew, “JPEG 2000 and Wavelet Compression”, http://www-ise.stanford.edu/class/psych221/00/shuoyen/

  • 148 JPEG-2000 Parts
