A brief introduction to image/video coding standards and FGS for MPEG-4 卓傳育

A Brief Introduction to Image/Video Coding Standards and FGS for MPEG-4

Video Compression Standards
  ITU-T: International Telecommunication Union, Telecommunication Standardization Sector
  MPEG: Moving Picture Experts Group

International Telecommunication Union, Telecommunication Standardization Sector (ITU-T)
  CCITT H.261
    ITU-T Study Group 15
    Videophone and video conferencing
    1988-1990: p x 64 kbps (p = 1 to 30)
  ITU-T H.263
    PSTN and mobile networks: 10 to 24 kbps
    1994: H.263, H.263+
  ITU-T H.26L
    Merged into the JVT work as MPEG-4 Part 10

MPEG: Moving Picture Experts Group (Coding of Moving Video and Audio)
  MPEG-1: CD-I, for digital storage (1992)
  MPEG-2: adds TV and HDTV, for broadcast (1994)
  MPEG-3: HDTV, merged into MPEG-2
  MPEG-4: Coding of Audiovisual Objects (V.1: 1998; V.2: 1999; extensions ongoing)
  MPEG-7: Multimedia Description Interface (fall 2001), describing audiovisual material
  MPEG-21: Digital Multimedia Framework (first parts early 2002), "The Big Picture" and "The Glue"

Block-Based Coding
  Why divide into blocks?
  Image -> Blocks

H.261 Video Formats

Video format   Luminance (Y)               Chrominance (Cb, Cr)
               pixels/line   lines/frame   pixels/line   lines/frame
CIF            352           288           176           144
QCIF           176           144           88            72

[Figure: positions of Y pixels and Cb, Cr pixels relative to the block boundary]

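Arrangement of GOBs (groups of blocks) in an H.261 picture

CIF (352 x 288): 12 GOBs of 176 x 48 pixels, in two columns:
   1   2
   3   4
   5   6
   7   8
   9  10
  11  12

QCIF (176 x 144): 3 GOBs of 176 x 48 pixels, numbered 1, 3, 5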

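Arrangement of the data structure in H.261

QCIF picture (176 x 144): GOBs 1, 3, 5

GOB (Group of Blocks, 176 x 48): 33 macroblocks, numbered row by row:
   1   2   3   4   5   6   7   8   9  10  11
  12  13  14  15  16  17  18  19  20  21  22
  23  24  25  26  27  28  29  30  31  32  33

MB (Macroblock, 16 x 16): four 8 x 8 luminance blocks (Y1, Y2, Y3, Y4) plus one 8 x 8 U block and one 8 x 8 V block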

Transform coding

Encoder: image block -> T -> transform coefficients -> Q -> zigzag scan (2D -> 1D) -> entropy coding -> bitstream

Decoder: bitstream -> entropy decoding -> inverse zigzag scan (1D -> 2D) -> Q^-1 -> reconstructed transform coefficients -> T^-1 -> reconstructed image block

Transform

The same vector expressed in two different bases:
  Basis {(1,0), (0,1)}:  (0.2, 1.8) = 0.2 (1,0) + 1.8 (0,1)
  Basis {(1,1), (-1,1)}: (0.2, 1.8) = 1 (1,1) + 0.8 (-1,1)
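The change of basis on this slide can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names `dot` and `coefficients` are made up for illustration, and the projection formula assumes the basis vectors are mutually orthogonal. Exact fractions are used so that 0.2 and 1.8 are represented without rounding.

```python
from fractions import Fraction


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def coefficients(x, basis):
    """Coefficients of x in an orthogonal basis: c_i = (x . v_i) / (v_i . v_i)."""
    return [Fraction(dot(x, v)) / dot(v, v) for v in basis]


x = (Fraction(1, 5), Fraction(9, 5))            # the vector (0.2, 1.8)
standard = coefficients(x, [(1, 0), (0, 1)])    # 1/5 and 9/5, i.e. 0.2 and 1.8
rotated = coefficients(x, [(1, 1), (-1, 1)])    # 1 and 4/5, i.e. 1 and 0.8
```

A transform such as the DCT does the same thing on a larger scale: it keeps the signal but re-expresses it with a different set of basis vectors.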

Basis of Transform
  Basis vectors: {v_1, v_2, ..., v_n}
  Orthogonal: v_i . v_j = 0 if i != j
  Normalized: v_i . v_i = 1
  Orthonormal: orthogonal and normalized
  E.g. orthonormal: {(0,1), (1,0)}; orthogonal only: {(1,1), (-1,1)}
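The two definitions can be tested directly on the example bases. This is an illustrative sketch only; the function names are hypothetical, and a small tolerance is assumed for floating-point comparisons.

```python
import math
from itertools import combinations


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_orthogonal(basis):
    """True if every pair of distinct basis vectors has inner product 0."""
    return all(math.isclose(dot(u, v), 0.0, abs_tol=1e-12)
               for u, v in combinations(basis, 2))


def is_orthonormal(basis):
    """True if the basis is orthogonal and every vector has unit length."""
    return is_orthogonal(basis) and all(
        math.isclose(dot(v, v), 1.0) for v in basis)


def normalize(basis):
    """Scale each vector to unit length (orthogonality is unchanged)."""
    return [tuple(c / math.sqrt(dot(v, v)) for c in v) for v in basis]


print(is_orthonormal([(0, 1), (1, 0)]))              # True
print(is_orthogonal([(1, 1), (-1, 1)]))              # True
print(is_orthonormal([(1, 1), (-1, 1)]))             # False
print(is_orthonormal(normalize([(1, 1), (-1, 1)])))  # True
```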

Why the DCT is used for image compression

KLT (Karhunen-Loeve transform):
  Statistically optimal transform: minimal MSE for any specific bandwidth reduction
  Depends on the signal statistics
  No fast algorithm

DCT:
  Approaches the KLT for highly correlated signals
  Sample values typically vary slowly from point to point across an image => highly correlated signals
  Has a fast algorithm (but is not optimal)

DCT-basis

DCT: Discrete Cosine Transform

Spatial domain                   DCT    Frequency domain
[8,8,8,8,8,8,8,8]                ->     [44,0,0,0,0,0,0,0]
[8,8,8,8,8,8,8,9]                ->     [44,-2,0,-2,0,-2,0,-2]
[8,8,10,9,7,8,8,9]               ->     [46,-2,-2,-4,-2,2,0,-2]
[8,90,-100,3,4,-10,2,80]         ->     [48,-56,146,6,74,-148,-158,-136]
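A minimal sketch of a one-dimensional DCT follows, to show how a flat signal collapses into a single DC coefficient while a busy signal spreads across many coefficients. The orthonormal DCT-II scaling used here is an assumption and need not match the scaling behind the numbers on the slide, so absolute values will differ; the function name `dct_1d` is made up for this example.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence (direct O(n^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        # The k = 0 basis vector gets a smaller scale so all rows have unit length.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        out.append(scale * total)
    return out


flat = dct_1d([8, 8, 8, 8, 8, 8, 8, 8])          # only the DC term is non-zero
busy = dct_1d([8, 90, -100, 3, 4, -10, 2, 80])   # energy spread over many terms
```

Because the basis is orthonormal, the sum of squares of the coefficients equals the sum of squares of the samples; compression comes from the fact that, for smooth signals, most of that energy lands in a few low-frequency coefficients.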

Example of DCT

Image block (8 x 8 samples):
   52   55   61   66   70   61   64   73
   63   59   66   90  109   85   69   72
   62   59   68  113  144  104   66   73
   63   58   71  122  154  106   70   69
   67   61   68  104  126   88   68   70
   79   65   60   70   77   68   58   75
   85   71   64   59   55   61   65   83
   87   79   69   68   65   76   78   94

DCT coefficients:
 -415  -29  -62   25   55  -20   -1    3
    7  -21  -62    9   11   -7   -6    6
  -46    8   77  -25  -30   10    7   -5
  -50   13   35  -15   -9    6    0    3
   11   -8  -13   -2   -1    1   -4    1
  -10    1    3   -3   -1    0    2   -1
   -4   -1    2   -1    2   -3    1   -2
   -1   -1   -1   -2   -1   -1    0   -1

Quantization (step size 3)

Input x:                          0  1  2  3  4  5  6  7  8  9 10 11
Q (x / 3, rounded down):          0  0  0  1  1  1  2  2  2  3  3  3
IQ (index x 3):                   0  0  0  3  3  3  6  6  6  9  9  9
IQ (index x 3 + 1, bin centre):   1  1  1  4  4  4  7  7  7 10 10 10
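The quantization example can be reproduced in a few lines. This is a sketch under stated assumptions: the step size of 3 comes from the slide, while the function names and the `offset` parameter (used to reproduce the 1, 4, 7, 10 reconstruction row) are illustrative choices.

```python
STEP = 3  # quantizer step size used in the slide's example


def quantize(x, step=STEP):
    """Map a non-negative sample to its quantization index (floor division)."""
    return x // step


def dequantize(index, step=STEP, offset=0):
    """Reconstruct a sample from its index; offset shifts it inside the bin."""
    return index * step + offset


samples = list(range(12))
indices = [quantize(x) for x in samples]
low = [dequantize(q) for q in indices]               # 0, 0, 0, 3, 3, 3, ...
centre = [dequantize(q, offset=1) for q in indices]  # 1, 1, 1, 4, 4, 4, ...

# Reconstructing at the bin centre caps the error at 1 instead of 2.
worst_low = max(abs(x - r) for x, r in zip(samples, low))
worst_centre = max(abs(x - r) for x, r in zip(samples, centre))
```

Quantization is the only lossy step in the chain: twelve distinct inputs map to four indices, and no inverse can recover which of the three values in a bin was the original.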

Example of JPEG Coding (Encoder): quantization

DCT coefficients:
 -415  -29  -62   25   55  -20   -1    3
    7  -21  -62    9   11   -7   -6    6
  -46    8   77  -25  -30   10    7   -5
  -50   13   35  -15   -9    6    0    3
   11   -8  -13   -2   -1    1   -4    1
  -10    1    3   -3   -1    0    2   -1
   -4   -1    2   -1    2   -3    1   -2
   -1   -1   -1   -2   -1   -1    0   -1

Quantization table:
   16   11   10   16   24   40   51   61
   12   12   14   19   26   58   60   55
   14   13   16   24   40   57   69   56
   14   17   22   29   51   87   80   62
   18   22   37   56   68  109  103   77
   24   35   55   64   81  104  113   92
   49   64   78   87  103  121  120  101
   72   92   95   98  112  100  103   99

Each coefficient is divided by the table entry at the same position and rounded, e.g. -415 / 16 = -25.9, which rounds to -26.

Example of JPEG Coding (Encoder): quantized coefficients

  -26   -3   -6    2    2    0    0    0
    1   -2   -4    0    0    0    0    0
   -3    1    5   -1   -1    0    0    0
   -4    1    2   -1    0    0    0    0
    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0

Zigzag Scan (2D -> 1D)
[Figure: zigzag scan path over the 8 x 8 block, starting at the DC term and continuing through the AC terms]

Example of JPEG Coding (Encoder): zigzag scan, 2D -> 1D

Quantized coefficients:
  -26   -3   -6    2    2    0    0    0
    1   -2   -4    0    0    0    0    0
   -3    1    5   -1   -1    0    0    0
   -4    1    2   -1    0    0    0    0
    1    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0

Zigzag-scanned sequence:
  -26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB

Encoder pipeline: transform coding (DCT) -> quantization -> zigzag scan -> entropy coding (bitstream)
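The zigzag scan of the quantized block can be sketched as follows. The function names are hypothetical, and generating the scan order by sorting on anti-diagonals is one of several equivalent ways to do it; real codecs typically use a precomputed lookup table instead.

```python
QUANTIZED = [
    [-26, -3, -6, 2, 2, 0, 0, 0],
    [1, -2, -4, 0, 0, 0, 0, 0],
    [-3, 1, 5, -1, -1, 0, 0, 0],
    [-4, 1, 2, -1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag order."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd anti-diagonals run downwards (row increasing), even ones upwards.
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block into a 1D list, DC term first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def trim_trailing_zeros(seq):
    """Drop the zero run at the end; an EOB symbol stands in for it."""
    end = len(seq)
    while end > 0 and seq[end - 1] == 0:
        end -= 1
    return seq[:end]


scanned = trim_trailing_zeros(zigzag_scan(QUANTIZED))
```

The scan visits low frequencies first, so the zeros produced by quantization gather into one long run at the end that a single EOB symbol can replace.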

Entropy Coding (Variable-Length Coding)
  Huffman coding
  Run-length coding
  Arithmetic coding

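Huffman Coding

Word   Probability   Fixed-length code   Variable-length code
A      3/4           00                  0
B      1/6           01                  10
C      1/24          10                  110
D      1/24          11                  111

Average length, variable-length code: 1*(3/4) + 2*(1/6) + 3*(1/24) + 3*(1/24) = 1.333 bits
Average length, fixed-length code:    2*(3/4) + 2*(1/6) + 2*(1/24) + 2*(1/24) = 2 bits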
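The code lengths and averages in this example can be derived with a small Huffman construction. This is a minimal sketch, assuming the four probabilities 3/4, 1/6, 1/24, 1/24 for A to D; the function names are illustrative, and only code lengths are computed, not the bit patterns themselves.

```python
import heapq
from fractions import Fraction
from itertools import count


def huffman_code_lengths(probabilities):
    """Code length per symbol, built by repeatedly merging the two rarest nodes."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: 0}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]


def average_length(probabilities, lengths):
    """Expected number of bits per symbol."""
    return sum(probabilities[s] * lengths[s] for s in probabilities)


PROBS = {"A": Fraction(3, 4), "B": Fraction(1, 6),
         "C": Fraction(1, 24), "D": Fraction(1, 24)}

lengths = huffman_code_lengths(PROBS)                           # A:1, B:2, C:3, D:3
vlc_average = average_length(PROBS, lengths)                    # 4/3
fixed_average = average_length(PROBS, dict.fromkeys(PROBS, 2))  # 2
```

The saving comes entirely from the skewed distribution: the most likely word gets the shortest code, and the two rare words pay for it with one extra bit each.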

DPCM: Differential PCM

Words:  A A F F F F F C C C
PCM  => 65, 65, 70, 70, 70, 70, 70, 67, 67, 67
     or  0,  0,  5,  5,  5,  5,  5,  2,  2,  2
DPCM =>  0,  0,  5,  0,  0,  0,  0, -3,  0,  0
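DPCM codes each sample as its difference from the previous one, which a short sketch makes concrete. The function names and the `initial` predictor value are assumptions made for this illustration.

```python
def dpcm_encode(samples, initial=0):
    """Replace each sample with its difference from the previous sample."""
    previous = initial
    diffs = []
    for sample in samples:
        diffs.append(sample - previous)
        previous = sample
    return diffs


def dpcm_decode(diffs, initial=0):
    """Rebuild the samples by accumulating the differences."""
    previous = initial
    samples = []
    for diff in diffs:
        previous += diff
        samples.append(previous)
    return samples


pcm = [0, 0, 5, 5, 5, 5, 5, 2, 2, 2]
dpcm = dpcm_encode(pcm)        # [0, 0, 5, 0, 0, 0, 0, -3, 0, 0]
restored = dpcm_decode(dpcm)   # identical to pcm
```

The differences cluster around zero, which gives the entropy coder a far more skewed distribution to exploit than the raw PCM values do.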

Run-Length Coding

Sequence:       0 0 0 1   6       0 3     EOB
(run, level):   (3,1)     (0,6)   (1,3)   EOB
Code:           001110    001000010   001001010   10

VLC table excerpt (s = sign bit):
Run   Level   Code
3     1       0011 1s
3     2       0010 0100 s
0     5       0010 0110 s
0     6       0010 0001 s
1     2       0001 10s
1     3       0010 0101 s
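The run-length step can be sketched with the six table entries listed on this slide. The dictionary below contains only that excerpt, not a complete VLC table, and the sign convention (0 for positive, 1 for negative) is an assumption stated here rather than something the slide spells out; all names are illustrative.

```python
EOB_CODE = "10"

# (run, |level|) -> code bits without the trailing sign bit (slide excerpt only)
VLC_TABLE = {
    (3, 1): "00111",
    (3, 2): "00100100",
    (0, 5): "00100110",
    (0, 6): "00100001",
    (1, 2): "000110",
    (1, 3): "00100101",
}


def run_level_pairs(coeffs):
    """Turn a coefficient sequence into (zero-run, non-zero level) pairs."""
    pairs = []
    run = 0
    for value in coeffs:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # any trailing zeros are implied by the EOB symbol


def encode(coeffs):
    """Code words for the pairs, followed by EOB; raises KeyError if a pair is not in the excerpt."""
    words = []
    for run, level in run_level_pairs(coeffs):
        sign = "0" if level > 0 else "1"   # assumed convention: 0 = positive
        words.append(VLC_TABLE[(run, abs(level))] + sign)
    words.append(EOB_CODE)
    return " ".join(words)


pairs = run_level_pairs([0, 0, 0, 1, 6, 0, 3])   # [(3, 1), (0, 6), (1, 3)]
bits = encode([0, 0, 0, 1, 6, 0, 3])
```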

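Video Compression

Encoder for still images:
  Image block -> T -> transform coefficients -> Q -> zigzag scan (2D -> 1D) -> entropy coding -> bitstream

Encoder for video sequences (same chain, plus a reconstruction loop):
  The motion-compensated (MC) prediction is subtracted from the image block before T and Q.
  Q^-1 -> reconstructed transform coefficients -> T^-1 -> reconstructed image block -> MC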

H.261
  Intra frame: coded using only the information in the frame itself
  Inter frame: coded using a reference frame and motion vectors

H.261 Coder
[Figure: H.261 coder block diagram with the blocks DCT, Q, inverse Q, inverse DCT, motion compensation, and loop filter; input: video in]

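Motion Estimation
[Figure: a 16 x 16 macroblock at (32,16) in the current frame is matched to the block at (22,20) in the referenced frame, giving the motion vector (-10,4); the search covers 31 x 31 candidate positions]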

Full-search algorithm
[Figure: current original frame and current referenced frame]
Maximum number of checks: 31 * 31 = 961
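A full search tests every candidate displacement in the window and keeps the best match. The sketch below is an illustration under several assumptions: the matching criterion is the sum of absolute differences (SAD), frames are plain lists of rows, a motion vector points from the block in the current frame to its match in the reference frame, and the synthetic test frames are made up so that the true displacement is known.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the current block and a candidate."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, bx, by, n=16, search_range=15):
    """Check every displacement in [-search_range, +search_range] on both axes."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, checks = None, None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            checks += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, checks


# Synthetic frames: a ramp, and a copy of it displaced by (+2, -1).
SIZE = 16
ref = [[31 * x + 17 * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[max(y - 1, 0)][min(x + 2, SIZE - 1)] for x in range(SIZE)]
       for y in range(SIZE)]

mv, cost, checks = full_search(cur, ref, bx=6, by=6, n=4, search_range=3)
```

With the defaults (a range of 15 in each direction) the window holds 31 * 31 = 961 candidates, each costing one 16 x 16 SAD, which is why faster search patterns are attractive.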

Three-step search algorithm
[Figure: current original frame and current referenced frame]
Step sizes: 8 -> 4 -> 2 -> 1
Maximum number of checks: 1 + 8 + 8 + 8 + 8 = 33
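The coarse-to-fine search can be sketched as below. To keep the example short, the matching cost is passed in as a function of the displacement; in an encoder this would be a block-matching measure such as SAD, and the toy cost used here is purely an assumption for demonstration. The step sequence 8, 4, 2, 1 follows the slide.

```python
def three_step_search(cost, first_step=8):
    """Evaluate the centre once, then 8 neighbours per step, halving the step."""
    centre = (0, 0)
    best_cost = cost(*centre)
    checks = 1
    step = first_step
    while step >= 1:
        best = centre
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # the centre's cost is already known
                candidate = (centre[0] + dx, centre[1] + dy)
                value = cost(*candidate)
                checks += 1
                if value < best_cost:
                    best, best_cost = candidate, value
        centre = best      # re-centre on the best point found at this step
        step //= 2
    return centre, best_cost, checks


def toy_cost(dx, dy):
    """Stand-in for a block-matching cost with its minimum at (5, -3)."""
    return abs(dx - 5) + abs(dy + 3)


mv, cost_at_mv, checks = three_step_search(toy_cost)
```

The search needs 33 cost evaluations instead of 961, but it only finds the true minimum when the cost surface decreases steadily toward it; on real video it can settle on a local minimum.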

NTSS (new three-step search) algorithm
[Figure: search points on a grid of displacements from -7 to +7]

FSS (four-step search) algorithm
[Figure: search points on a grid of displacements from -7 to +7]

Overview of Fine Granularity Scalability in MPEG-4 Video Standard Weiping Li, Fellow, IEEE

Illustration of video coding performance

Multi-layer Coding

SNR scalability decoder defined in MPEG-2

Layered scalable coding techniques: temporal scalability

Layered scalable coding techniques: spatial scalability

BIT-PLANE CODING OF THE DCT COEFFICIENTS

FGS USING BIT-PLANE CODING OF DCT COEFFICIENTS
  Overall Coding Structure of FGS
  Some Details of FGS Coding
  Profile Definitions in the Amendment of MPEG-4

Overall Coding Structure of FGS FGS encoder structure

Overall Coding Structure of FGS FGS decoder structure

Some Details of FGS Coding
  1) Different Numbers of Bit-Planes for Individual Color Components
  2) Variable-Length Codes
  3) Decoding Truncated Bitstreams

Different Numbers of Bit-Planes for Individual Color Components

Variable-Length Codes Statistics of the (RUN, EOP) symbols in the four VLC tables

Coding patterns for syntax element fgs_cbp

Decoding Truncated Bitstreams

Decoding of a truncated bitstream is not standardized in MPEG-4. One possible method:
  Look ahead 32 bits at every byte-aligned position in the bitstream.
  If the 32 bits are not the FGS VOP start code, the first 8 of those 32 bits are information bits of the FGS frame being decoded.
  The decoder then advances the bitstream pointer by one byte and looks ahead another 32 bits to check for the FGS VOP start code.
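The look-ahead procedure can be sketched as a byte-wise scan. This is only an illustration of the method described above, not a decoder: the 4-byte value used as the start code is a placeholder chosen for the example rather than the value defined by the standard, and the function name is made up.

```python
# Placeholder 32-bit pattern for illustration; the real value is defined by MPEG-4.
PLACEHOLDER_START_CODE = bytes([0x00, 0x00, 0x01, 0xAA])


def split_fgs_frames(stream, start_code=PLACEHOLDER_START_CODE):
    """Split a (possibly truncated) byte stream into per-frame payloads."""
    frames = []
    current = None          # payload of the frame being collected, if any
    pos = 0
    while pos < len(stream):
        # Look ahead 32 bits at this byte-aligned position.
        if stream[pos:pos + 4] == start_code:
            current = bytearray()
            frames.append(current)
            pos += 4
        else:
            # Not a start code: the first 8 bits belong to the current frame.
            if current is not None:
                current.append(stream[pos])
            pos += 1        # slide the pointer by one byte and look again
    return [bytes(frame) for frame in frames]


truncated = (PLACEHOLDER_START_CODE + b"\x11\x22" +
             PLACEHOLDER_START_CODE + b"\x33")   # second frame cut short
frames = split_fgs_frames(truncated)             # [b"\x11\x22", b"\x33"]
```

Because a frame simply ends where the next start code begins (or where the data runs out), the decoder can use whatever bits of the last frame arrived and discard nothing that came before.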
