Brief Introduction to Image/Video Coding Standards and FGS for MPEG-4 卓傳育

Post on 22-Dec-2015

TRANSCRIPT

  • Slide 1
  • Brief Introduction to Image/Video Coding Standards and FGS for MPEG-4
  • Slide 2
  • Video Compression Standards: ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) and MPEG (Moving Picture Experts Group)
  • Slide 3
  • International Telecommunication Union, Telecommunication Standardization Sector (ITU-T)
      • CCITT H.261 (ITU-T Study Group 15): videophone and video conferencing; 1988-1990; p x 64 kbps (p = 1 to 30)
      • ITU-T H.263: PSTN and mobile networks; 10 to 24 kbps; 1994: H.263, H.263+
      • ITU-T H.26L: merged into the JVT work as MPEG-4 Part 10
  • Slide 4
  • MPEG: Moving Picture Experts Group (Coding of Moving Video and Audio)
      • MPEG-1: CD-I, for digital storage (1992)
      • MPEG-2: adds TV and HDTV, for broadcast (1994)
      • MPEG-3: HDTV, merged into MPEG-2
      • MPEG-4: Coding of Audiovisual Objects; V.1: 1998; V.2: 1999; extensions ongoing
      • MPEG-7: Multimedia Description Interface (fall 2001); describing audiovisual material
      • MPEG-21: Digital Multimedia Framework; first parts early 2002; "The Big Picture and The Glue"
  • Slide 5
  • Block-Based Coding: why divide into blocks? Image -> Blocks
  • Slide 6
  • H.261 Video Formats
      Video format | Luminance (Y) pixels/line | Luminance (Y) lines/frame | Chrominance (Cb, Cr) pixels/line | Chrominance (Cb, Cr) lines/frame
      CIF          | 352                       | 288                       | 176                              | 144
      QCIF         | 176                       | 144                       | 88                               | 72
      Figure legend: Y pixel; Cb, Cr pixel; block boundary
  • Slide 7
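  • Arrangement of H.261 (figure)
      • CIF picture (352 x 288): GOBs 1 to 12, laid out two per row (1 2 / 3 4 / 5 6 / 7 8 / 9 10 / 11 12), each 176 x 48
      • QCIF picture (176 wide): GOBs 1, 3, 5, each 176 x 48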
  • Slide 8
  • Arrangements of data structure in H.261
      • QCIF picture (176 x 144): GOBs 1, 3, 5
      • GOB (Group of Blocks, 176 x 48): macroblocks 1 to 33, in three rows of 11
      • MB (Macroblock, 16 x 16): four 8 x 8 luminance blocks (Y1, Y2, Y3, Y4) plus one 8 x 8 U block and one 8 x 8 V block
  • Slide 9
  • Transform coding
      • Encoder: image block -> T (transform) -> transform coefficients -> Q (quantization) -> zigzag scan (2D->1D) -> entropy coding -> bitstream
      • Decoder: bitstream -> entropy decoding -> inverse zigzag scan (1D->2D) -> Q^-1 -> reconstructed transform coefficients -> T^-1 -> reconstructed image block
  • Slide 10
  • Transform: the same point expressed in two bases
      • Basis {(1,0), (0,1)}: (0.2, 1.8) = 0.2(1,0) + 1.8(0,1)
      • Basis {(1,1), (-1,1)}: (0.2, 1.8) = 1(1,1) + 0.8(-1,1)
  • Slide 11
  • Basis of Transform
      • Basis vectors: {v_1, v_2, ..., v_n}
      • Orthogonal: v_i · v_j = 0 if i != j
      • Normalized: v_i · v_i = 1
      • Orthonormal: orthogonal and normalized
      • E.g. orthonormal: {(0,1), (1,0)}; orthogonal: {(1,1), (-1,1)}
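The definitions above can be checked numerically. The following is a minimal sketch, not taken from the slides: the helper names (`dot`, `coordinates`, `is_orthogonal`, `is_orthonormal`) are hypothetical, and the projection formula c_i = (x · v_i) / (v_i · v_i) is only valid when the basis is orthogonal.

```python
def dot(u, v):
    """Inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def is_orthogonal(basis, tol=1e-12):
    """True if every pair of distinct basis vectors has zero inner product."""
    return all(abs(dot(basis[i], basis[j])) <= tol
               for i in range(len(basis))
               for j in range(i + 1, len(basis)))

def is_orthonormal(basis, tol=1e-12):
    """True if the basis is orthogonal and every vector has unit length."""
    return is_orthogonal(basis, tol) and all(
        abs(dot(v, v) - 1.0) <= tol for v in basis)

def coordinates(x, basis):
    """Coefficients of x in an ORTHOGONAL basis, by projection."""
    return [dot(x, v) / dot(v, v) for v in basis]

x = (0.2, 1.8)
standard = [(1, 0), (0, 1)]    # orthonormal example from the slide
rotated = [(1, 1), (-1, 1)]    # orthogonal (not normalized) example from the slide

print(coordinates(x, standard))
print(coordinates(x, rotated))
```

Projecting onto each basis vector is the same operation a block transform such as the DCT performs, only with 64 basis images instead of two basis vectors.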
  • Slide 12
  • Why DCT is used for image compression
      • KLT (Karhunen-Loeve transform): statistically optimal transform (minimal MSE for any specific bandwidth reduction); depends on the type of signal statistics; no fast algorithm
      • DCT: approaches the KLT for highly correlated signals; has a fast algorithm (but is not optimal)
      • Sample values typically vary slowly from point to point across an image => highly correlated signals
  • Slide 13
  • DCT-basis
  • Slide 14
  • DCT: Discrete Cosine Transform (spatial domain -> frequency domain)
      • [8,8,8,8,8,8,8,8] -> [44,0,0,0,0,0,0,0]
      • [8,8,8,8,8,8,8,9] -> [44,-2,0,-2,0,-2,0,-2]
      • [8,8,10,9,7,8,8,9] -> [46,-2,-2,-4,-2,2,0,-2]
      • [8,90,-100,3,4,-10,2,80] -> [48,-56,146,6,74,-148,-158,-136]
  • Slide 15
  • Example of DCT
      Image block (8 x 8):
        52 55 61  66  70  61 64 73
        63 59 66  90 109  85 69 72
        62 59 68 113 144 104 66 73
        63 58 71 122 154 106 70 69
        67 61 68 104 126  88 68 70
        79 65 60  70  77  68 58 75
        85 71 64  59  55  61 65 83
        87 79 69  68  65  76 78 94
      DCT coefficients:
        -415 -29 -62  25  55 -20 -1  3
           7 -21 -62   9  11  -7 -6  6
         -46   8  77 -25 -30  10  7 -5
         -50  13  35 -15  -9   6  0  3
          11  -8 -13  -2  -1   1 -4  1
         -10   1   3  -3  -1   0  2 -1
          -4  -1   2  -1   2  -3  1 -2
          -1  -1  -1  -2  -1  -1  0 -1
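A direct (slow) 2-D DCT of the block above can be sketched as follows. This is an illustration under stated assumptions, not the slide author's code: it assumes the orthonormal DCT-II and a level shift of 128 before the transform (as in baseline JPEG). The function names are hypothetical, and the computed values need not agree exactly with the rounded figures printed on the slide.

```python
import math

N = 8  # block size

def alpha(k):
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            acc = 0.0
            for x in range(N):      # row index
                for y in range(N):  # column index
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * acc
    return out

PIXELS = [
    [52, 55, 61, 66, 70, 61, 64, 73],
    [63, 59, 66, 90, 109, 85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [85, 71, 64, 59, 55, 61, 65, 83],
    [87, 79, 69, 68, 65, 76, 78, 94],
]

# Assumed level shift: centre the 8-bit samples around zero first.
shifted = [[p - 128 for p in row] for row in PIXELS]
coeffs = dct2(shifted)

for row in coeffs:
    print(" ".join(f"{c:8.1f}" for c in row))
```

The quadruple loop costs N^4 multiplications per block; real codecs use separable or fast DCT algorithms, which is the "fast algorithm" advantage the slides attribute to the DCT.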
  • Slide 16
  • Quantization (Q: divide by 3; IQ: multiply by 3)
      Input:                   0 1 2 3 4 5 6 7 8 9 10 11
      Q (divide by 3):         0 0 0 1 1 1 2 2 2 3 3 3
      IQ (multiply by 3):      0 0 0 3 3 3 6 6 6 9 9 9
      IQ (interval mid-point): 1 1 1 4 4 4 7 7 7 10 10 10
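The step-3 quantizer on this slide can be reproduced with a few lines. This is a small sketch; the function names and the `offset` parameter (used here to get the mid-point reconstruction) are illustrative choices, not part of the slides.

```python
STEP = 3  # quantization step size used on the slide

def quantize(x, step=STEP):
    """Q: map a non-negative integer sample to its quantization index."""
    return x // step

def dequantize(index, step=STEP, offset=0):
    """IQ: map an index back to a reconstruction value.

    offset=0 reconstructs at the lower edge of each interval,
    offset=step // 2 reconstructs at the interval mid-point.
    """
    return index * step + offset

samples = list(range(12))
indices = [quantize(x) for x in samples]
low_edge = [dequantize(q) for q in indices]
mid_point = [dequantize(q, offset=STEP // 2) for q in indices]

print(indices)
print(low_edge)
print(mid_point)
```

Mid-point reconstruction halves the worst-case error (1 instead of 2 for a step of 3), which is why the slide lists both reconstruction rows.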
  • Slide 17
  • Example of JPEG Coding (Encoder): quantization
      DCT coefficients:
        -415 -29 -62  25  55 -20 -1  3
           7 -21 -62   9  11  -7 -6  6
         -46   8  77 -25 -30  10  7 -5
         -50  13  35 -15  -9   6  0  3
          11  -8 -13  -2  -1   1 -4  1
         -10   1   3  -3  -1   0  2 -1
          -4  -1   2  -1   2  -3  1 -2
          -1  -1  -1  -2  -1  -1  0 -1
      Quantization table:
        16 11 10 16  24  40  51  61
        12 12 14 19  26  58  60  55
        14 13 16 24  40  57  69  56
        14 17 22 29  51  87  80  62
        18 22 37 56  68 109 103  77
        24 35 55 64  81 104 113  92
        49 64 78 87 103 121 120 101
        72 92 95 98 112 100 103  99
      Each coefficient is divided by the matching table entry and rounded, e.g. -415/16 = -26.
  • Slide 18
  • Example of JPEG Coding (Encoder): quantized result
      Quantized coefficients (rows shown on the slide):
        -26 -3 -6  2  2 0 0 0
          1 -2 -4  0  0 0 0 0
         -3  1  5 -1 -1 0 0 0
         -4  1  2 -1  0 0 0 0
          1  0  0  0  0 0 0 0
          0  0  0  0  0 0 0 0
      DCT coefficients (before quantization):
        -415 -29 -62  25  55 -20 -1  3
           7 -21 -62   9  11  -7 -6  6
         -46   8  77 -25 -30  10  7 -5
         -50  13  35 -15  -9   6  0  3
          11  -8 -13  -2  -1   1 -4  1
         -10   1   3  -3  -1   0  2 -1
          -4  -1   2  -1   2  -3  1 -2
          -1  -1  -1  -2  -1  -1  0 -1
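The quantization step of this JPEG example (divide each DCT coefficient by the matching table entry and round, as in -415/16 = -26) can be sketched as below. The function names are hypothetical, and the rounding rule is an assumption: Python's built-in `round()` rounds exact halves to the nearest even integer, which matters only for the entry -20/40 = -0.5.

```python
# DCT coefficients and quantization table as printed on the slides.
DCT = [
    [-415, -29, -62, 25, 55, -20, -1, 3],
    [7, -21, -62, 9, 11, -7, -6, 6],
    [-46, 8, 77, -25, -30, 10, 7, -5],
    [-50, 13, 35, -15, -9, 6, 0, 3],
    [11, -8, -13, -2, -1, 1, -4, 1],
    [-10, 1, 3, -3, -1, 0, 2, -1],
    [-4, -1, 2, -1, 2, -3, 1, -2],
    [-1, -1, -1, -2, -1, -1, 0, -1],
]
QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize_block(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize_block(levels, table):
    """Decoder side: multiply each level by its table entry."""
    return [[lv * q for lv, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

quantized = quantize_block(DCT, QTABLE)
for row in quantized:
    print(row)
```

Because the table entries grow toward the high-frequency corner, most high-frequency coefficients quantize to zero, which is what makes the later zigzag scan and run-length coding effective.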
  • Slide 19
  • Zigzag Scan (2D->1D): DC term, AC terms
  • Slide 20
  • Example of JPEG Coding (Encoder): zigzag scan
      Quantized coefficients (rows shown on the slide):
        -26 -3 -6  2  2 0 0 0
          1 -2 -4  0  0 0 0 0
         -3  1  5 -1 -1 0 0 0
         -4  1  2 -1  0 0 0 0
          1  0  0  0  0 0 0 0
          0  0  0  0  0 0 0 0
      2D->1D (zigzag order): -26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB
      Encoder steps: transform coding (DCT) -> quantization -> zigzag scan -> entropy coding (bitstream)
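The 2D->1D zigzag scan of the quantized block can be sketched as follows. This is an illustration, not the slide author's code: the function names are hypothetical, and the two bottom rows of the block, which the slide does not print, are assumed to be all zero.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    def key(pos):
        i, j = pos
        s = i + j
        # Odd anti-diagonals run top-right to bottom-left (row increasing),
        # even anti-diagonals run bottom-left to top-right (column increasing).
        return (s, i if s % 2 else j)
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)

def zigzag_scan(block):
    """Flatten a square block in zigzag order, dropping the trailing zeros
    (an end-of-block marker stands in for them in the bitstream)."""
    seq = [block[i][j] for i, j in zigzag_order(len(block))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq

QUANTIZED = [
    [-26, -3, -6, 2, 2, 0, 0, 0],
    [1, -2, -4, 0, 0, 0, 0, 0],
    [-3, 1, 5, -1, -1, 0, 0, 0],
    [-4, 1, 2, -1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],  # assumed zero (not printed on the slide)
    [0, 0, 0, 0, 0, 0, 0, 0],  # assumed zero (not printed on the slide)
]

sequence = zigzag_scan(QUANTIZED)
print(sequence, "EOB")
```

The scan visits low frequencies first, so the non-zero values cluster at the start and the long tail of zeros collapses into a single end-of-block symbol.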
  • Slide 21
  • Entropy Coding (Variable-Length Coding): Huffman coding, run-length coding, arithmetic coding
  • Slide 22
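  • Huffman Coding
      Symbol (word):        A    B    C     D
      Probability:          3/4  1/6  1/24  1/24
      Fixed-length code:    00   01   10    11
      Variable-length code: 0    10   110   111
      Average length, variable-length code: 1*(3/4) + 2*(1/6) + 3*(1/24) + 3*(1/24) = 1.333 bits
      Average length, fixed-length code: 2*(3/4) + 2*(1/6) + 2*(1/24) + 2*(1/24) = 2 bits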
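The code lengths and the average of 1.333 bits in this example can be derived with the standard Huffman merging procedure. The sketch below is illustrative (the function name is hypothetical) and computes only the code lengths, since the exact bit patterns depend on how the two branches of each merge are labeled.

```python
import heapq
from fractions import Fraction
from itertools import count

def huffman_lengths(probs):
    """Return {symbol: code length} for a Huffman code over `probs`."""
    tie = count()  # tie-breaker so the heap never compares symbol lists
    heap = [(p, next(tie), [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    while len(heap) > 1:
        # Merge the two least probable subtrees; every symbol inside
        # them moves one level deeper, i.e. gains one code bit.
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), syms1 + syms2))
    return lengths

PROBS = {
    "A": Fraction(3, 4),
    "B": Fraction(1, 6),
    "C": Fraction(1, 24),
    "D": Fraction(1, 24),
}

lengths = huffman_lengths(PROBS)
average = sum(PROBS[s] * lengths[s] for s in PROBS)
print(lengths, float(average))
```

Using `Fraction` keeps the probabilities exact, so the average comes out as exactly 4/3 bits per symbol against 2 bits for the fixed-length code.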
  • Slide 23
  • DPCM: Differential PCM
      • Text (word): AAFFFFFCCC
      • PCM => 65,65,70,70,70,70,70,67,67,67 (or, relative to 65: 0,0,5,5,5,5,5,2,2,2)
      • DPCM => 0,0,5,0,0,0,0,-3,0,0
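The PCM and DPCM sequences in this example can be reproduced with a previous-sample predictor. This is a minimal sketch: the function names and the `initial` predictor value are illustrative, and real DPCM coders usually predict from the reconstructed (quantized) sample rather than the original one.

```python
def dpcm_encode(samples, initial=0):
    """Replace each sample by its difference from the previous sample."""
    diffs, prev = [], initial
    for s in samples:
        diffs.append(s - prev)
        prev = s
    return diffs

def dpcm_decode(diffs, initial=0):
    """Rebuild the samples by accumulating the differences."""
    samples, prev = [], initial
    for d in diffs:
        prev += d
        samples.append(prev)
    return samples

text = "AAFFFFFCCC"
pcm = [ord(ch) for ch in text]          # character codes
relative = [v - ord("A") for v in pcm]  # same values relative to 'A' (65)
dpcm = dpcm_encode(relative)

print(pcm)
print(relative)
print(dpcm)
```

Most differences are zero, so the DPCM output has a far more skewed distribution than the PCM values and compresses better under a variable-length code.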
  • Slide 24
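  • Run-Length Coding
      • Input: 0 0 0 1 6 0 3 EOB
      • (run, level) pairs: (3,1) (0,6) (1,3) EOB
      • Codes: 001110 001000010 001001010 10
      • Code table (s = sign bit):
          Run | Level | Code
          3   | 1     | 0011 1s
          3   | 2     | 0010 0100 s
          0   | 5     | 0010 0110 s
          0   | 6     | 0010 0001 s
          1   | 2     | 0001 10s
          1   | 3     | 0010 0101 s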
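The grouping of the scanned coefficients into (run, level) pairs can be sketched as below, where run is the number of zeros preceding a non-zero level. The function names are hypothetical; the mapping from pairs to variable-length codewords is left to the code table.

```python
def run_level_pairs(coeffs):
    """Convert a 1-D coefficient list into (run, level) pairs.

    Trailing zeros produce no pair; the end-of-block (EOB) symbol
    covers them in the bitstream.
    """
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs

def expand_pairs(pairs, length):
    """Inverse operation: rebuild a coefficient list of a given length."""
    coeffs = []
    for run, level in pairs:
        coeffs.extend([0] * run)
        coeffs.append(level)
    coeffs.extend([0] * (length - len(coeffs)))
    return coeffs

pairs = run_level_pairs([0, 0, 0, 1, 6, 0, 3])
print(pairs, "EOB")
```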
  • Slide 25
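  • Video Compression
      • Encoder for still image: image block -> T -> transform coefficients -> Q -> zigzag scan (2D->1D) -> entropy coding -> bitstream
      • Encoder for video sequence: the same chain, applied to the difference (-) between the input block and its motion-compensated (MC) prediction; a feedback path Q^-1 -> reconstructed transform coefficients -> T^-1 -> reconstructed image block feeds MC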
  • Slide 26
  • H.261
      • Intra frame: frame information
      • Inter frame: reference frame + motion vector
  • Slide 27
  • H.261 Coder (block diagram): video in, DCT, Q, inverse Q, inverse DCT, motion compensation, loop filter
  • Slide 28
  • Motion Estimation
      • Macroblock (16*16) at (32,16) in the current frame; best match at (22,20) in the referenced frame; motion vector (-10,4)
      • Search window: 31*31 candidate positions
  • Slide 29
  • Full-search algorithm (figure: current original frame, current referenced frame); maximum checks: 31*31 = 961
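A full search evaluates every candidate displacement in the window, which gives 31*31 = 961 checks for a range of ±15. The sketch below is illustrative, not the slide author's code: it assumes the sum of absolute differences (SAD) as the matching criterion, frames stored as lists of rows, and a small synthetic test frame; all function and variable names are hypothetical.

```python
import random

def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, size=16, search=15):
    """Check every displacement in [-search, search]^2; return
    (motion vector, cost, number of candidates checked)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, checks = (0, 0), None, 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx <= width - size and 0 <= by + dy <= height - size):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            checks += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, checks

# Synthetic 16 x 16 reference frame; the current frame is the reference
# shifted so that the true motion vector is (2, -1).
rng = random.Random(0)
SIZE = 16
ref = [[rng.randrange(256) for _ in range(SIZE)] for _ in range(SIZE)]
cur = [[ref[(y - 1) % SIZE][(x + 2) % SIZE] for x in range(SIZE)]
       for y in range(SIZE)]

mv, cost, checks = full_search(cur, ref, bx=6, by=6, size=4, search=3)
print(mv, cost, checks)
```

The demo uses a 4 x 4 block and a ±3 window to stay fast; with the defaults (`size=16`, `search=15`) and a block far enough from the frame border, the loop visits all 961 candidates.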
  • Slide 30
  • 3-step search algorithm (figure: current original frame, current referenced frame); step sizes 8 -> 4 -> 2 -> 1; maximum checks: 1 + 8 + 8 + 8 + 8 = 33
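The step search with step sizes 8 -> 4 -> 2 -> 1 can be sketched as follows: the centre is checked once, then each step checks the 8 neighbours at the current step size and moves the centre to the best one, for at most 1 + 8 + 8 + 8 + 8 = 33 checks. The function name and the synthetic cost function are assumptions made for illustration. The method finds the true minimum only when the matching cost decreases smoothly toward it, which real video does not guarantee.

```python
def step_search(cost, steps=(8, 4, 2, 1)):
    """Coarse-to-fine search over displacements.

    `cost(dx, dy)` returns the matching error of a candidate displacement.
    Returns (best displacement, best cost, number of candidates checked).
    """
    center = (0, 0)
    best_cost = cost(*center)
    checks = 1
    for step in steps:
        best = center
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # the centre was already evaluated
                cand = (center[0] + dx, center[1] + dy)
                c = cost(*cand)
                checks += 1
                if c < best_cost:
                    best, best_cost = cand, c
        center = best
    return center, best_cost, checks

# Hypothetical smooth cost surface with its minimum at displacement (5, -3).
def bowl(dx, dy):
    return (dx - 5) ** 2 + (dy + 3) ** 2

mv, best, checks = step_search(bowl)
print(mv, best, checks)
```

Compared with the 961 candidates of the full search over the same ±15 range, the step search checks 33, trading a risk of landing in a local minimum for roughly a 29-fold reduction in work.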
  • Slide 31
  • NTSS (new 3-step search) algorithm (figure: search grid, axis from -7 to 7)
  • Slide 32
  • FSS (4-step search) algorithm (figure: search grid, axis from -7 to 7)
  • Slide 33
  • BBGDS (block-based gradient descent search)
  • Slide 34
  • Overview of Fine Granularity Scalability in MPEG-4 Video Standard (Weiping Li, Fellow, IEEE)
  • Slide 35
  • Illustration of video coding performance
  • Slide 36
  • Multi-layer Coding
  • Slide 37
  • SNR scalability decoder defined in MPEG-2
  • Slide 38
  • Layered scalable coding techniques: temporal scalability
  • Slide 39
  • Layered scalable coding techniques: spatial scalability
  • Slide 40
  • BIT-PLANE CODING OF THE DCT COEFFICIENTS
  • Slide 41
  • Slide 42
  • Slide 43
  • FGS USING BIT-PLANE CODING OF DCT COEFFICIENTS
      • Overall Coding Structure of FGS
      • Some Details of FGS Coding
      • Profile Definitions in the Amendment of MPEG-4
  • Slide 44
  • Overall Coding Structure of FGS: FGS encoder structure
  • Slide 45
  • Overall Coding Structure of FGS: FGS decoder structure
  • Slide 46
  • Some Details of FGS Coding
      1) Different Numbers of Bit-Planes for Individual Color Components
      2) Variable-Length Codes
      3) Decoding Truncated Bitstreams
  • Slide 47
  • Different Numbers of Bit-Planes for Individual Color Components
  • Slide 48
  • Variable-Length Codes: statistics of the (RUN, EOP) symbols in the four VLC tables
  • Slide 49
  • Coding patterns for syntax element fgs_cbp
  • Slide 50
  • Slide 51
  • Decoding Truncated Bitstreams
      • Decoding of a truncated bitstream is not standardized in MPEG-4.
      • One possible method is to look ahead 32 bits at every byte-aligned position in the bitstream.
      • If the 32 bits are not fgs_vop_start_code, the first 8 of those 32 bits are information bits of the FGS frame being decoded.
      • The decoder then slides the bitstream pointer by one byte and looks ahead another 32 bits to check for fgs_vop_start_code.
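The look-ahead method described above can be sketched as a byte-wise scan. This is an illustration under stated assumptions, not a standardized decoder: the function name is hypothetical, and the 32-bit start-code value used as the default is an assumption that should be checked against the MPEG-4 Visual specification before use.

```python
# Assumed value of fgs_vop_start_code; verify against the MPEG-4 Visual spec.
FGS_VOP_START_CODE = bytes.fromhex("000001B9")

def split_fgs_frames(stream, start_code=FGS_VOP_START_CODE):
    """Split a (possibly truncated) byte stream into FGS frame payloads.

    At every byte-aligned position the next 32 bits are compared with the
    start code. If they do not match, the first byte belongs to the frame
    currently being decoded and the pointer advances by one byte.
    """
    frames, current, pos = [], None, 0
    while pos < len(stream):
        if stream[pos:pos + 4] == start_code:
            if current is not None:
                frames.append(bytes(current))  # previous frame ends here
            current = bytearray()
            pos += 4
        else:
            if current is not None:            # ignore bytes before the first frame
                current.append(stream[pos])
            pos += 1
    if current is not None:
        frames.append(bytes(current))          # last frame, possibly truncated
    return frames

# Two frames; the second one was cut short by the server or network.
data = (FGS_VOP_START_CODE + b"\x12\x34\x56"
        + FGS_VOP_START_CODE + b"\x78")
print(split_fgs_frames(data))
```

Because each frame's payload is simply whatever precedes the next start code, a frame truncated at any byte still yields a usable prefix of its bit-planes.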
  • Slide 52