low bit rate h.263 + video coding: efficiency, scalability...
TRANSCRIPT
Low Bit Rate H.263 + VideoLow Bit Rate H.263 + VideoCoding: Efficiency, Scalability andCoding: Efficiency, Scalability and
Error ResilienceError Resilience
Faouzi Kossentini
Signal Processing and Multimedia GroupDepartment of Electrical and Computer Engineering
University of British Columbia
http://spmg.ece.ubc.ca/
OutlineOutline
À Introduction: Low bit rate video codingÀ The H.263/H.263+ standards and their optional modesÀ Efficiency: Performance and complexity of individual modes
and combinations of modesÀ Efficiency: Real-time software-only encodingÀ Efficiency: Rate-distortion optimized video codingÀ Scalability: Description & CharacteristicsÀ Scalability: Rate-distortion optimized frameworkÀ Error resilience: Synchronization & error concealmentÀ Error resilience: Multiple description video codingÀ Conclusions: H.263/H.263+, MPEG-4 and research directions
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
2
Copyright (C) 1998Faouzi Kossentini
3
H.263 version 2 (H.263+): H.263 version 2 (H.263+): Higher coding efficiency, more flexibility,
scalability support, error resilience support
Video coding standards:Video coding standards:À ITU-T H.261(1990), H.262 (1994), H.263 (1995), H.263+ (1998)À ISO/IEC MPEG1 (1992), MPEG2 (1994), MPEG4 (1999)
Low Bit Rate Video codingLow Bit Rate Video coding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Video coding algorithms:Video coding algorithms:À Waveform based coding: MC+DCT/wavelets, 3D subband, etc.À Object- and model-based coding : shape coding, wireframes, etc.
Why?: Why?: Increasing demand for video conferencing and telephonyapplications, limited bandwidth in PSTN and wireless networks
Copyright (C) 1998Faouzi Kossentini
ME
IQ
MUXVLC
0
Bit StreamVideo in
Inter/Intra
QDCT
IDCT
PRED
The H.263 StandardThe H.263 Standard
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
4
Copyright (C) 1998Faouzi Kossentini
5
The H.263 StandardThe H.263 StandardStructure
ÀInter picture prediction to reduce temporal redundancies
ÀDCT coding to reduce spatial redundancies in difference frames
I P P P
Major enhancements to H.261
ÀMore standardized input picture formatsÀHalf pixel motion compensationÀOptimized VLC tablesÀBetter motion vector predictionÀFour (optional) negotiable modes
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Copyright (C) 1998Faouzi Kossentini
6
Unrestricted Motion Vector (UMV) mode, Annex D: Motion vectors(MVs) allowed to point outside the picture, MV range extended to ±31.5
Syntax-based Arithmetic Coding (SAC) mode, Annex E : SAC usedinstead of VLCs
Advanced Prediction (AP) mode, Annex F: Four MVs permacroblock, over-lapped motion compensation, MVs allowed to pointoutside the picture area
PB-frames (PB) mode, Annex G: A Frame with a P-picture and a B-picture used as a unit
I B P B P
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
The H.263 Standard: Optional ModesThe H.263 Standard: Optional Modes
Copyright (C) 1998Faouzi Kossentini
The H.263+ Standard: Optional ModesThe H.263+ Standard: Optional Modes
Unrestricted Motion Vector (UMV) mode, Annex D: MV range extendedto ±256 depending on the picture size, reversible VLCs used for MVs
Advanced Intra Coding (AIC) mode, Annex I: Inter block predictionfrom neighboring intra coded blocks, modified quantization, optimized VLCs
Deblocking Filter (DF) mode, Annex J: Deblocking filter inside the codingloop, four motion vectors per macroblock, MVs outside picture boundaries
Slice Structured (SS) mode, Annex K: Slices used instead of GOBs
Supplemental Enhancement Information (SEI) mode, Annex L:Supplemental information included in the stream to offer display capabilities
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
7
Copyright (C) 1998Faouzi Kossentini
Improved PB-frames (IBP) mode, Annex M: Forward, backwardand bi-directional prediction supported, delta vector not transmitted
Reference Picture Selection (RPS) mode, Annex N: Referencepicture selected for prediction to suppress temporal errorpropagation
Temporal, SNR, and Spatial Scalability mode, Annex O: Syntaxto support temporal, SNR, and spatial scalability
Reference Picture Resampling (RPR) mode, Annex P: Warpingof the reference picture prior to its use for prediction
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
8
The H.263+ Standard: Optional ModesThe H.263+ Standard: Optional Modes
Copyright (C) 1998Faouzi Kossentini
The H.263+ Standard: Optional ModesThe H.263+ Standard: Optional Modes
Reduced Resolution Update (RRU) mode, Annex Q: Encoder allowedto send update information for a picture encoded at a lower resolution,while still maintaining a higher resolution for the reference picture
Independently Segmented Decoding (ISD) mode, Annex R:Dependencies across the segment boundaries not allowed
Alternative Inter VLC (AIV) mode, Annex S: The intra VLC tabledesigned for encoding quantized intra DCT coefficients in the AIC modeused for inter coding
Modified Quantization (MQ) mode, Annex T: Quantizer allowed tochange at the macroblock layer, finer chrominance quantization employed,range of representable quantized DCT coefficients extended to [-127, +127]
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
9
Copyright (C) 1998Faouzi Kossentini
Performance & Complexity: Compression efficiency for intra macroblocksincreased, encoding time increased by ~5%, required memory slightly increased
Advanced Intra Coding ModeFirst Intra frame of Akiyo
29
31
33
35
37
4000 6000 8000 10000 12000 14000Y bits
Y P
SN
RAdvanced Intra CodingNo option
DCDC
DC
Block AboveVertical Prediction
Block to the LeftHorizontal
CurrentBlock
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
10
Efficiency: Performance & ComplexityEfficiency: Performance & Complexityof Individual Modesof Individual Modes
Advanced Intra Coding (AIC) mode, Annex I: Inter block prediction fromneighboring intra coded blocks, modified quantization, optimized VLCs
Copyright (C) 1998Faouzi Kossentini
(a) (b)Without (a) and with (b) DF mode set on
Performance: Subjective quality improved (blocking & mosquitoartifacts removed), 5-10% additional encoding time
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
11
Efficiency: Performance & ComplexityEfficiency: Performance & Complexityof Individual Modesof Individual Modes
Deblocking Filter (DF) mode, Annex J: Deblocking filter insidethe coding loop, four motion vectors per macroblock, MVs outsidepicture boundaries
Copyright (C) 1998Faouzi Kossentini
Performance: Picture rate doubled, complexity increased, morememory required, one frame delay incurred
Improved PB Frames mode: Y PSNR vs Rate for FOREMAN
26
28
30
32
34
36
20 40 60 80 100 120 140Rate (kbps)
Y P
SNR
P picture, w/o mode
B picture, PB mode
P picture, PB mode
B picture, IPB mode
P picture, IPB mode
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
12
Efficiency: Performance & ComplexityEfficiency: Performance & Complexityof Individual Modesof Individual Modes
Improved PB-frames (IBP) mode, Annex M: Forward, backward andbi-directional prediction supported, delta vector not transmitted
Copyright (C) 1998Faouzi Kossentini
Performance: Useful at high bit rates, bit savings of as much as 10%achieved, less than 2% additional encoding/decoding time required
Sequence QuantizerStep Size
Bit savingsusing AIV mode
AKIYO 4 5%12 0%
FOREMAN 4 7%8 4%16 1%
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
13
Efficiency: Performance & ComplexityEfficiency: Performance & Complexityof Individual Modesof Individual Modes
Alternative Inter VLC (AIV) mode, Annex S: The intra VLC tabledesigned for encoding quantized intra DCT coefficients in the AIC modeused for inter coding
Copyright (C) 1998Faouzi Kossentini
Performance: Chrominance PSNR increased substantially at lowbit rates, more flexible rate control supported, very little complexityand computation time added to the coder
MQ mode : Chrominance PSNR vs Rate
35
36
37
38
39
10 30 50 70 90 110Rate (kbps)
Chr
omin
ance
P
SNR
w/o modeMQ mode
MQ mode : Average PSNR vs Rate
29
31
33
35
10 30 50 70 90 110Rate (kbps)
Ave
rage
P
SNR
w/o modeMQ mode
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
14
Efficiency: Performance & ComplexityEfficiency: Performance & Complexityof Individual Modesof Individual Modes
Modified Quantization (MQ) mode, Annex T: Quantizer allowed tochange at the macroblock layer, finer chrominance quantization employed,range of representable quantized DCT coefficients extended to [-127, +127]
Copyright (C) 1998Faouzi Kossentini
H.263 PSNR = 32.44 H.263+ PSNR = 33.29
H.263+ modes: AIC,MQ, UMV, DF, AP
H.261 PSNR = 31.91
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
15
Efficiency: Performance & ComplexityEfficiency: Performance & Complexityof Combinations of Modesof Combinations of Modes
Copyright (C) 1998Faouzi Kossentini
FOREMAN (fps) AKIYO (fps)
H.263+ Baseline coding (Full ME) 1.8 2.4
H.263+ Baseline coding (Fast ME) 4.3 5.8
H.263+: MQ, DF, AIC, AIV, IPB,UMV, AP modes turned on (Fast ME)
3 3.8
ÀRequired for real-time video telephony and conferencing:encoding at a minimum of 8-10 fps for QCIF resolution
Efficiency: Real-time Software-only EncodingEfficiency: Real-time Software-only Encoding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
16
Copyright (C) 1998Faouzi Kossentini
ÀAlgorithmic approach: Developing efficient platform-
independent video coding algorithms
ÀHardware dependent approach: Mapping SIMD oriented
coding components onto Intel’s MMX architecture
Function Computational Load
Reference IDCT 25%
Integer Pixel ME 12%
Fast DCT 11%
Half Pixel ME 10%
Quantization 9%
Efficiency: Real-time Software-only EncodingEfficiency: Real-time Software-only Encoding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Copyright (C) 1998Faouzi Kossentini
17
18
Efficiency: Real-time Software-only EncodingEfficiency: Real-time Software-only Encoding
Copyright (C) 1998Faouzi Kossentini
Efficient Video Coding Algorithms:Efficient Video Coding Algorithms:
ÀIDCT for blocks with all or most of the coefficients equal tozero avoided or efficiently computed (respectively)
ÀSAD for most macroblocks computed efficiently via prediction
ÀDCT & Quantization for most blocks avoided via prediction
ÀLinear approximation methods used for half-pixel MEÀQuantization: look-up tables employed
Some Image and Video Processing Research Projects @Some Image and Video Processing Research Projects @the UBC Signal Processing and Multimedia Laboratorythe UBC Signal Processing and Multimedia Laboratory
Intel’s MMX Architecture:Intel’s MMX Architecture:
ÀFeatures :ÀSIMD structure
ÀFour new data types
À57 new instructions
ÀSaturation arithmetic
Packed byte (Eight 8-bit elements) 63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0
Packed word (Four 16-bit elements) 63 48 47 32 31 16 15 0
Packed doubleword (Two 32-bit elements) 63 32 31 0
Quadword (64-bit element) 63 0
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
19
Efficiency: Real-time Software-only EncodingEfficiency: Real-time Software-only Encoding
Copyright (C) 1998Faouzi Kossentini
a1a2a3a4a5a6a7a8
- (psubusb)
b1b2b3b4b5b6b7b8
b1b2b3b4b5b6b7b8 a1a2a3a4a5a6a7a8
- (psubusb)
= =y1000y50000x2x3x40x6x7x8
por
y1x2x3x4y5x6x7x8
MMX Mapping of SAD Computations:MMX Mapping of SAD Computations:
ÀSaturation arithmeticis used to performabsolute differenceoperations
ÀResult: 4-5 times faster SAD computations
ÀSAD computation function implemented in MMX for 16x16 blocks(baseline coding) and 8x8 blocks (optional modes)
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
20
Efficiency: Real-time Software-only EncodingEfficiency: Real-time Software-only Encoding
Copyright (C) 1998Faouzi Kossentini
DCT MMX Implementation:DCT MMX Implementation:
ÀFloating point arithmetic to fixed point arithmeticÀFour rows processed at a timeÀResults: 4 times faster using fixed point arithmetic, 3 times fasterusing MMX as compared to the integer “C” implementation
Foreman
48 Kb/sec
Akiyo
8 Kb/sec
Foreman
64 Kb/sec
Akiyo
28 Kb/sec
PSNR with floating point DCT 30.47 32.63 31.56 37.92
Variation in PSNR with
integer or MMX DCT
-0.01 +0.02 -0.01 0.00
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
21
Efficiency: Real-time Software-only EncodingEfficiency: Real-time Software-only Encoding
Other MMX Implementations:Other MMX Implementations: Interleaved transfers for half pixel
ME (7x), interp. (3x), motion comp. (2x), block data transfers (2x)
Copyright (C) 1998Faouzi Kossentini
Overall Performance:Overall Performance: Function Computational Load
DCT 9%
IDCT 9%
Quantization 8%
Load ME Area 7%
Scan 6%
Interpolation 5.5%
Fast ME 5%
Reconstruct Full Image 5%
Half Pixel ME 4%
Motion Compensation 4%
encoded frame ratebefore optimization
encoded frame rateafter optimization
FOREMAN AKIYO FOREMAN AKIYO
H.263+ Baseline coding (Full ME) 2.0 2.4 4.1 4.4
H.263+ Baseline coding (Fast ME) 6.4 7.5 14.3 17.0
À14-17 fps baseline H.263+encoding with fast ME
À8-10 fps with all of theH.263+ optional modes
ÀMore than a 100% increasein speed using the efficientalgorithms and more than a25% additional increase inspeed using MMX
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
22
Copyright (C) 1998Faouzi Kossentini
Motivation:ÀBest possible tradeoffs between quality (distortion) and bit rate
obtained via Rate-Distortion (RD) optimization, through the use ofthe Lagrangian cost measure
Solution:ÀMotion vector selected that yields the best rate-quality tradeoff
ÀCoding mode (Intra, Inter16, Inter8, Skipped) selected that yieldsthe best rate-quality tradeoff based on the already determinedmotion vectors
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
23
Efficiency: Rate-Distortion Optimized Video CodingEfficiency: Rate-Distortion Optimized Video Coding
Copyright (C) 1998Faouzi Kossentini
Result:À10 - 30% savings in bit rate obtained using the RD optimized coder for
most sequences over our reference software
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
24
Comparison of RD Performance RD Optimized Coder vs TMN-3.1.2
Foreman CIF
28.00
30.00
32.00
34.00
36.00
38.00
0.00 100.00 200.00 300.00 400.00 500.00 600.00 700.00 800.00 900.00 1000.00
Rate
Y P
SN
R RD Coder
TMN-3.1.2 Coder
Efficiency: Rate-Distortion Optimized Video CodingEfficiency: Rate-Distortion Optimized Video Coding
Copyright (C) 1998Faouzi Kossentini
Description:ÀMulti-representation coding at different quality and resolution
levels allowed
ÀThree general types (SNR, spatial, temporal) of scalabilitysupported
Characteristics:ÀEach layer built incrementally on its previous layer, increasing
the decoded picture quality, resolution, or both
ÀOnly the first (base) layer independently decoded
ÀInformation from temporally previous pictures in the same layeror temporally simultaneous pictures in a lower layer used
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
25
Scalability: Description & CharacteristicsScalability: Description & Characteristics
Copyright (C) 1998Faouzi Kossentini
Example:ÀEnhancement layer 1: Example of SNR scalability
ÀEnhancement layer 2: Example of spatial and temporal scalability
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
26
EI EP
EnhancementLayer 2
EnhancementLayer 1
EI B
I PBaseLayer
EP
EI
P
EP
EP
P
Scalability: Description & CharacteristicsScalability: Description & Characteristics
Copyright (C) 1998Faouzi Kossentini
Motivation:ÀCompression efficiency of a higher layer dependent on that of all
preceding (lower) layers
ÀHigher compression efficiency and more flexible joint rate-distortioncontrol achieved via an RD optimized framework
Current research direction:ÀEmploy (1) multiple rate constraints (explicit for each layer) OR (2) a
single rate constraint with explicit distortion constraints for each layer.
ÀApply RD optimization to temporal, spatial and SNR scalable codingalgorithms
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
27
Scalability: RD Optimized FrameworkScalability: RD Optimized Framework
Copyright (C) 1998Faouzi Kossentini
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
28
Error Resilience: Synchronization& Error Concealment
Use of Synchronization Markers:
ÀDifferent synchronization markers (between GOBs, slices,etc.) used for higher error resilience
Error concealment:
ÀMissing motion vectors for a macroblock received in errorrecovered from neighboring macroblocks (via spatial methods)
ÀTexture information from the previous frame is compensatedusing the recovered motion vectors
Copyright (C) 1998Faouzi Kossentini
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
29
Experimental Results:ÀPacketization: one packet consists of a complete GOB.ÀUsing synchronization markers and error concealmentyields PSNR gains of as much as 9 dB.
Error Concealment and Packetization Performance for Foreman
1214161820222426283032
0 5 10 15 20 25
Packet Loss Rate (PLR)
Y-P
SN
R
Concealment
No Concealment
Error Resilience: Synchronization& Error Concealment
Copyright (C) 1998Faouzi Kossentini
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
30
Use of Reference Picture Selection Mode: Use of differentpictures (or picture segments) for prediction allowed by H.263+,hence multiple description video coding
Multiple Description:ÀDifferent descriptions of the source sent on different channels,may be available at the decoderÀRedundancy among descriptions minimizedÀUseful even if only one description is available
Temporal Reference for Prediction:ÀDifferent temporal reference pictures employed forprediction of different picturesÀError propagation confined to those pictures dependent onthe missing reference
Error Resilience: Multiple DescriptionVideo Coding
Copyright (C) 1998Faouzi Kossentini
H.263+ versus H.263:ÀH.263+ backward compatible with the H.263 (same baseline)
ÀEfficient ways provided by H.263+ of trading additional complexity formore compression efficiency, scalability, resilience and flexibility
H.263/H.263+ versus MPEG-4:ÀH.263 output decodable by MPEG-4, H.263+/MPEG-4 sharing tools
ÀAlthough somewhat available via chroma keying in H.263+, object-based functionality better provided by MPEG-4
ÀH.263/H.263+ public-domain implementation: http://spmg.ece.ubc.ca/
Future directions:
ÀStill higher coding efficiency and better scalability/resilience possible
ÀMore object-based tools and capabilities, better (efficient) shape coding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Copyright (C) 1998 Faouzi Kossentini
31
Conclusions