video coding standard

Upload: videoguy

Post on 13-Jan-2015

Page 1: Video Coding Standard

CMPT365 Multimedia Systems 1

Media Compression - Video Coding Standards

Fall 2005

CMPT 365 Multimedia Systems

Page 2: Video Coding Standard

Video Coding Standards

H.264/AVC

Page 3: Video Coding Standard

Coding Rate and Standards

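[Figure: bit-rate axis with ticks at 8, 16, 64, 384 kbit/s and 1.5, 5, 20 Mbit/s, divided into very low, low, medium, and high bitrate ranges. Applications placed along the axis: mobile videophone, videophone over PSTN, ISDN videophone, Video CD, digital TV, HDTV. Standards placed along the axis: MPEG-4, H.263, H.261, MPEG-1, MPEG-2.]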

Page 4: Video Coding Standard

Standardization Organizations

ITU-T VCEG (Video Coding Experts Group)
  Standards for advanced moving image coding methods appropriate for conversational and non-conversational audio/visual applications.

ISO/IEC MPEG (Moving Picture Experts Group)
  Standards for compression and coding, decompression, processing, and coded representation of moving pictures, audio, and their combination.

Relation
  ITU-T H.262 ~ ISO/IEC 13818-2 (MPEG-2): Generic Coding of Moving Pictures and Associated Audio
  ITU-T H.263 ~ ISO/IEC 14496-2 (MPEG-4)

WG: work group; SG: sub group
  ISO/IEC JTC 1/SC 29/WG 1: Coding of Still Pictures
  ISO/IEC JTC 1/SC 29/WG 11

Page 5: Video Coding Standard

Introduction

H.261 MPEG-1 MPEG-2 H.263 MPEG-4 H.264

Page 6: Video Coding Standard

H.261

Earliest DCT-based video standard: 1990
ITU Recommendation for videoconferencing and videophones over ISDN
Targeted bit rate: p x 64 kbps (p = 1, …, 30)
  Videophone: low rate, e.g., 64 kbps
  Videoconferencing: high rate, e.g., 384 kbps (p = 6)
  Max: 1.92 Mbps (p = 30)
Picture format:
  CIF (Common Intermediate Format): 352 x 288
  QCIF (Quarter CIF): 176 x 144
Max delay: 150 ms (for bidirectional interactivity)
Sequential search
Amenable to low-cost VLSI implementation
No B mode
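
To get a feel for the p x 64 kbps targets, the sketch below compares them with the uncompressed rate of CIF video. The 4:2:0 sampling, 8 bits per sample, and 30 frames/s are assumptions made for the illustration, and the function names are hypothetical.

```python
# Sketch: H.261 channel rates (p x 64 kbps) versus the raw rate of CIF video.
# Assumptions (not from the slide): 4:2:0 sampling, 8 bits/sample, 30 frames/s.

def h261_rate_kbps(p: int) -> int:
    """Target bit rate in kbps for a multiplier p in 1..30."""
    if not 1 <= p <= 30:
        raise ValueError("p must be between 1 and 30")
    return p * 64


def raw_rate_kbps(width: int, height: int, fps: float = 30.0) -> float:
    """Uncompressed rate: one luma plane plus two quarter-size chroma planes."""
    samples_per_frame = width * height * 1.5
    return samples_per_frame * 8 * fps / 1000


if __name__ == "__main__":
    cif = raw_rate_kbps(352, 288)
    for p in (1, 6, 30):
        rate = h261_rate_kbps(p)
        print(f"p={p:2d}: {rate:5d} kbps, compression ratio about {cif / rate:.0f}:1")
```

Under these assumptions raw CIF needs roughly 36.5 Mbps, so even the maximum rate (p = 30) implies a compression ratio near 19:1.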

Page 7: Video Coding Standard

Layered Structure for Video Data

Video multiplex arrangement:
  Picture layer
  GOB layer
  MB layer
  Block layer
Group of Blocks (GOB):
  3 rows of 11 macroblocks (MBs) (Y: 176 x 48, UV: 88 x 24)
  QCIF: 3 GOBs
  CIF: 12 GOBs
MB: 16 x 16 luma

[Figure: one GOB within the QCIF (176 x 144) and CIF (352 x 288) pictures; one MB made of four 8x8 luma blocks Y1 to Y4 plus 8x8 Cb and Cr blocks]
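
The GOB counts follow directly from the macroblock and GOB sizes. A small sketch of that arithmetic (the function and constant names are made up for this example):

```python
# Sketch: derive the number of GOBs per picture from the H.261 layer sizes.
MB_SIZE = 16            # a macroblock covers 16 x 16 luma samples
MBS_PER_GOB = 3 * 11    # a GOB is 3 rows of 11 macroblocks


def macroblock_count(width: int, height: int) -> int:
    """Number of 16 x 16 macroblocks in a picture of the given luma size."""
    return (width // MB_SIZE) * (height // MB_SIZE)


def gob_count(width: int, height: int) -> int:
    """Number of GOBs needed to cover the picture."""
    return macroblock_count(width, height) // MBS_PER_GOB


if __name__ == "__main__":
    for name, (w, h) in {"QCIF": (176, 144), "CIF": (352, 288)}.items():
        print(name, macroblock_count(w, h), "MBs,", gob_count(w, h), "GOBs")
```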

Page 8: Video Coding Standard

Entropy coding

Similar to JPEG
Zigzag scan
(Run, Level) coding
EOB
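
The zigzag scan and (Run, Level) pairing can be sketched as follows. This is a simplified illustration: it scans every coefficient of the block the same way and emits symbolic pairs rather than the variable-length codewords an actual H.261 encoder would write. The function names are hypothetical.

```python
# Sketch: zigzag scan of a quantized block followed by (run, level) pairs and EOB.

def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; odd diagonals go down-left, even ones go up-right.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_level_encode(block):
    """Emit (run of zeros, nonzero level) pairs in zigzag order, then 'EOB'."""
    symbols = []
    run = 0
    for r, c in zigzag_order(len(block)):
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            symbols.append((run, level))
            run = 0
    symbols.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return symbols


if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[0][0], block[0][1], block[2][0] = 5, 3, -2
    print(run_level_encode(block))
```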

Page 9: Video Coding Standard

Introduction

H.261 MPEG-1 MPEG-2 H.263 MPEG-4 H.264

Page 10: Video Coding Standard

MPEG-1

Committee formed in 1988
Finalized in 1991
Used for VCD
Random access, fast forward/reverse search
Delay: 1 sec (for unidirectional video access)
1/2-pixel ME/MC
No deblocking filter
B frames
Software-only decoding is possible
MPEG-1 Audio coding:
  3 layers of encoding:
    • Layer 1: 4:1 compression ratio with CD quality
    • Layer 2: 6:1 to 8:1
    • Layer 3 (MP3): 10:1 to 12:1

Page 11: Video Coding Standard

MPEG-1 Video

Progressive video only
Layered structure:
  Sequence, Group of Pictures (GOP), Picture, Slice, Macroblock, Block

[Figure: a sequence made of consecutive GOPs, each with the picture pattern I B B P … B B P]

Page 12: Video Coding Standard

Quantization and Entropy Coding

Stepsize varies by frequency for I blocks
  Similar to JPEG
  Scaling is adjusted on a MB basis

   8  16  19  22  26  27  29  34
  16  16  22  24  27  29  34  37
  19  22  26  27  29  34  34  38
  22  22  26  27  29  34  37  40
  22  26  27  29  32  35  40  48
  26  27  29  32  35  40  48  58
  26  27  29  34  38  46  56  69
  27  29  35  38  46  56  69  83

Entropy coding: Similar to JPEG and H.261
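
A minimal sketch of frequency-dependent quantization with a per-macroblock scale, using the matrix from this slide. It assumes an effective step size of W[r][c] * qscale / 8 and treats every coefficient alike; the special handling of the intra DC term and the exact rounding rules of the standard are deliberately left out.

```python
# Sketch: intra quantization with a weighting matrix and a per-MB scale factor.
# Assumption: step size = W[r][c] * qscale / 8 for every coefficient (DC included).
INTRA_MATRIX = [
    [8, 16, 19, 22, 26, 27, 29, 34],
    [16, 16, 22, 24, 27, 29, 34, 37],
    [19, 22, 26, 27, 29, 34, 34, 38],
    [22, 22, 26, 27, 29, 34, 37, 40],
    [22, 26, 27, 29, 32, 35, 40, 48],
    [26, 27, 29, 32, 35, 40, 48, 58],
    [26, 27, 29, 34, 38, 46, 56, 69],
    [27, 29, 35, 38, 46, 56, 69, 83],
]


def quantize_intra(coeffs, qscale):
    """Map DCT coefficients to integer levels (round to nearest, keep the sign)."""
    levels = []
    for r in range(8):
        row = []
        for c in range(8):
            step = INTRA_MATRIX[r][c] * qscale / 8.0
            magnitude = int(abs(coeffs[r][c]) / step + 0.5)
            row.append(magnitude if coeffs[r][c] >= 0 else -magnitude)
        levels.append(row)
    return levels


def dequantize_intra(levels, qscale):
    """Reconstruct approximate coefficients from the integer levels."""
    return [
        [levels[r][c] * INTRA_MATRIX[r][c] * qscale / 8.0 for c in range(8)]
        for r in range(8)
    ]
```

With qscale = 8, a coefficient of 100 at a low-frequency position (weight 16) is quantized with step 16, while the same value at the highest frequency (weight 83) is quantized with step 83, so high frequencies are represented much more coarsely.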

Page 13: Video Coding Standard

B frames

Temporal prediction for B pictures:

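$$\hat{b} = \alpha_1 \hat{c}_1 + \alpha_2 \hat{c}_2, \qquad \alpha_1, \alpha_2 \in \{0,\ 0.5,\ 1\}, \qquad \alpha_1 + \alpha_2 = 1$$

$\alpha_1 = 1,\ \alpha_2 = 0$: forward prediction
$\alpha_1 = 0,\ \alpha_2 = 1$: backward prediction
$\alpha_1 = \alpha_2 = 0.5$: bidirectional prediction

[Figure: block b in Frame k predicted from block C1 in Frame k-1 and block C2 in Frame k+1]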
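
The three B-picture prediction modes (forward, backward, bidirectional) can be tried in turn and compared. The sketch below weights the two motion-compensated reference blocks and picks the mode with the smallest sum of absolute differences (SAD); using SAD as the selection criterion is an assumption of this example, not something the slide specifies.

```python
# Sketch: form a B-block prediction as a weighted sum of two reference blocks
# and pick the mode with the smallest SAD (selection criterion is an assumption).

MODES = {
    "forward": (1.0, 0.0),        # use only the block from frame k-1
    "backward": (0.0, 1.0),       # use only the block from frame k+1
    "bidirectional": (0.5, 0.5),  # average of both blocks
}


def predict_b_block(c1, c2, w1, w2):
    """Weighted combination of two equally sized reference blocks."""
    return [[w1 * p + w2 * q for p, q in zip(r1, r2)] for r1, r2 in zip(c1, c2)]


def sad(x, y):
    """Sum of absolute differences between two blocks."""
    return sum(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))


def best_b_mode(b, c1, c2):
    """Return the name of the mode whose prediction is closest to block b."""
    return min(MODES, key=lambda m: sad(b, predict_b_block(c1, c2, *MODES[m])))
```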

Page 14: Video Coding Standard

Introduction

Rate Control and in-loop deblocking filter H.261 MPEG-1 MPEG-2 H.263 MPEG-4 H.264

Page 15: Video Coding Standard

MPEG-2

Completed in 1994
Extension of MPEG-1
Standard for DVD, SDTV, HDTV
Supports interlaced inputs
Supports scalable coding
Flexible frame size
Low delay
Supports a wide range of applications
Source format: 4:4:4, 4:2:2, 4:2:0
1/2-pixel ME/MC (bilinear interpolation)
B frames
MPEG-2 Audio:
  Supports 5.1 channels
  AAC: 30% fewer bits than MP3
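
Half-pixel motion compensation with bilinear interpolation amounts to averaging the two or four nearest integer-position samples. A minimal sketch, assuming coordinates are given in half-pixel units and stay inside the reference frame (no boundary handling); the function name is hypothetical.

```python
# Sketch: fetch one sample at half-pixel accuracy using bilinear interpolation.
# Assumption: (y2, x2) are non-negative half-pel coordinates inside the frame.

def sample_half_pel(ref, y2, x2):
    """ref is a 2-D list of integer samples; y2 and x2 are in half-pixel units."""
    y, x = y2 // 2, x2 // 2        # integer-pixel position
    frac_y, frac_x = y2 % 2, x2 % 2  # 1 if the coordinate sits between pixels
    a = ref[y][x]
    if not frac_y and not frac_x:
        return a                                   # integer position
    if not frac_y:
        return (a + ref[y][x + 1] + 1) // 2        # horizontal half position
    if not frac_x:
        return (a + ref[y + 1][x] + 1) // 2        # vertical half position
    b, c, d = ref[y][x + 1], ref[y + 1][x], ref[y + 1][x + 1]
    return (a + b + c + d + 2) // 4                # centre of four pixels
```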

Page 16: Video Coding Standard

Profiles and Levels

Defined to manage the large number of coding tools and the broad range of formats and bit rates supported
Profiles and levels define a set of conformance points, each targeting a class of applications
  Maximize interoperability while limiting complexity
Profile: a subset of the entire bit stream syntax
Level: a specified set of constraints imposed on values of the syntax elements in the bit stream (maximum bit rate, buffer size, picture resolution)

Page 17: Video Coding Standard

MPEG-2 Levels

Level       Max pixels/line   Max lines   Max frames/s
Low         352               288         30
Main        720               576         30
High 1440   1440              1152        60
High        1920              1152        60
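
The level table can be read as a lookup: a format fits a level if it stays within all three limits. The sketch below only checks the three columns shown here; real conformance also constrains bit rate and buffer size, which this example ignores. The names are hypothetical.

```python
# Sketch: find the lowest MPEG-2 level whose limits (from the table) cover a format.
# Only picture size and frame rate are checked; bit-rate limits are not modelled.
LEVELS = [
    # (name, max pixels per line, max lines, max frames per second)
    ("Low", 352, 288, 30),
    ("Main", 720, 576, 30),
    ("High 1440", 1440, 1152, 60),
    ("High", 1920, 1152, 60),
]


def lowest_level(width, height, fps):
    """Return the first level that accommodates the format, or None."""
    for name, max_width, max_lines, max_fps in LEVELS:
        if width <= max_width and height <= max_lines and fps <= max_fps:
            return name
    return None


if __name__ == "__main__":
    for fmt in [(352, 240, 30), (720, 480, 30), (1280, 720, 60), (1920, 1080, 30)]:
        print(fmt, "->", lowest_level(*fmt))
```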

Page 18: Video Coding Standard

Introduction

Rate Control and in-loop deblocking filter H.261 MPEG-1 MPEG-2 H.263 MPEG-4 H.264

Page 19: Video Coding Standard

H.263

Derived from H.261
Intended for very low bit-rate applications
  Better quality at 18-24 kbps than H.261 at 64 kbps
  Used in MS NetMeeting, Messenger …
Can handle high resolution (up to 16CIF: 1408 x 1152)
No loop filter
1/2-pixel ME/MC
Optional coding modes (defined in 8 Annexes):
  Unrestricted motion vector (Annex D):
    • MV can point outside of the picture boundary by extrapolating the boundary pixels (repeat padding is usually used)
    • MV range: [-31.5, 31.5]
  Arithmetic coding
  Advanced prediction (Annex F):
    • Overlapped block motion compensation
    • 4MV: 1 for each 8x8 block
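
Repeat padding for unrestricted motion vectors can be implemented without enlarging the frame: clamp each coordinate to the picture, so positions outside the boundary read the nearest edge pixel. A minimal sketch for integer-pixel displacements only (half-pixel interpolation is left out); the function names are made up for this example.

```python
# Sketch: fetch a reference block that may lie partly outside the picture,
# using repeat padding (out-of-range coordinates read the nearest edge pixel).

def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(value, high))


def fetch_block(ref, top, left, size=8):
    """Return a size x size block whose top-left corner is at (top, left)."""
    height, width = len(ref), len(ref[0])
    return [
        [
            ref[clamp(top + r, 0, height - 1)][clamp(left + c, 0, width - 1)]
            for c in range(size)
        ]
        for r in range(size)
    ]
```

For example, with a 2 x 2 picture `[[1, 2], [3, 4]]`, fetching a 3 x 3 block at (-1, -1) repeats the top row and left column of the picture.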

Page 20: Video Coding Standard

Advanced Prediction (4MV)

Each 8x8 block in a MB can have its own MV
Suitable when there is complicated motion in the MB
Needs more bits to encode the MVs
Need to compare the performance of 1MV and 4MV

[Figure: four layouts, one per 8x8 block, each showing the current vector MV together with neighbouring vectors MV1, MV2, and MV3]
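
One common way to compare 1MV against 4MV is a cost that adds the prediction error to a penalty for the bits spent on motion vectors. The sketch below uses SAD plus a Lagrangian-style term; the cost function, the value of `lam`, and the bit counts are all assumptions for illustration and are not prescribed by H.263.

```python
# Sketch: choose between one MV per macroblock and four MVs (one per 8x8 block).
# Assumption: cost = SAD + lam * (bits spent on motion vectors).

def choose_mv_mode(sad_1mv, bits_1mv, sads_4mv, bits_4mv, lam=4.0):
    """Return (mode name, cost) for the cheaper of the two options."""
    cost_1mv = sad_1mv + lam * bits_1mv
    cost_4mv = sum(sads_4mv) + lam * sum(bits_4mv)
    if cost_4mv < cost_1mv:
        return "4MV", cost_4mv
    return "1MV", cost_1mv


if __name__ == "__main__":
    # Complicated motion: four vectors reduce the error enough to pay for themselves.
    print(choose_mv_mode(1000, 6, [150, 160, 140, 170], [6, 6, 6, 6]))
    # Simple motion: the extra vectors cost more than they save.
    print(choose_mv_mode(400, 6, [100, 100, 95, 100], [6, 6, 6, 6]))
```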

Page 21: Video Coding Standard

Run-Level-Last Entropy Coding

3-D VLC: (LAST, RUN, LEVEL):
  LAST: 1 for the last non-zero coefficient of a block, 0 otherwise
  RUN: number of zeros before the current coefficient
  LEVEL: value of the current non-zero coefficient

No EOB as in JPEG
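
A minimal sketch of forming (LAST, RUN, LEVEL) events from coefficients that are already in scan order. It produces the symbolic triples only; mapping them to the actual variable-length codewords is not shown. The function name is hypothetical.

```python
# Sketch: convert scanned coefficients into (LAST, RUN, LEVEL) events.

def run_level_last(scanned):
    """scanned: coefficients of one block in zigzag order."""
    nonzero_positions = [i for i, value in enumerate(scanned) if value != 0]
    events = []
    previous = -1
    for i in nonzero_positions:
        last = 1 if i == nonzero_positions[-1] else 0  # flag replaces the EOB symbol
        run = i - previous - 1                         # zeros since the previous event
        events.append((last, run, scanned[i]))
        previous = i
    return events


if __name__ == "__main__":
    print(run_level_last([5, 3, 0, -2, 0, 0]))
```

Because the final event carries LAST = 1, the decoder knows the block has ended without a separate end-of-block code.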

Page 22: Video Coding Standard

H.263+ and H.263++

H.263+: Second version of H.263
  Some further optional features: Annex I to T
  Annex J: in-loop deblocking filter
H.263++: three more optional modes (2000)
  Annex V: Data partitioned slice mode
    • For enhanced resilience to transmission error

Page 23: Video Coding Standard

Introduction

Rate Control and in-loop deblocking filter H.261 MPEG-1 MPEG-2 H.263 MPEG-4 H.264

Page 24: Video Coding Standard

MPEG-4

Based on H.263
A new concept rather than an improved algorithm
Deals with a variety of multimedia content: audio, visual, image, graphic
Part 2: Visual
  Based on H.263
  Object-based coding
  Coding of animated objects
  Scalability: Fine Granular Scalability (FGS)
  Texture coding: wavelet-based
Part 10: Advanced Video Coding (H.264)

Page 25: Video Coding Standard

Video Objects (VO)

MPEG-4 treats a video sequence as a collection of video objects
  Each scene is decomposed into multiple objects
  The segmentation method is not part of the standard
Each object is specified by shape, motion, and texture.
Natural visual objects:
  Image, video, sprite (background)
Synthetic visual objects:
  Face and body
  2-D mesh
  3-D mesh
The decoder can compose different scenes by using different numbers of decoded objects

Page 26: Video Coding Standard

Scene Composition

The decoder can compose different scenes by using different numbers of decoded objects
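
Composition at the decoder can be pictured as pasting each decoded object onto a background according to its shape. The sketch below assumes a binary shape mask per object and a simple "later objects cover earlier ones" rule; both are simplifications chosen for this example, and the data layout is hypothetical.

```python
# Sketch: compose a scene from a background and a list of decoded objects.
# Assumptions: each object is (texture, binary shape mask, top, left), and
# objects later in the list are drawn over earlier ones.

def compose(background, objects):
    """Return a new frame with every visible object pixel pasted in place."""
    scene = [row[:] for row in background]
    for texture, mask, top, left in objects:
        for r, mask_row in enumerate(mask):
            for c, visible in enumerate(mask_row):
                y, x = top + r, left + c
                inside = 0 <= y < len(scene) and 0 <= x < len(scene[0])
                if visible and inside:
                    scene[y][x] = texture[r][c]
    return scene
```

Leaving objects out of the list, or supplying a different subset, yields a different scene from the same set of decoded objects.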

Page 27: Video Coding Standard

MPEG-4 Structure

[Figure: block diagram with the labels Bitstream, MUX, three A/V object decoders, Compositor, and Audio/Video scene]

Page 28: Video Coding Standard

More MPEG-4 Example

Instead of "frames": Video Object Planes (VOPs)
Shape Adaptive DCT (SA DCT)

[Figure: a video frame split into a background VOP and two further VOPs, with an alpha map and SA DCT]

Page 29: Video Coding Standard

Example

[Figure: scene with four labelled regions, Object 1 to Object 4]

Problems, comments?

Page 30: Video Coding Standard

Example

Page 31: Video Coding Standard

Status

Microsoft, RealVideo, QuickTime, ...
  But only rectangular frame based
H.264 = MPEG-4 Part 10 (2003)

Page 32: Video Coding Standard

Summary of Standards

Standard           Digitisation format   Compressed rate          Example applications
H.261              CIF/QCIF              p x 64 kbps              Video conferencing over LANs
H.263              S-QCIF/QCIF           <64 kbps                 Video conferencing over low bit rate channels
MPEG-1             SIF                   <1.5 Mbps                VHS quality video storage
MPEG-2 Low         SIF                   <4 Mbps                  VHS quality video recording
MPEG-2 Main        4:2:0                 <15 Mbps                 Digital video broadcasting
                   4:2:2                 <20 Mbps
MPEG-2 High 1440   4:2:0                 <60 Mbps                 High definition TV (4/3)
                   4:2:2                 <80 Mbps
MPEG-2 High        4:2:0                 <80 Mbps                 High definition TV (16/9)
                   4:2:0                 <100 Mbps
MPEG-4             Various               5 kbps to tens of Mbps   Versatile multimedia coding standard
H.264              Various               Various                  Various

SIF: Standard Interchange Format, 352x240 pixels at 30 Hz.

Page 33: Video Coding Standard

What's Next? - H.264

1998: Call for proposals for H.26L issued by ITU-T VCEG (Video Coding Experts Group)
  Objective: 50% bit rate savings compared to MPEG-2
  High quality video at both low and high bit rates
  More error resilience tools
Oct. 1999: First draft design
Dec. 2001: VCEG and MPEG formed the Joint Video Team (JVT)
Approved in 2003:
  ITU-T H.264 and ISO/IEC MPEG-4 Part 10 Advanced Video Coding (AVC)

Page 34: Video Coding Standard

Applications

Bit rate: 64 kbps to 240 Mbps
Broadcast over cable, satellite, DSL …
Interactive/serial storage on optical/magnetic devices, DVD …
Conversational services over network
Video on demand, streaming media over network
Multimedia messaging service over network

Three profiles: Baseline, Main, and Extended
15 levels
Four new profiles in Fidelity Range Extensions (FRExt):
  High, High 10, High 4:2:2, High 4:4:4

Page 35: Video Coding Standard

Two-Layer Structure

Video Coding Layer (VCL)
  Effectively represent the video content
Network Abstraction Layer (NAL)
  • Enables simple and effective customization of the VCL
  • Allows H.264 to be transported over different networks

[Figure: layer diagram. The Video Coding Layer produces coded macroblocks; Data Partitioning produces coded slices/partitions; the Network Abstraction Layer maps them to H.320, MP4FF, H.323/IP, MPEG-2, etc.]

Page 36: Video Coding Standard

Block Diagram

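[Figure: H.264 encoder block diagram. Blocks: input video signal, split into macroblocks (16x16 pixels), coder control, transform/scaling/quantization, scaling and inverse transform, de-blocking filter, intra-frame prediction, motion compensation, motion estimation, intra/inter switch, entropy coding, embedded decoder, output video signal. Signals sent to entropy coding: control data, quantized transform coefficients, motion data.]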

Page 37: Video Coding Standard

Video Coding Layer: Slice coding

[Figure: a picture divided into Slice 1, Slice 2, and Slice 3]

Slices can have different shapes and sizes
Each slice is self-contained
  Can be decoded without knowing data from other slices
Useful for:
  Error resilience and concealment
  Parallel processing

Page 38: Video Coding Standard

Intra-Picture Prediction

Performed in spatial domain instead of in transform domain

Two basic prediction modes:
  Intra 4x4: for areas with details
  Intra 16x16: for smooth areas
I_PCM: No prediction, raw samples are sent directly.
  • To limit the maximum number of bits for each block

Page 39: Video Coding Standard

Intra-Picture Prediction

Intra_4x4 Prediction (9 modes)
  Predict each 4 x 4 block
  Suitable for details

[Figure: the eight prediction directions, numbered 0, 1, 3, 4, 5, 6, 7, 8 (Mode 2: DC prediction); the current 4x4 block with the neighbors used for prediction; examples of Mode 0, Mode 3, and Mode 4]
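
Three of the nine Intra_4x4 modes are easy to sketch: mode 0 copies the row of pixels above the block downwards (vertical), mode 1 copies the column of pixels to its left across (horizontal), and mode 2 fills the block with the rounded mean of those eight neighbors (DC). The example assumes all eight neighbors are available and omits the six diagonal modes; the function name is hypothetical.

```python
# Sketch: Intra_4x4 prediction for modes 0 (vertical), 1 (horizontal), 2 (DC).
# Assumption: the four pixels above and the four pixels to the left all exist.

def intra4x4_predict(above, left, mode):
    """above, left: lists of 4 reconstructed neighbor samples each."""
    if mode == 0:  # vertical: every row repeats the pixels above
        return [list(above) for _ in range(4)]
    if mode == 1:  # horizontal: every row repeats its left neighbor
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:  # DC: rounded mean of the eight neighbors
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")


def residual(block, prediction):
    """Difference that would be transformed and quantized."""
    return [[b - p for b, p in zip(rb, rp)] for rb, rp in zip(block, prediction)]
```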

Page 40: Video Coding Standard

Intra-Picture Prediction cont’d

Intra_16x16 prediction (4 modes)
  Predict the entire 16 x 16 luma block
  Suitable for smooth areas

Page 41: Video Coding Standard

Inter-Picture Prediction

P macroblocks can be partitioned into smaller regions
  Up to 16 MVs
  MVs are differentially encoded.
  Needs a lot of optimization effort to decide the best mode.

[Figure: partition shapes 16 x 16, 16 x 8, 8 x 16, 8 x 8 (top row) and 8 x 4, 4 x 8, 4 x 4 (bottom row)]

Page 42: Video Coding Standard

Multiple Reference Pictures

More than one previously decoded picture can be used as a reference

Page 43: Video Coding Standard

4x4 Integer Transform

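$$H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$$

Fast implementation
Smaller size leads to less noise around edges.

$$H^{-1} = \begin{bmatrix} 1/4 & 2/10 & 1/4 & 1/10 \\ 1/4 & 1/10 & -1/4 & -2/10 \\ 1/4 & -1/10 & -1/4 & 2/10 \\ 1/4 & -2/10 & 1/4 & -1/10 \end{bmatrix}, \qquad H^{-1} \neq H^{T}$$

[Figure: 16 x 16 luma macroblock and 8x8 chroma block]

Hierarchical Transform:
  For further decorrelation
  Apply 4x4 WHT to luma DC
  Apply 2x2 WHT to chroma DC

$$H_2 = \frac{1}{2} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix}$$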
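
A short sketch of the 4x4 integer transform as a separable matrix product, Y = H X H^T, with an exact inverse built from rational numbers. In an actual H.264 codec the scaling is folded into quantization and only integer arithmetic is used; the use of `Fraction` here is purely to show that the transform is exactly invertible. The helper names are made up for this example.

```python
# Sketch: forward 4x4 integer transform Y = H X H^T and its exact inverse.
# Fractions are used only to demonstrate perfect reconstruction.
from fractions import Fraction

H = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]
ROW_NORMS = [4, 10, 4, 10]  # squared length of each row of H

# The rows of H are orthogonal, so the inverse is H^T with each column
# divided by the squared length of the matching row of H.
H_INV = [[Fraction(H[j][i], ROW_NORMS[j]) for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Plain matrix product of two 2-D lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward(x):
    """Transform a 4x4 residual block; integer input gives integer output."""
    return matmul(matmul(H, x), transpose(H))


def inverse(y):
    """Exact inverse: X = H^-1 Y (H^-1)^T."""
    return matmul(matmul(H_INV, y), transpose(H_INV))
```

A constant block maps to a single nonzero coefficient in the top-left (DC) position, which is why the DC terms of neighboring blocks are worth transforming again in the hierarchical scheme.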

Page 44: Video Coding Standard

Entropy Coding

CAVLC: Context-adaptive VLC
CABAC: Context-adaptive binary arithmetic coding
  9-14% more efficient than CAVLC

Page 45: Video Coding Standard

Context Modeling

Encode the next symbol based on context info
Collect probability distribution for each possible context: p(x | Ci)
Four types of context models:
  Use neighboring block info
  Use previous bins (b0, b1, …, bi-1) as context for bi
  Use scanning position (for transform coeff coding)
  Use accumulated number of encoded levels with specific value (for transform coeff coding)
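
The benefit of conditioning on a context can be illustrated with a toy adaptive model that keeps one pair of counts per context and charges each bit its ideal code length, -log2 p(x | Ci). This frequency-count estimator is an assumption made for the example; CABAC itself uses a table-driven probability state machine, which is not reproduced here.

```python
# Sketch: ideal code length of a bit sequence under an adaptive model that
# keeps separate statistics per context (simple counts, not CABAC's states).
import math


def adaptive_cost(bits, contexts):
    """Return the total ideal code length in bits.

    bits: sequence of 0/1 symbols; contexts: one hashable context per symbol.
    """
    counts = {}  # context -> [count of zeros, count of ones], both start at 1
    total = 0.0
    for bit, ctx in zip(bits, contexts):
        c = counts.setdefault(ctx, [1, 1])
        p = c[bit] / (c[0] + c[1])   # current estimate of p(bit | context)
        total += -math.log2(p)
        c[bit] += 1                  # adapt after coding the symbol
    return total


if __name__ == "__main__":
    bits = ([0] * 8 + [1] * 8) * 4        # long runs: each bit resembles the last
    previous = [0] + bits[:-1]            # context = previously coded bit
    print("single context :", round(adaptive_cost(bits, [0] * len(bits)), 1))
    print("previous-bit ctx:", round(adaptive_cost(bits, previous), 1))
```

On this run-heavy input the model that uses the previous bit as context needs noticeably fewer bits than the model with a single shared context.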

Page 46: Video Coding Standard

New Directions for H.264

SNR scalability
Multi-view coding (3D audio-visual coding)

Page 47: Video Coding Standard

Reference

D. Marpe, H. Schwarz, and T. Wiegand, "Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 620-636, July 2003.