Page 1: Emerging Technologies in Multimedia Communications

Emerging Technologies in Multimedia Communications

Hsueh-Ming Hang (杭學鳴), Dean of EECS College

Taipei Univ. of Technology (台北科技大學)

Page 2: Emerging Technologies in Multimedia Communications

Contents

Audio and video standards
Video standards evolution
Emerging techniques in video coding
Audio standards evolution

Feb 2009, hmhang/EECS, NTUT

Page 3: Emerging Technologies in Multimedia Communications

Video Coding Standards

Standards                 Typical rates      Applications
ITU-T (CCITT) H.261       128–384k bits/s    Videophone over ISDN
ISO MPEG-1 (11172-2)      1.2 Mbits/s        Video CD
ISO MPEG-2 (13818-2)      4–10 Mbits/s       Digital TV/HDTV
  (ITU-T H.262)           20 Mbits/s         Over air/networks
ITU-T H.263               < 64k bits/s       Videophone
ISO MPEG-4 (14496-2)      Low/high-rates     Object-oriented
ISO MPEG-7 (15938)                           Database content description
ITU-T H.263 v2            < 64k bits/s       PSTN/wireless Videophone
ITU-T H.264 (JVT, AVC)    < 40k bits/s       Net/wireless Videophone
ITU-T H.264 ext (SVC)     Multi-layer        Net/wireless streaming

ISDN: Integrated Services Digital Network

Page 4: Emerging Technologies in Multimedia Communications

MPEG Audio Standards

MPEG-1 Layer 1: 1992 (good: 256k/2ch), 1-2 chs
MPEG-1 Layer 2: 1992 (good: 192k/2ch), 1-2 chs
MPEG-1 Layer 3 (MP3): 1993 (good: 128k/2ch), 1-2 chs
MPEG-2 Layers 1,2,3: 1994, 1-5.1 chs
MPEG-2 AAC (Advanced Audio Coding): 1997 (good: 96k/2ch), 1-96 chs
MPEG-4 (v1) subpart 3 General Audio Coding, AAC: 1999 (new tools: PNS, LTP, TwinVQ), 1-96 chs
MPEG-4 Amd 1 (2003): Bandwidth extension (SBR -- Spectral Band Replication); HE-AAC, AAC+ (good: 48k)
MPEG-4 Amd 2 (2004): Parametric Audio extension (good: 24k?)
MPEG Surround (MPEG-D, 2006)

Page 5: Emerging Technologies in Multimedia Communications

Image/Video Standards

ISO/IEC JTC1 SC29 – ISO and IEC Joint Technical Committee (on Information Technology) Subcommittee 29 (Coding of audio, picture, multimedia and hypermedia)
  – Working Group (WG) 1:
      JBIG (Joint Bi-level Image Group) – 1-bit to 4/5-bit still pictures
      JPEG (Joint Photographic Experts Group) – 8-bit or more still pictures
  – WG 11: MPEG (Moving Picture Experts Group) – Motion pictures
  – WG 12: MHEG (Multimedia-Hypermedia Experts Group) – Multi/Hyper-media exchange format

Page 6: Emerging Technologies in Multimedia Communications

Standards Organizations

CCITT – Comité Consultatif International Télégraphique et Téléphonique (International Telegraph and Telephone Consultative Committee)
ITU – International Telecommunication Union
ISO – International Organization for Standardization
IEC – International Electrotechnical Commission

Page 7: Emerging Technologies in Multimedia Communications

MPEG Committee

Convener: Leonardo Chiariglione

Standards:
-- MPEG-1: done
-- MPEG-2: done
-- MPEG-4: done?!
-- MPEG-7: done?!
-- MPEG-21: done?
-- MPEG A,B,C,D,E: on-going

Page 8: Emerging Technologies in Multimedia Communications

MPEG Chair Dr. Chiariglione at NCTU (2003.12)

http://www.chiariglione.org

Page 9: Emerging Technologies in Multimedia Communications

MPEG-A,B,C

MPEG-A (ISO/IEC 23000) Multimedia Application Formats
  Part 1 Purpose for Multimedia Application formats
  Part 2 Music Player Application Format
  Part 3 Photo Player Application Format
  … Part 12
MPEG-B (ISO/IEC 23001) MPEG Systems Technologies
  Part 1 Binary MPEG format for XML
  … Part 5
MPEG-C (ISO/IEC 23002) MPEG Video Technologies
  Part 1 Accuracy specification for implementation of integer-output IDCT
  Part 2 Fixed point implementation of DCT/IDCT
  Part 3 Auxiliary Video Data Representation
  Part 4 Video Tool Library

Page 10: Emerging Technologies in Multimedia Communications

MPEG-D,E

MPEG-D (ISO/IEC 23003) MPEG Audio Technologies
  Part 1 MPEG Surround
  Part 2 Spatial Audio Object Coding
  Part 3 Unified Speech and Audio Coding
MPEG-E (ISO/IEC 23004) MPEG Multimedia Middleware
  Part 1 Architecture
  Part 2 Multimedia API
  Part 3 Component Model
  Part 4 Resource and Quality Management
  Part 5 Component Download
  Part 6 Fault Management
  Part 7 System Integrity Management
  Part 8 Reference Software and Conformance

Page 11: Emerging Technologies in Multimedia Communications

How I Got Involved?

1984: Joined AT&T Bell Labs – Visual Comm. Dept.
  H.261 video standard started
1988.1: MPEG started
1991.12: I joined NCTU; discontinued standard activities
1999.9: NCTU formed a small group to participate in the MPEG activities

Page 12: Emerging Technologies in Multimedia Communications

NCTU MPEG Activity

Tihao Chiang (蔣迪豪), C.J. Tsai (蔡淳仁), Wen Peng (彭文孝) and H.-M. Hang (杭學鳴) attend MPEG meetings regularly

Tihao Chiang: Co-editor, MPEG-4 Part 7 Optimised Reference Software (Done)

C.J. Tsai: Co-editor, MPEG-21 Part 12 Multimedia Test Bed for Resource Delivery (Done)

107 contributions (input and output documents) in the past 5 years (2002 -- 2007). [Dr. Y.-S. Tung, NTU]

Example: Call for Proposal on Scalable Video Coding (Feb. 2004) – 2 out of 14 proposals

Page 13: Emerging Technologies in Multimedia Communications

Image & Video Compression: JPEG to AVC (H.264)

Page 14: Emerging Technologies in Multimedia Communications

Progress of Image/Video Coding

H.261 (CCITT/ITU; 1984, 88, 90) – video (videoconf.)
JPEG (1986, 89, 92) – image (Digital Camera)
MPEG-1 (1988 – 92) – video (VCD)
MPEG-2 (1990 – 94) – video (DVD, DTV)
MPEG-4 part 2 (1992 – 99) – video (Internet, WL)
H.263 (1993 – 95; ver.3: 2000) – video (WL)
JPEG2000 (1996 – 2001) – image
H.264 (MPEG-4 part 10) AVC (1998 – 03) – video (WL, HD-DVD)
AVC Amd.1 (2003 – 2008) – scalable video coding

Page 15: Emerging Technologies in Multimedia Communications

Scalable Bitstream

Progressive approximation

GOP Header | Motion Info. | Image Data

300 kbps: PSNR = 32.2 dB
500 kbps: PSNR = 34.6 dB
1000 kbps: PSNR = 38.2 dB
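
The three rate/quality points above suggest how a sender might use a progressively refinable stream: pick the highest point that fits the channel and cut the stream there. The following is a minimal sketch under stated assumptions. The function names and the idea that a plain prefix of the byte stream is decodable are illustrative assumptions, not something the slides specify; only the (rate, PSNR) pairs come from the slide.

```python
# Operating points taken from the slide: (bit rate in kbps, PSNR in dB).
OPERATING_POINTS = [(300, 32.2), (500, 34.6), (1000, 38.2)]


def select_operating_point(bandwidth_kbps, points=OPERATING_POINTS):
    """Return the highest-rate point that fits the bandwidth, or None."""
    feasible = [p for p in points if p[0] <= bandwidth_kbps]
    return max(feasible, key=lambda p: p[0]) if feasible else None


def truncate(bitstream, full_rate_kbps, target_rate_kbps):
    """Keep the leading fraction of an embedded bitstream.

    Assumption: the stream is ordered so that any prefix is decodable
    (hypothetical layout used only for illustration).
    """
    keep = len(bitstream) * target_rate_kbps // full_rate_kbps
    return bitstream[:keep]


if __name__ == "__main__":
    for bandwidth in (250, 600, 2000):
        print(bandwidth, select_operating_point(bandwidth))
```

A client on a 600 kbps link would be served the 500 kbps point in this sketch; below 300 kbps no point fits and the function returns None.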

Page 16: Emerging Technologies in Multimedia Communications

Spatial/SNR Scalability

176x144, 256 kb/s
352x288, 750 kb/s
704x576, 1.5 Mb/s
704x576, 6 Mb/s

Page 17: Emerging Technologies in Multimedia Communications

Scalable Video Coding

Why scalable video coding?
  Reliably deliver video to diverse clients over heterogeneous networks using available system resources

Types of Scalability
  SNR scalability (quality)
  Spatial scalability (frame resolution)
  Temporal scalability (frame rate)
  Combined scalability
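
Combined scalability can be pictured as filtering packets along three axes at once. The sketch below assumes a hypothetical packet model in which every packet carries a spatial, a temporal and a quality layer index; the `Packet` type and `extract_substream` function are illustrative names, not part of any standard API.

```python
from collections import namedtuple

# Hypothetical packet model: each packet carries three layer identifiers.
Packet = namedtuple("Packet", "spatial temporal quality payload")


def extract_substream(packets, max_spatial, max_temporal, max_quality):
    """Keep only packets at or below the requested layer in every dimension."""
    return [
        p for p in packets
        if p.spatial <= max_spatial
        and p.temporal <= max_temporal
        and p.quality <= max_quality
    ]


# Toy stream: 2 spatial x 3 temporal x 2 quality layers, one packet each.
packets = [
    Packet(s, t, q, b"")
    for s in range(2) for t in range(3) for q in range(2)
]
base_layer = extract_substream(packets, 0, 0, 0)
full_rate_low_res = extract_substream(packets, 0, 2, 0)
```

Dropping packets this way needs no re-encoding, which is the practical appeal of a scalable stream for diverse clients.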

Page 18: Emerging Technologies in Multimedia Communications

MPEG SVC Activity

2003.10 Call-for-Proposal (CfP)
2004.2 Proposals received (14 submitted) (M10737) (NCTU submitted two proposals)
2004.3 Evaluations: two categories (M10480)
  Category 1: MCTF+Wavelet (10)
  Category 2: AVC based (incl. AVC/MCTF) (4)
2004.3/7/10 Proposals and Refinements evaluated
2005.1 AVC-based approach became Amd 1 of MPEG-4 Part 10
Standard in 2008

Page 19: Emerging Technologies in Multimedia Communications

H.264/SVC Encoder

Page 20: Emerging Technologies in Multimedia Communications

Hierarchical B Pictures

Lower temporal layers are generated first
Use reconstructed frames for prediction

picture type:  I0/P0 B3 B2 B3 B1 B3 B2 B3 I0/P0 B3 B2 B3 B1 B3 B2 B3 I0/P0
display order: 0     1  2  3  4  5  6  7  8     9  10 11 12 13 14 15 16

[two groups of pictures (GOP), 8 pictures each]
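
The layering rule for a dyadic hierarchy like this one can be written down in a few lines. The sketch below assumes a GOP size that is a power of two (8 here) and produces one valid coding order in which lower temporal layers come first; it is an illustration of the idea, not necessarily the order used by any particular encoder. Function names are hypothetical.

```python
def temporal_level(display_index, gop_size=8):
    """Temporal level of a frame in a dyadic hierarchy (0 = key picture)."""
    levels = gop_size.bit_length() - 1      # log2(gop_size) for a power of two
    pos = display_index % gop_size
    if pos == 0:
        return 0
    # The number of trailing zero bits tells how early the frame is coded.
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros


def picture_label(display_index, gop_size=8):
    """Label in the style of the slide: I0/P0 for key pictures, Bk otherwise."""
    level = temporal_level(display_index, gop_size)
    return "I0/P0" if level == 0 else "B%d" % level


def coding_order(num_gops=1, gop_size=8):
    """One valid coding order: key picture first, then lower levels first."""
    order = [0]
    for g in range(num_gops):
        start = g * gop_size
        frames = range(start + 1, start + gop_size + 1)
        order += sorted(frames, key=lambda i: (temporal_level(i, gop_size), i))
    return order


if __name__ == "__main__":
    print([picture_label(i) for i in range(9)])
    print(coding_order())
```

In this order every B picture is coded after the two lower-layer pictures that surround it in display order, so reconstructed frames are always available for prediction.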

Page 21: Emerging Technologies in Multimedia Communications

Spatial Scalability

Same concepts in MPEG-2/4 and H.263
-- Each spatial layer is coded with texture/motion refinement

[Figure: three coding blocks fed through two scaling stages, with prediction between layers, producing the scalable stream]

Page 22: Emerging Technologies in Multimedia Communications

Fast Algorithm: Intra Prediction

Base layer is coded with good quality (small Qp)
  Enh. Layer: IntraBL dominates
Base layer is coded with poor quality (large Qp)
  Enh. Layer: Intra4x4/IntraBL, Intra4x4

[Figure: probability of intra mode (%) versus QpE, two panels: QpB = 40 (QpE from 20 to 40) and QpB = 30 (QpE from 10 to 30); modes: Intra16x16, Intra8x8, Intra4x4, IntraBL]

H.-C. Lin, W.-H. Peng, and H.-M. Hang, IEEE ICIP07; IEEE ICME08.

Page 23: Emerging Technologies in Multimedia Communications

Examples of Research Topics: Interframe Wavelet, Contourlet Coding

Page 24: Emerging Technologies in Multimedia Communications

Interframe Wavelet

Algorithm proposed and improved by Profs. Jens Ohm (Aachen U.) and John Woods (RPI)
Motion compensated temporal filtering (MCTF) + wavelet zero-tree coding
Key advantage: “full” scalability – temporal + spatial + SNR
Disadvantage: long delay (storage)
High bit rates: good (~ advanced video coding)
Low rates: needs improvement
Many variations now

Page 25: Emerging Technologies in Multimedia Communications

MCTF + Wavelet

MCTF: Motion Compensated Temporal Filtering

[Figure: codec block diagram. Encoder (Input Video): Motion Estimation, MCTF (analysis), Spatial Analysis, Entropy Coding, Motion Info. Encoding, Packetizer. Decoder (Output Video): Depacketizer, Entropy Decoding, Motion Info. Decoding, Spatial Synthesis, MCTF (synthesis).]

Page 26: Emerging Technologies in Multimedia Communications

MCTF

MCTF = Motion Compensated Temporal Filtering

Page 27: Emerging Technologies in Multimedia Communications

Temporal Subband Decomposition

GOP (Group of Pictures) corresponding to a temporal level = 4 decomposition

[Figure: a video sequence passed through four successive MCTF stages; legend: temporal low-pass frame, temporal high-pass frame, frames that remain after temporal decomposition]
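
The temporal decomposition on this slide can be illustrated with a Haar lifting step applied repeatedly to the low-pass frames. This is a simplified sketch: it assumes a 16-frame GOP for the 4-level case, treats each frame as a flat list of pixel values, and omits motion compensation entirely, so it shows only the temporal subband structure and not real MCTF.

```python
def haar_temporal_step(frames):
    """Split frames into temporal low-pass and high-pass halves.

    Simplification: no motion compensation, frames are paired as (even, odd).
    """
    low, high = [], []
    for a, b in zip(frames[0::2], frames[1::2]):
        h = [y - x for x, y in zip(a, b)]              # predict step
        low.append([x + d / 2 for x, d in zip(a, h)])  # update step
        high.append(h)
    return low, high


def temporal_decompose(frames, levels):
    """Apply the step repeatedly to the low-pass frames."""
    highs = []
    low = frames
    for _ in range(levels):
        low, high = haar_temporal_step(low)
        highs.append(high)
    return low, highs


if __name__ == "__main__":
    gop = [[float(i)] * 4 for i in range(16)]   # 16 tiny frames of 4 pixels
    low, highs = temporal_decompose(gop, 4)
    print(len(low), [len(h) for h in highs])
```

After four levels a single low-pass frame remains, together with 8, 4, 2 and 1 high-pass frames; dropping the finest high-pass sets halves the frame rate each time, which is where the temporal scalability comes from.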

Page 28: Emerging Technologies in Multimedia Communications

Spatial Scalability

Wavelet decomposition provides spatial scalability (~ JPEG 2000)
Rate-control!
Bit-plane Coder
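
Why a wavelet decomposition gives spatial scalability can be seen from a single level of a 2-D transform: the LL subband is a half-resolution version of the image. The sketch below uses the Haar filter for brevity, which is an assumption made for illustration; JPEG 2000 and the interframe wavelet codecs use longer filters. Subband naming conventions vary, and here the first letter refers to the horizontal pass.

```python
def haar_2d(image):
    """One level of a 2-D Haar transform on an even-sized list-of-lists image."""
    def split(rows):
        # Filter along each row, then downsample by 2.
        low = [[(r[i] + r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in rows]
        high = [[(r[i] - r[i + 1]) / 2 for i in range(0, len(r), 2)] for r in rows]
        return low, high

    def transpose(m):
        return [list(col) for col in zip(*m)]

    low, high = split(image)                                  # horizontal pass
    ll, lh = (transpose(b) for b in split(transpose(low)))    # vertical pass
    hl, hh = (transpose(b) for b in split(transpose(high)))
    return ll, lh, hl, hh


if __name__ == "__main__":
    image = [[float(x + y) for x in range(8)] for y in range(8)]
    ll, lh, hl, hh = haar_2d(image)
    print(len(ll), len(ll[0]))   # the LL band has half the rows and columns
```

A decoder that only receives LL can display a quarter-size picture; the other three subbands refine it to full resolution.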

Page 29: Emerging Technologies in Multimedia Communications

R-D Optimization in Interframe Wavelet Video

Wavelet coding structure

[Figure: open-loop coding structure with Predictor, Motion Coder, Quantizer and Entropy coding blocks producing bits; motion-vector rate R_MV and rates R_1, R_2, R_3]

Open-loop: R-D No Feedback Path!!!
Multiple R-D operation points!!
Block-based, subband-based: inter-scaled hybrid coding!!

C.-Y. Tsai and H.-M. Hang, “rho-GGD source modeling for wavelet coefficients in image/video coding,” in IEEE ICME, 2008

Page 30: Emerging Technologies in Multimedia Communications

Contourlet Transform

Inefficiency of separable transform

Wavelet Xlet

Page 31: Emerging Technologies in Multimedia Communications

Contourlet Representation

Image decomposition using Directional Filter Bank (DFB)

[Figure: 2-D wavelet transform (horizontal and vertical filtering, each stage followed by downsampling by 2, giving the LL, LH, HL and HH subbands) and a DFB]

Page 32: Emerging Technologies in Multimedia Communications

An Example of DFB

Decomposed by a DFB with 4 levels that leads to 16 subbands

[Figure: original image; image after DFB with 4 levels]

Page 33: Emerging Technologies in Multimedia Communications

DFB-Based Coding

One example of mixed 2D wavelet decomposition

[Figure: subband layout with LL, HL, LH and HH bands; directional subbands labeled HL4-0 to HL4-3, HH4-0 to HH4-3 and LH4-0 to LH4-3]

C.-H. Hung and H.-M. Hang, “Image Coding Using Short Wavelet-based Contourlet Transform,” IEEE ICIP, 2008

Page 34: Emerging Technologies in Multimedia Communications

Audio Compression: MP3 to MPEG Surround

Page 35: Emerging Technologies in Multimedia Communications

MPEG Audio Standards

MPEG-1 Layer 1: 1992 (good: 256k/2ch), 1-2 chs, 32k – 448k bits
MPEG-1 Layer 2: 1992 (good: 192k/2ch), 1-2 chs, 32k – 384k bits
MPEG-1 Layer 3 (MP3): 1993 (good: 128k/2ch), 1-2 chs, 32k – 320k bits
MPEG-2 Layers 1,2,3: 1994, 1-5.1 chs
MPEG-2 AAC (Advanced Audio Coding): 1997 (good: 96k/2ch), 1-96 chs, 8-64k/ch
MPEG-4 (v1) subpart 3 General Audio Coding, AAC: 1999 (new tools: PNS, LTP, TwinVQ), 1-96 chs, 8-64k/ch

Page 36: Emerging Technologies in Multimedia Communications

MPEG Audio Standards (2)

MPEG-4 (v1) subpart 2: Code-Excited Linear Prediction (CELP). Speech, 3.8k – 23.8k bits
MPEG-4 (v1) subpart 2: Harmonic Vector eXcitation Coding (HVXC). Speech, 2k – 4k bits
MPEG-4 (v1) subpart 4: Structured Audio. Synthesized audio, 0k – 3k bits
MPEG-4 (v2) Parametric Audio Coding: HILN (Harmonic, Individual Line plus Noise). 6k – 16k bits/ch
MPEG-4 (v2) Fine Granule Audio: BSAC (Bit Sliced Arithmetic Coding). Fine scale granularity: 1k

Page 37: Emerging Technologies in Multimedia Communications

MPEG Audio Standards (3)

MPEG-4 Amd 1 (2003): Bandwidth extension (SBR -- Spectral Band Replication). HE-AAC, AAC+ (48k)
MPEG-4 Amd 2 (2004): Parametric Audio extension. Audio
MPEG-4 Amd 4: Audio lossless coding (ALS). Lossless
MPEG-4 Amd 5: Scalable to lossless audio coding (SLS). Scalable coding
MPEG-4 Amd 6: Lossless coding of 1 bit oversampled audio signals
ISO/IEC 23003-1 (MPEG-D): MPEG Surround (FDIS 2006.7). Spatial audio coding

Page 38: Emerging Technologies in Multimedia Communications

Spectral Band Replication: SBR

Typical audio signal spectrum

Page 39: Emerging Technologies in Multimedia Communications

SBR (2)

The high frequencies are reconstructed and adjusted
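
A toy version of "reconstructed and adjusted" might look like the sketch below: the decoder copies the coded low band upward and rescales each copied group so that its mean magnitude matches an envelope value sent as side information. This is a loose illustration under strong assumptions (magnitude bins in plain lists, one gain per group, hypothetical function name); the actual SBR tool works on a filter-bank representation and includes further steps not shown here.

```python
def sbr_reconstruct(low_band, envelope):
    """Rebuild a high band by copying the low band and rescaling it.

    low_band: magnitudes of the coded low-frequency bins.
    envelope: target mean magnitude for each high-band group (side
              information); each group spans len(low_band) // len(envelope) bins.
    """
    group = len(low_band) // len(envelope)
    high_band = []
    for k, target in enumerate(envelope):
        patch = low_band[k * group:(k + 1) * group]     # replicated bins
        mean = sum(patch) / len(patch)
        gain = target / mean if mean > 0 else 0.0        # envelope adjustment
        high_band += [gain * m for m in patch]
    return low_band + high_band


if __name__ == "__main__":
    print(sbr_reconstruct([4.0, 2.0, 2.0, 1.0], [1.5, 0.75]))
```

Only the low band and a few envelope values need to be transmitted, which is why the technique suits low bit rates.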

Page 40: Emerging Technologies in Multimedia Communications

Spatial Hearing

Three parameters describing how humans locate a sound source in the horizontal plane
  Interaural Level Difference (ILD)
  Interaural Time Difference (ITD)
  Interaural Coherence (IC)
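
The three cues can be estimated from a pair of ear signals with simple energy and cross-correlation measurements. The sketch below is one straightforward way to do so, assuming two equal-length, non-silent sample lists and a full-band analysis; the function name and the sign convention for the delay are choices made here for illustration.

```python
import math


def interaural_cues(left, right, sample_rate, max_lag):
    """Estimate ILD (dB), ITD (seconds) and IC from two equal-length signals."""
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    ild_db = 10 * math.log10(energy_l / energy_r)

    def xcorr(lag):
        # Correlation of left[n] with right[n + lag].
        n = len(left)
        return sum(left[i] * right[i + lag]
                   for i in range(max(0, -lag), min(n, n - lag)))

    norm = math.sqrt(energy_l * energy_r)
    lags = range(-max_lag, max_lag + 1)
    best = max(lags, key=lambda k: abs(xcorr(k)))
    itd = best / sample_rate          # positive: right channel lags the left
    ic = abs(xcorr(best)) / norm      # 1.0 means fully coherent channels
    return ild_db, itd, ic


if __name__ == "__main__":
    left = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    right = [0, 0, 0, 0, 0, 0.5, 1, 0.5, 0, 0]   # half amplitude, 3 samples late
    print(interaural_cues(left, right, 48000, 5))
```

For the example signals the right channel is a delayed, attenuated copy of the left, so the estimate is a level difference of about 6 dB, a delay of 3 samples and a coherence of 1.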

Page 41: Emerging Technologies in Multimedia Communications

MPEG Surround

Low-bitrate parametric coding technology for multi-channel audio signal
  64 kb/s or less
Backward compatibility to stereo equipment
Standardization
  CfP on Spatial Audio Coding (SAC) in March 2004
  Reference Model 0 (RM0) defined in 2005
  Renamed to "MPEG Surround" in 2005
  Finalized in July 2006 (ISO/IEC 23003-1)

Page 42: Emerging Technologies in Multimedia Communications

MPEG Surround Encoder

Capture the spatial image of a multi-channel audio signal
Generate a mono/stereo downmixed signal

[Figure: MPEG Surround encoder. Blocks: T/F Transform (per input channel x1, x2, ..., xN), Downmix, Spatial Parameter Estimation, F/T Transform (per downmix channel s1, s2), Audio Encoder. Outputs: compressed audio bitstream, spatial parameters.]

Page 43: Emerging Technologies in Multimedia Communications

MPEG Surround Decoder

Synthesize a multi-channel output signal
Backward compatibility

[Figure: MPEG Surround decoder. Inputs: compressed audio bitstream, spatial parameters. Blocks: Audio Decoder, T/F Transform (per downmix channel s1, s2), Surround Synthesis, F/T Transform (per output channel x1, x2, ..., xN). The audio decoder output alone serves legacy decoding.]
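
The encoder/decoder split described on the last two slides (downmix plus spatial parameters, with a legacy path that ignores the parameters) can be mimicked with a two-channel toy. The sketch assumes a single full-band level-difference parameter and time-domain processing with no transforms or audio codec, so it only conveys the structure, not the standardized algorithm; all function names are hypothetical.

```python
import math


def encode(left, right):
    """Downmix two channels to mono and estimate one level-difference parameter."""
    downmix = [(l + r) / 2 for l, r in zip(left, right)]
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    cld_db = 10 * math.log10(energy_l / energy_r)   # channel level difference
    return downmix, cld_db


def decode(downmix, cld_db):
    """Redistribute the downmix so the output level difference matches cld_db."""
    ratio = 10 ** (cld_db / 20)       # amplitude ratio left/right
    gain_r = 2 / (1 + ratio)          # gains chosen so (left + right) / 2 == downmix
    gain_l = ratio * gain_r
    return [gain_l * x for x in downmix], [gain_r * x for x in downmix]


def legacy_decode(downmix):
    """A decoder that ignores the spatial parameter still plays the downmix."""
    return downmix


if __name__ == "__main__":
    left = [1.0, -2.0, 3.0, 0.5]
    right = [0.5 * x for x in left]
    downmix, cld_db = encode(left, right)
    print(decode(downmix, cld_db))
```

When one channel is a scaled copy of the other, as in the example, the level parameter alone restores both channels; real material also needs coherence and time information, which this toy leaves out.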

Page 44: Emerging Technologies in Multimedia Communications

Multimedia Communication System

Page 45: Emerging Technologies in Multimedia Communications

Server-Client Structure

Server Network Client
