Emerging Technologies in Multimedia Communications
Dean of EECS College: Hsueh-Ming Hang (杭學鳴)
Taipei Univ. of Technology (台北科技大學)
Contents
Audio and video standards
Video standards evolution
Emerging techniques in video coding
Audio standards evolution
Feb 2009 2hmhang/EECS, NTUT
Video Coding Standards
Standards                            Typical rates             Applications
ITU-T (CCITT) H.261                  128–384k bits/s           Videophone over ISDN
ISO MPEG-1 (11172-2)                 1.2 Mbits/s               Video CD
ISO MPEG-2 (13818-2) (ITU-T H.262)   4–10 Mbits/s, 20 Mbits/s  Digital TV/HDTV over air/networks
ITU-T H.263                          < 64k bits/s              Videophone
ISO MPEG-4 (14496-2)                 Low/high-rates            Object-oriented
ISO MPEG-7 (15938)                   Database                  Content description
ITU-T H.263 v2                       < 64k bits/s              PSTN/wireless Videophone
ITU-T H.264 (JVT, AVC)               < 40k bits/s              Net/wireless Videophone
ITU-T H.264 ext (SVC)                Multi-layer               Net/wireless streaming
ISDN: Integrated Services Digital Network
MPEG Audio Standards
MPEG-1 Layer 1: 1992 (good: 256k /2ch) 1-2 chs
MPEG-1 Layer 2: 1992 (good: 192k /2ch) 1-2 chs
MPEG-1 Layer 3: 1993 (MP3) (good: 128k /2ch) 1-2 chs
MPEG-2 Layers 1,2,3: 1994 1-5.1 chs
MPEG-2 AAC: 1997 (Advanced Audio Coding) (good: 96k /2ch) 1-96 chs
MPEG-4 (v1) subpart 3 General Audio Coding, AAC: 1999
(new tools: PNS, LTP, TwinVQ)
1-96 chs
MPEG-4 Amd 1: (2003) Bandwidth extension (SBR -- Spectral Band Replication)
HE-AAC, AAC+ (good: 48k)
MPEG-4 Amd 2: (2004) Parametric Audio extension MPEG surround (MPEG-D 2006)
(good: 24k?)
Image/Video Standards
ISO/IEC JTC1 SC29 – ISO and IEC Joint Technical Committee (on Information Technology), Subcommittee 29 (Coding of audio, picture, multimedia and hypermedia)
– Working Group (WG) 1:
JBIG (Joint Bi-level Image Experts Group) – 1-bit to 4/5-bit still pictures
JPEG (Joint Photographic Experts Group) – 8-bit or more still pictures
– WG 11: MPEG (Moving Picture Experts Group) – Motion pictures
– WG 12: MHEG (Multimedia-Hypermedia Experts Group) – Multi/Hyper-media exchange format
Standards Organizations
CCITT – Comité Consultatif International Télégraphique et Téléphonique (International Telegraph and Telephone Consultative Committee)
ITU – International Telecommunication Union
ISO – International Organization for Standardization
IEC – International Electrotechnical Commission
MPEG Committee Convener: Leonardo Chiariglione
Standards:
-- MPEG-1: done
-- MPEG-2: done
-- MPEG-4: done?!
-- MPEG-7: done?!
-- MPEG-21: done?
-- MPEG A,B,C,D,E: on-going
MPEG Chair Dr. Chiariglione at NCTU (2003.12)
http://www.chiariglione.org
MPEG-A,B,C
MPEG-A (ISO/IEC 23000) Multimedia Application Formats
Part 1 Purpose for Multimedia Application formats
Part 2 Music Player Application Format
Part 3 Photo Player Application Format
… Part 12
MPEG-B (ISO/IEC 23001) MPEG Systems Technologies
Part 1 Binary MPEG format for XML
… Part 5
MPEG-C (ISO/IEC 23002) MPEG Video Technologies
Part 1 Accuracy specification for implementation of integer-output IDCT
Part 2 Fixed point implementation of DCT/IDCT
Part 3 Auxiliary Video Data Representation
Part 4 Video Tool Library
MPEG-D,E
MPEG-D (ISO/IEC 23003) MPEG Audio Technologies
Part 1 MPEG Surround
Part 2 Spatial Audio Object Coding
Part 3 Unified Speech and Audio Coding
MPEG-E (ISO/IEC 23004) MPEG Multimedia Middleware
Part 1 Architecture
Part 2 Multimedia API
Part 3 Component Model
Part 4 Resource and Quality Management
Part 5 Component Download
Part 6 Fault Management
Part 7 System Integrity Management
Part 8 Reference Software and Conformance
How I Got Involved?
1984: Joined AT&T Bell Labs – Visual Comm. Dept.; H.261 video standard started
1988.1: MPEG started
1991.12: I joined NCTU; discontinued standard activities
1999.9: NCTU formed a small group to participate in the MPEG activities
NCTU MPEG Activity
Tihao Chiang (蔣迪豪), C.J. Tsai (蔡淳仁), Wen Peng (彭文孝) and H.-M. Hang (杭學鳴) attend MPEG meetings regularly
Tihao Chiang : Co-editor, MPEG-4 Part 7 Optimised Reference Software (Done)
C.J. Tsai : Co-editor, MPEG-21 Part 12 Multimedia Test Bed for Resource Delivery (Done)
107 contributions (input and output documents) in the past 5 years (2002 -- 2007). [Dr. Y.-S. Tung, NTU]
Example: Call for Proposal on Scalable Video Coding (Feb. 2004) – NCTU submitted 2 of the 14 proposals
Image & Video Compression: JPEG, AVC (H.264)
Progress of Image/Video Coding
H.261 (CCITT/ITU; 1984, 88, 90) – video (videoconf.)
JPEG (1986, 89, 92) – image (Digital Camera)
MPEG-1 (1988 – 92) – video (VCD)
MPEG-2 (1990 – 94) – video (DVD, DTV)
MPEG-4 part 2 (1992 – 99) – video (Internet, WL)
H.263 (1993 – 95; ver.3: 2000) – video (WL)
JPEG2000 (1996 – 2001) – image
H.264 (MPEG-4 part 10) AVC (1998 – 03) – video (WL, HD-DVD)
AVC Amd.1 (2003 – 2008) – scalable video coding
Scalable Bitstream
Progressive approximation
Bitstream layout: GOP Header, Motion Info., Image Data
300 kbps: PSNR = 32.2 dB
500 kbps: PSNR = 34.6 dB
1000 kbps: PSNR = 38.2 dB
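The PSNR values quoted for each truncation point presumably follow the usual definition for 8-bit video, PSNR = 10·log10(255² / MSE). A minimal Python sketch, assuming frames are supplied as flat lists of 8-bit samples; the function name `psnr` and the sample values are illustrative and not taken from the slides:

```python
import math

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)

# Hypothetical 4-sample "frames": the larger error gives the lower PSNR.
original = [100, 120, 140, 160]
coarse = [96, 124, 136, 164]   # error of 4 on every sample
fine = [99, 121, 139, 161]     # error of 1 on every sample
```

Decoding more of a scalable bitstream reduces the error, so the PSNR rises with the bit rate, as in the three operating points listed on the slide.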
Spatial/SNR Scalability
176x144, 256 kb/s
352x288, 750 kb/s
704x576, 1.5 Mb/s
704x576, 6 Mb/s
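One way to picture how a server or gateway could use these four operating points is to pick the highest-rate point that fits both the client's bandwidth and its display. The sketch below is only an illustration: the names `OperatingPoint`, `POINTS`, and `select_point` are hypothetical, and the selection rule is an assumption rather than anything specified in the slides.

```python
from typing import NamedTuple, Optional

class OperatingPoint(NamedTuple):
    width: int
    height: int
    kbps: int

# The four resolution/rate pairs listed on the slide, lowest first.
POINTS = [
    OperatingPoint(176, 144, 256),
    OperatingPoint(352, 288, 750),
    OperatingPoint(704, 576, 1500),
    OperatingPoint(704, 576, 6000),
]

def select_point(bandwidth_kbps: int, max_width: int) -> Optional[OperatingPoint]:
    """Return the highest-rate point that fits the client, or None if none fits."""
    feasible = [p for p in POINTS
                if p.kbps <= bandwidth_kbps and p.width <= max_width]
    return max(feasible, key=lambda p: p.kbps) if feasible else None
```

The two 704x576 points differ only in rate, which is the SNR (quality) dimension; the first three differ in resolution, which is the spatial dimension.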
Scalable Video Coding
Why scalable video coding?
Reliably deliver video to diverse clients over heterogeneous networks using available system resources
Types of Scalability
SNR scalability (quality)
Spatial scalability (frame resolution)
Temporal scalability (frame rate)
Combined scalability
MPEG SVC Activity
2003.10 Call-for-Proposal (CfP)
2004.2 Proposals received (14 submitted) (M10737) (NCTU submitted two proposals)
2004.3 Evaluations: two categories (M10480)
Category 1: MCTF+Wavelet (10)
Category 2: AVC based (incl. AVC/MCTF) (4)
2004.3/7/10 Proposals and Refinements evaluated
2005.1 SVC became Amd 1 of MPEG-4 Part 10 (AVC); standard in 2008
H.264/SVC Encoder
Hierarchical B Pictures
Lower temporal layers are generated first
Use reconstructed frames for prediction
[Figure: hierarchical B-picture structure over two groups of pictures (GOP) of 8 frames, display order 0 to 16, with key pictures I0/P0 and B pictures at temporal levels B1, B2, B3]
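The dyadic structure in the figure can be sketched in a few lines of Python. This assumes a GOP size that is a power of two and the usual arrangement where the middle picture of the GOP is B1, the quarter positions are B2, and so on; the function names `temporal_level` and `coding_order` are illustrative, and a real encoder may order pictures within a GOP differently.

```python
def temporal_level(poc: int, gop_size: int = 8) -> int:
    """Temporal level of a picture in a dyadic hierarchical-B GOP.

    Level 0 is a key picture (I0/P0); level k is a Bk picture.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    pos = poc % gop_size
    if pos == 0:
        return 0
    level = gop_size.bit_length() - 1   # log2(gop_size), the deepest level
    while pos % 2 == 0:                 # each factor of 2 moves one level up
        pos //= 2
        level -= 1
    return level

def coding_order(gop_size: int = 8) -> list:
    """Display positions 1..gop_size, sorted so lower levels are coded first."""
    return sorted(range(1, gop_size + 1),
                  key=lambda p: (temporal_level(p, gop_size), p))
```

For a GOP of 8 this gives levels 0, 3, 2, 3, 1, 3, 2, 3, 0 for display positions 0 to 8, so dropping all level-3 pictures halves the frame rate without breaking any prediction reference.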
Spatial Scalability
Same concepts in MPEG-2/4 and H.263 -- Each spatial layer is coded with texture/motion refinement
[Figure: scaling, per-layer coding with prediction between layers, and a combined scalable stream]
Fast Algorithm: Intra Prediction
Base layer is coded with good quality (small Qp)
Enh. Layer: IntraBL dominates
Base layer is coded with poor quality (large Qp)
Enh. Layer: Intra4x4/IntraBL Intra4x4
[Figure: two plots of Prob. of Intra Mode (%) versus QpE, one with QpB = 40 (QpE from 20 to 40) and one with QpB = 30 (QpE from 10 to 30); curves for Intra16x16, Intra8x8, Intra4x4, IntraBL]
H.-C. Lin, W.-H. Peng, and H.-M. Hang, IEEE ICIP07; IEEE ICME08.
Examples of Research Topics: Interframe Wavelet, Contourlet Coding
Interframe Wavelet
Algorithm proposed and improved by Profs. Jens Ohm (Aachen U.) and John Woods (RPI)
Motion compensated temporal filtering (MCTF) + wavelet zero-tree coding
Key advantage: “full” scalability – temporal + spatial + SNR
Disadvantage: long delay (storage)
High bit rates: good (~ advanced video coding)
Low rates: needs improvement
Many variations now
MCTF + Wavelet
MCTF: Motion Compensated Temporal Filtering
[Figure: codec block diagram]
Encoder: Input Video; Motion Estimation; MCTF (analysis); Spatial Analysis; Entropy Coding; Motion Info. Encoding; Packetizer
Decoder: Depacketizer; Entropy Decoding; Motion Info. Decoding; Spatial Synthesis; MCTF (synthesis); Output Video
MCTF
MCTF = Motion Compensated Temporal Filtering
[Figure: MCTF over frames 1 to 5]
Temporal Subband Decomposition
GOP (Group of Pictures) corresponding to a temporal level = 4 decomposition
[Figure: a video sequence passed through four successive MCTF stages; legend: temporal low-pass frame, temporal high-pass frame, frames that remain after temporal decomposition]
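The temporal decomposition can be illustrated with a plain Haar filter along the time axis. The sketch below deliberately omits motion compensation, so it is not MCTF itself, only the subband structure that MCTF builds on; frames are flat lists of samples, and all function names are hypothetical.

```python
import math

def haar_pair(a, b):
    """One temporal Haar step on two frames (no motion compensation)."""
    s = math.sqrt(2.0)
    low = [(x + y) / s for x, y in zip(a, b)]
    high = [(y - x) / s for x, y in zip(a, b)]
    return low, high

def inverse_haar_pair(low, high):
    """Invert haar_pair, returning the two original frames."""
    s = math.sqrt(2.0)
    a = [(l - h) / s for l, h in zip(low, high)]
    b = [(l + h) / s for l, h in zip(low, high)]
    return a, b

def temporal_decompose(frames, levels):
    """Return (remaining low-pass frames, high-pass frames per level)."""
    if len(frames) % (1 << levels):
        raise ValueError("GOP length must be a multiple of 2**levels")
    low, high_by_level = list(frames), []
    for _ in range(levels):
        pairs = [haar_pair(low[i], low[i + 1]) for i in range(0, len(low), 2)]
        low = [p[0] for p in pairs]
        high_by_level.append([p[1] for p in pairs])
    return low, high_by_level

def temporal_reconstruct(low, high_by_level):
    """Rebuild the frame sequence from the output of temporal_decompose."""
    for highs in reversed(high_by_level):
        frames = []
        for l, h in zip(low, highs):
            frames.extend(inverse_haar_pair(l, h))
        low = frames
    return low
```

With a 16-frame GOP and four levels, a single low-pass frame remains alongside 8, 4, 2, and 1 high-pass frames. A decoder that skips the first (finest) set of high-pass frames obtains a half-frame-rate sequence, which is where temporal scalability comes from.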
Spatial Scalability
Wavelet decomposition provides spatial scalability (~ JPEG 2000)
Rate-control!
Bit-plane Coder
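To see why a wavelet decomposition gives spatial scalability, consider one level of a 2-D Haar transform: the LL subband is a half-resolution version of the image, and the other three subbands carry the detail needed to restore full resolution. The sketch below is a minimal illustration on a list-of-rows image; the function name `haar_2d` and the averaging normalisation are assumptions, and subband naming conventions (HL versus LH) vary between texts.

```python
def haar_2d(image):
    """One level of a 2-D Haar transform on a list-of-rows image.

    Returns the LL, HL, LH, HH subbands, each half the width and height.
    Uses averaging/differencing so that LL is the mean of each 2x2 block.
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("image dimensions must be even")
    ll, hl, lh, hh = [], [], [], []
    for r in range(0, h, 2):
        row_ll, row_hl, row_lh, row_hh = [], [], [], []
        for c in range(0, w, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 4.0)   # low-pass in both directions
            row_hl.append((a - b + d - e) / 4.0)   # horizontal detail
            row_lh.append((a + b - d - e) / 4.0)   # vertical detail
            row_hh.append((a - b - d + e) / 4.0)   # diagonal detail
        ll.append(row_ll)
        hl.append(row_hl)
        lh.append(row_lh)
        hh.append(row_hh)
    return ll, hl, lh, hh
```

A client that only needs the lower resolution decodes LL alone; applying the same transform again to LL yields the next lower resolution.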
R-D Optimization in Interframe Wavelet Video
Wavelet coding structure
[Figure: open-loop coding structure with Predictor, Motion Coder, Quantizer, and Entropy coding producing bits at rates R_MV, R_1, R_2, R_3]
R-D: No Feedback Path!!!
Multiple R-D operation points!!
Block-based, subband-based: inter-scaled hybrid coding!!
C.-Y. Tsai and H.-M. Hang, “rho-GGD source modeling for wavelet coefficients in image/video coding,” in IEEE ICME, 2008
Contourlet Transform
Inefficiency of separable transform
[Figure: Wavelet versus Xlet]
Contourlet Representation
Image decomposition using Directional Filter Bank (DFB)
[Figure: 2-D wavelet transform (horizontal and vertical filtering with ↓2 downsampling into LL, LH, HL, HH subbands) combined with a DFB]
An Example of DFB
Decomposed by a DFB with 4 levels that leads to 16 subbands
[Figure: original image; image after DFB with 4 levels]
DFB-Based Coding
One example of mixed 2D wavelet decomposition
[Figure: LL, HL, LH, HH subbands, with the HL, LH, and HH subbands each split further into directional subbands 4-0 to 4-3 (HL4-0 to HL4-3, LH4-0 to LH4-3, HH4-0 to HH4-3)]
C.-H. Hung and H.-M. Hang, “Image Coding Using Short Wavelet-based Contourlet Transform,” IEEE ICIP, 2008
Audio Compression: MP3, MPEG Surround
MPEG Audio Standards
MPEG-1 Layer 1: 1992 (good: 256k /2ch), 1-2 chs, 32k – 448k bits/s
MPEG-1 Layer 2: 1992 (good: 192k /2ch), 1-2 chs, 32k – 384k bits/s
MPEG-1 Layer 3: 1993 (MP3) (good: 128k /2ch), 1-2 chs, 32k – 320k bits/s
MPEG-2 Layers 1,2,3: 1994, 1-5.1 chs
MPEG-2 AAC: 1997 (Advanced Audio Coding) (good: 96k /2ch), 1-96 chs, 8-64k/ch
MPEG-4 (v1) subpart 3 General Audio Coding, AAC: 1999 (new tools: PNS, LTP, TwinVQ), 1-96 chs, 8-64k/ch
MPEG Audio Standards (2)
MPEG-4 (v1) subpart 2: Code-Excited Linear Prediction (CELP); speech, 3.8k – 23.8k bits/s
MPEG-4 (v1) subpart 2: Harmonic Vector eXcitation Coding (HVXC); speech, 2k – 4k bits/s
MPEG-4 (v1) subpart 4: Structured Audio; synthesized audio, 0k – 3k bits/s
MPEG-4 (v2) Parametric Audio Coding: HILN (Harmonic, Individual Line plus Noise); 6k – 16k bits/s per channel
MPEG-4 (v2) Fine Granule Audio: BSAC (Bit Sliced Arithmetic Coding); fine scale granularity: 1k
MPEG Audio Standards (3)
MPEG-4 Amd 1: (2003) Bandwidth extension (SBR -- Spectral Band Replication); HE-AAC, AAC+ (48k)
MPEG-4 Amd 2: (2004) Parametric Audio extension; audio
MPEG-4 Amd 4: Audio lossless coding (ALS); lossless
MPEG-4 Amd 5: Scalable to lossless audio coding (SLS); scalable coding
MPEG-4 Amd 6: Lossless coding of 1 bit oversampled audio signals
ISO/IEC 23003-1 (MPEG-D): MPEG Surround (FDIS 2006.7); spatial audio coding
Spectral Band Replication: SBR
Typical audio signal spectrum
SBR (2)
The high frequencies are reconstructed and adjusted
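The idea of reconstructing and then adjusting the high band can be shown with a toy example on a magnitude spectrum. This is only a sketch of the principle: the function name `sbr_reconstruct`, the one-energy-per-patch side information, and the plain copy-up patching are all assumptions made for illustration. The standardized SBR tool works on QMF subband samples and includes further adjustments that are not modelled here.

```python
import math

def sbr_reconstruct(low_band, target_energies):
    """Toy SBR-style high-band reconstruction.

    low_band: spectral magnitudes decoded by the core codec.
    target_energies: one energy value per replicated patch (side information).
    Returns the full spectrum: the low band followed by the adjusted patches.
    """
    patch_energy = sum(x * x for x in low_band)
    spectrum = list(low_band)
    for target in target_energies:
        gain = math.sqrt(target / patch_energy) if patch_energy > 0 else 0.0
        spectrum.extend(gain * x for x in low_band)   # replicate, then adjust
    return spectrum
```

Only the low band and a handful of energy values need to be transmitted, which is why the core codec can run at a much lower bit rate.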
Spatial Hearing
Three parameters describe how humans locate a sound source in the horizontal plane:
Interaural Level Difference (ILD)
Interaural Time Difference (ITD)
Interaural Coherence (IC)
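These three cues can be estimated from a pair of channel signals. The sketch below is a simplified full-band, single-frame illustration and assumes equal-length channels; the function name `spatial_cues`, the sign convention (a positive ITD means the right channel lags the left), and the use of the cross-correlation peak for both ITD and IC are assumptions. Parametric coders typically compute such cues per frequency band and time frame.

```python
import math

def spatial_cues(left, right, sample_rate, max_lag):
    """Estimate ILD (dB), ITD (seconds) and IC for one frame of a channel pair."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    if e_left == 0 or e_right == 0:
        raise ValueError("both channels need non-zero energy")
    ild_db = 10.0 * math.log10(e_left / e_right)

    norm = math.sqrt(e_left * e_right)
    n = len(left)
    best_lag, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Normalised cross-correlation of left[i] with right[i + lag].
        corr = sum(left[i] * right[i + lag]
                   for i in range(max(0, -lag), min(n, n - lag))) / norm
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return ild_db, best_lag / sample_rate, best_corr

# Hypothetical test signal: the right channel is the left one,
# delayed by 2 samples and at half the amplitude.
left = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0]
ild, itd, ic = spatial_cues(left, right, sample_rate=8000, max_lag=3)
```

For this signal the level difference is about 6 dB, the time difference is 2 samples, and the coherence is 1 because the two channels are scaled, shifted copies of each other.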
MPEG Surround
Low-bitrate parametric coding technology for multi-channel audio signals: 64 kb/s or less
Backward compatibility with stereo equipment
Standardization
CfP on Spatial Audio Coding (SAC) in March 2004
Reference Model 0 (RM0) defined in 2005
Renamed to “MPEG Surround” in 2005
Finalized in July 2006 (ISO/IEC 23003-1)
MPEG Surround Encoder
Capture the spatial image of a multi-channel audio signal
Generate a mono/stereo downmixed signal
[Figure: MPEG Surround Encoder. Input channels x1, x2, …, xN pass through T/F transforms to the Downmix and Spatial Parameter Estimation blocks; the downmix channels s1, s2 pass through F/T transforms to the Audio Encoder, which produces the compressed audio bitstream alongside the spatial parameters]
MPEG Surround Decoder
Synthesize multi-channel output signal
Backward compatibility
[Figure: MPEG Surround Decoder. The compressed audio bitstream goes to the Audio Decoder, giving downmix channels s1, s2 (usable directly for legacy decoding); T/F transforms feed Surround Synthesis, which applies the spatial parameters, and F/T transforms produce the output channels x1, x2, …, xN]
Multimedia Communication System
Server-Client Structure
Server Network Client