Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 1
Audio Coding Quantization and Coding Methods
Prof. Dr.-Ing. Karlheinz
Brandenburg [email protected]
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 2
Quantization and Coding Methods (1)
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 3
Quantization and Coding Methods (2)•
Objective:–
„Good“
representation of spectral data
•
Compactness (low bit rate)•
Smallest possible perceptible distortion (high subjective quality)
•
Overview:–
Quantization
–
Noiseless Coding–
Joint Quantization/ Coding Techniques
–
Encoding Strategies
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 4
Quantization (1)•
Basics:–
Data reduction by removing irrelevance
–
Explicit control of quantization distortion according to time/frequency-dependent masking threshold (perceptual coder)
–
High variability in local SNR •
(e.g. 0db ... >30db)–
Most popular case: Scalar quantization
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 5
Quantization
(2)
x
f(x) = x
• Simpler:- Uniform quantization
• MPEG-1/2 Layers I and II, ATRAC, AC-3
- Average distortion independent on size of coefficients
more precise control of quantization noise:
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 6
Quantization (3)•
More sophisticated:–
Non-uniform quantization•
MPEG-1/2 Layer III, MPEG-2/4 AAC:
–
More distortion for larger coefficients (subband signal x)
⎟⎟⎠
⎞⎜⎜⎝
⎛
Δ=
q
ii
xroundS75.0
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 7
Quantization
(4)
x
f(x) = x1/0.75
Decoder
expanding
x
Δq
f(x) = x0.75
Encoder
compression
resulting
step
sizeafter
decoding
Non-uniform quantization:
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 8
Quantization (5)•
Approach #1:–
Group of spectral coefficients (subband
signal
x) is normalized by means of a common multiplier („scalefactor“)
–
Side information: Scalefactor
and quantizer resolution (bits/sample)
–
„Block companding“
/ „Block floating point“–
MPEG-1/2 Layers I and II, ATRAC, AC-3
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 9
Quantization (6)•
Approach #2:–
Group of spectral coefficients is scaled („scalefactor“) and subsequently quantized by a fixed quantizer
–
Side information: Scalefactor–
Used with entropy coding
–
Quant. Precision controlled by scalefactor–
MPEG-1/2 Layers III, MPEG-2/4 AAC
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 10
Shaping of the noise according to psycho-acoustic model. Several subbands
in each scalefactor
band,
with same step size.
SignalMasking ThresholdEach step: scalefactor band
with scalefactor
Approximation of Masking Threshold by a step function → fewer bits for parameterization
fscalefactor
bandsubbands
scale-factor
Scalefactor
Bands and Masking
Threshold
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 11
Effect
of non-uniform
Quantizationin each
scalefactor
band with
SignalMasking ThresholdEach step: scalefactor band
with scalefactor
Additional noise shaping with non-uniform quantizer (x0.75):
quantization noise follows signal x
Approximation of Masking Threshold by a step function → fewer bits for parameterization
fscalefactor
bandsubbands
scale-factor
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 12
Different way to approximate the Masking Threshold, use polynomial approximation instead of step-function of scale factor bands.
Goal: Approximation of Masking Threshold as close as possible to reduce bit demand
Used in Ultra Low Delay Coder, predictive coder, no subbands. Is frequency response of linear filter.
SignalMasking ThresholdPolynomial Approximation (example: 12 coefficients)
Polynomial
Approximation of Masking
Threshold
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 13
Noiseless Coding (1)•
Even more efficient:–
Higher gain by using entropy coding•
Huffman coding (MPEG-1/2 Layer III, MPEG-
2/4 AAC)•
Arithmetic coding (MPEG-4 version 2 BSAC)
ScalefactorBands
3CodebookNumbers8 0 6 2
Sections
0 1 2 3 4 5 6 7 8 9 10
choose
„best“
Huffman code
from
codebook
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 14
Noiseless
Coding
(2)
00: highest
probability
reason
for
compression
gain
with
entropy
coding
shorter
codewords
for
the
more
likely
symbols
(e.g. 0)
shortest
codeword
for
0
→
→
→
Histogram
xq
p~e-|x|/c
typical probability distribution of audio samples: Laplace distribution
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 15
Noiseless Coding (3)•
Example for state-of-the art coding kernel (MPEG-
2/4 AAC)–
Multi-dimensional (2 or 4-dim.) entropy coding -> exploiting joint statistics of vector comp.
–
Huffman coding, several codebooks–
Signal information part of codebook or separate escape mechanism for large quant. values
–
Choice of Huffman codebook for arbitrary groups of scalefactor
bands („sections“)
–
can obtain less than 1 bit per sample with 2 or 4 dimensional vectors
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 16
MPEG Audio Layer-3: Huffman Code Tables(1)
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 17
MPEG Audio Layer-3: Huffman Code Tables(2)
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 18
MPEG Audio Layer-3: Huffman Code Tables(3)
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 19
MPEG Audio Layer-3: Huffman Code Tables(4)
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 20
Joint Quantization / Coding Techniques (1)•
Vector quantization–
Excellent coding efficiency at very low rates (<< 1 bit / sample), but
–
Perceptual control of distortion difficult→control only total distortion for “group”
•
Application range:–
Used for intermediate quality / very low bit rate coding•
e.g. MPEG-4 TwinVQ
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 21
Joint Quantization / Coding Techniques (2)
quantization
step
size
resulting quantization errorfor 2 subbands
x1
x2
subband
value
2
subband
value
1
x1
also possible for higher dimensions→ even higher gain
denser packing of quantization„spheres“
→
reduced average quantization
errors
x2
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 22
Joint Quantization / Coding Techniques (3)
WeightedVQ
WeightedVQ
WeightedVQ
WeightedVQ
Input Vector
Sub-Vectors(Interleaved)
Interleaving
Perceptual Weights
VectorQuantization
Index Values
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 23
Encoding Strategies (1)•
Today‘s coding schemes provide large degree of flexibility–
Quantization noise profile over frequency
–
Trade-off audio bandwidth vs. overall distortion–
Bit rate & coding mode
–
Usage of optional tools (Temporal Noise Shaping, prediction)
•
Encoding strategy „intelligent“
part of encoding; determines quality
•
Arena for specific know-how („secrets of audio coding“)
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 24
Encoding Strategies (2)•
Signal-to-Noise Ratio (SNR)–
Quadratic distortion metric
–
Does not allow prediction about quality !
•
Noise-to-mask Ratio (NMR)–
Ratio of distortion with respect to masking threshold
–
Determines audibility of distortions
•
Signal-to-Mask Ration (SMR)–
Relation between signal and masking threshold
–
Gives indication of bit demandCoder Bands
[dB]
Signalenergy
Maskingthreshold
Distortion(Noise)
SNR
SMR
NMR
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 25
Constant Quality Coding•
Goal:–
Coding with a pre-selected constant quality
•
Procedure:–
Determine time & frequency dependent masking threshold
–
Adjust quantizer step size to meet target distortion
Constant quality -> Variable bit rate
•
Applications:–
Storage media / transmission channels supporting variable rate e.g.•
Storage on digital media; music on internet
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 26
Constant Rate Coding (1)•
Paradigm 1 „Bit Allocation“–
Used typically in coding schemes with block companding
–
Direct translation between bit rate and local SNR available
Set initial SNRfor each band
Calc. bits/sample
#used_bits <<#available_bits ?
AdjustSNR
No
Yes
Perform Quantization
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 27
Constant Rate Coding (2)•
Paradigm 2 „Noise Allocation“–
Used in codecs
with entropy encoding
Set equal
step
sizefor
each
band initially
Rate Loop:Change global step
size
until#used_bits==available_bits
Adjust
step
size
fordeficient
bands
Coding
resultsatisfactory
?
Better
resultpossible
?
Yes
No
No
Yes
Quantization
info
/ step
sizes
#available_bits
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 28
Constrained Variable Bit Rate Coding (Bit Reservoir)•
Constant quality vs. Constant bit rate–
Constant quality -> variable bit rate (VBR)
–
Constant bit rate -> variable quality–
? How to achieve both ?
•
Compromise:–
Bit reservoir = constrained VBR Coding
–
Limit accumulated deviation of bit consumption from average („bit reservoir size“)
–
Additional delay–
Used in MPEG-1/2 Layer III, MPEG-2/4 AAC
-300-200-100
0100200
300400500600700
100 150 200 250 300 350 400 450 500 550 600frame number
Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 29
Quantization / Coding: Codec OverviewFi lt e rb a n kch a nn els
Q u a nt iz a t io n Co di n g
MP EG- 1 /2 La y e r I / I I 32 u n i f o r m b lo ckco m pa n d ing
MP EG- 1 /2 Lay er I I I 576 / 1 92 n o n -u n if o rm H u f f m anco d in g
MP EG- 2 AA C 1024 / 128 n o n -u n if o rm H u f f m anco d in g
MP EG- 4 Tw inV Q 1024 / 128(960/1 20)
VQ VQ
A C-3 256 / 1 28 u n i f o r m b lo ckco m pa n d ing
AT RA C 96 .. 5 12 u n i f o r m b lo ckco m pa n d ing