Speech Processing (Vocoders)
Source: ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...
Vocoders
The Channel Vocoder (analyzer):
The channel vocoder employs a bank of bandpass filters, each having a bandwidth between 100 Hz and 300 Hz.
Typically, 16-20 linear-phase FIR filters are used.
The output of each filter is rectified and lowpass filtered. The bandwidth of the lowpass filter is selected to match the time variations in the characteristics of the vocal tract.
In addition to the measurement of the spectral magnitudes, a voicing detector and a pitch estimator are included in the speech analysis.
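The rectify-and-smooth step of one analyzer channel can be sketched as follows. This is a minimal illustration, not the slide author's implementation: the one-pole lowpass, the 25 Hz cutoff, the 8 kHz sampling rate and the function name `channel_envelope` are all assumptions, and a pure tone stands in for the output of one bandpass filter.

```python
import math

def channel_envelope(band_signal, fs=8000, cutoff_hz=25.0, frame_rate=50):
    """Rectify one bandpass-filter output, smooth it, and keep one value per frame."""
    # Assumed one-pole lowpass: y[n] = (1 - a) * |x[n]| + a * y[n-1]
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    step = fs // frame_rate            # 160 samples per 20-ms frame at 8 kHz
    y = 0.0
    envelope = []
    for n, x in enumerate(band_signal):
        y = (1.0 - a) * abs(x) + a * y   # full-wave rectifier followed by lowpass
        if (n + 1) % step == 0:
            envelope.append(y)           # spectral magnitude sample for this frame
    return envelope

# A 1 kHz tone of amplitude 1.0 standing in for one bandpass-filter output
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
env = channel_envelope(tone, fs)
```

One second of input yields 50 magnitude samples, which is the quantity each channel passes on for coding.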
The Channel Vocoder (analyzer block diagram):
[Block diagram: the input S(n) feeds a bank of parallel channels, each consisting of a bandpass filter, a rectifier, a lowpass filter and an A/D converter. S(n) also feeds a voicing detector and a pitch detector. All outputs go to an encoder, whose output is sent to the channel.]
The Channel Vocoder (synthesizer):
16-20 linear-phase FIR filters
Covering 0-4 kHz
Each having a bandwidth between 100-300 Hz
20-ms frames, i.e. the spectral magnitudes are updated at 50 Hz
LPF bandwidth: 20-25 Hz
Sampling rate of the output of the filters: 50 Hz
The Channel Vocoder (synthesizer):
Bit rate:
1 bit for voicing detector
6 bits for pitch period
For 16 channels, each coded with 3-4 bits,
updated 50 times per second
Then the total bit rate is 2400-3200 bps
Further reductions to 1200 bps can be
achieved by exploiting frequency correlations
of the spectrum magnitude
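The bit-rate arithmetic above can be checked with a few lines. This is only a sketch with hypothetical names. It shows that the quoted 2400-3200 bps corresponds to the channel magnitudes alone, while the voicing and pitch bits add a further 350 bps at 50 frames per second.

```python
FRAME_RATE = 50      # updates per second (20-ms frames)
CHANNELS = 16
VOICING_BITS = 1
PITCH_BITS = 6

def channel_vocoder_rates(bits_per_channel):
    """Return (magnitude bit rate, side-information bit rate) in bits per second."""
    magnitude_bps = CHANNELS * bits_per_channel * FRAME_RATE
    side_bps = (VOICING_BITS + PITCH_BITS) * FRAME_RATE
    return magnitude_bps, side_bps

low = channel_vocoder_rates(3)    # 3 bits per channel
high = channel_vocoder_rates(4)   # 4 bits per channel
```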
The Channel Vocoder (synthesizer):
At the receiver the signal samples are passed through D/A converters.
The outputs of the D/As are multiplied by the voiced or unvoiced signal sources.
The resulting signals are passed through bandpass filters.
The outputs of the bandpass filters are summed to form the synthesized speech signal.
The Channel Vocoder (synthesizer block diagram):
[Block diagram: the signal from the channel enters a decoder. The decoded channel magnitudes pass through D/A converters. The voicing information drives a switch that selects between a pulse generator (controlled by the pitch period) and a random noise generator. Each D/A output is multiplied by the selected excitation and passed through a bandpass filter, and the filter outputs are summed (∑) to give the output speech.]
The Phase Vocoder :
The phase vocoder is similar to the channel vocoder.
However, instead of estimating the pitch, the phase vocoder estimates the phase derivative at the output of each filter.
By coding and transmitting the phase derivative, this vocoder destroys the phase information.
The Phase Vocoder (analyzer block diagram, kth channel):
[Block diagram: the input S(n) is multiplied by cos ω_k n and by sin ω_k n, and each product is lowpass filtered to give a_k(n) and b_k(n). Differentiators and a block that computes the short-term magnitude and phase derivative follow. The short-term magnitude and the short-term phase derivative are decimated and passed to the encoder, whose output goes to the channel.]
The Phase Vocoder (synthesizer block diagram, kth channel):
[Block diagram: the signal from the channel enters a decoder, which outputs the decimated short-term amplitude and the decimated short-term phase derivative. Both are interpolated, and the phase derivative is integrated. Cos and sin blocks of the resulting phase are combined with cos ω_k n and sin ω_k n, and the results are summed (∑).]
The Phase Vocoder :
LPF bandwidth: 50 Hz
Demodulation separation: 100 Hz
Number of filters: 25-30
Sampling rate of spectrum magnitude and phase
derivative: 50-60 samples per second
Spectral magnitude is coded using PCM or
DPCM
Phase derivative is coded linearly using 2-3 bits
The resulting bit rate is 7200 bps
The Formant Vocoder :
The formant vocoder can be viewed as a
type of channel vocoder that estimates the
first three or four formants in a segment of
speech.
It is this information plus the pitch period
that is encoded and transmitted to the
receiver.
The Formant Vocoder : Example of formant:
(a) : The spectrogram of the utterance “day one” showing the pitch and the harmonic structure of speech.
(b) : A zoomed spectrogram of the fundamental and the second harmonic.
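The Formant Vocoder (analyzer block diagram):
[Block diagram: the input speech feeds three formant blocks (F1, F2, F3) and a pitch and V/U block. Their outputs F1, B1, F2, B2, F3, B3, V/U and F0 are passed to the coding block (labelled "Decoder" on the slide).]
Fk : The frequency of the kth formant
Bk : The bandwidth of the kth formant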
The Formant Vocoder (synthesizer block diagram):
[Block diagram: V/U and F0 control the excitation signal, which drives three formant filters set by (F1, B1), (F2, B2) and (F3, B3). The filter outputs are summed (∑).]
Linear Predictive Coding :
The objective of LP analysis is to estimate the parameters of an all-pole model of the vocal tract.
Several methods have been devised for generating the excitation sequence for speech synthesis.
Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation signal generated for speech synthesis.
LPC 10 :
This method is called LPC-10 because 10 coefficients are typically employed.
LPC-10 partitions the speech into 180-sample frames.
Pitch and voicing decisions are determined by using the AMDF and zero-crossing measures.
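The two measures named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LPC-10 reference code: the lag search window of 20 to 90 samples, the synthetic test frame and all function names are hypothetical.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def amdf(frame, lag):
    """Average magnitude difference between the frame and itself delayed by lag."""
    n = len(frame) - lag
    return sum(abs(frame[i + lag] - frame[i]) for i in range(n)) / n

def amdf_pitch(frame, min_lag=20, max_lag=90):
    """Return the lag with the smallest AMDF inside an assumed search window."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))

# Synthetic "voiced" frame: 180 samples with a 50-sample period (160 Hz at 8 kHz)
frame = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(4 * math.pi * n / 50)
         for n in range(180)]
pitch_lag = amdf_pitch(frame)
zcr = zero_crossing_rate(frame)
```

A periodic frame gives a deep AMDF minimum at its period and a low zero-crossing rate, which is the pattern a voiced decision would look for. A noise-like frame would show no clear minimum and a high zero-crossing rate.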
A General Discrete-Time Model For Speech Production
[Block diagram: in the voiced (V) branch, a discrete-time impulse generator controlled by the pitch period is scaled by a gain and shaped by the glottal filter G(z) to give the voiced volume velocity U(n). In the unvoiced (U) branch, an uncorrelated noise generator is scaled by a gain. A voiced/unvoiced switch selects the branch that drives the vocal tract filter H(z), followed by the filter R(z) (labelled "LP Filter" on the slide), to produce the speech signal s(n).]
Linear Prediction: Determining the Prediction Order
[Figure; labels: unvoiced, voiced]
Linear Prediction: Determining the Prediction Order
[Figure]
Linear Prediction: Determining the Prediction Order
$$PG = 10\log_{10}\left(\frac{\sum_{n=m-M+1}^{m} s^{2}[n]}{\sum_{n=m-M+1}^{m} e^{2}[n]}\right)$$
[Figure; labels: voiced, unvoiced]
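The prediction gain on this slide is the ratio of signal energy to prediction-error energy over a frame, expressed in dB. A small sketch follows. The toy signal, the first-order predictor with coefficient 0.9 and the function name are assumptions chosen only to make the computation concrete.

```python
import math

def prediction_gain_db(signal, error):
    """10*log10 of signal energy over prediction-error energy for one frame."""
    signal_energy = sum(x * x for x in signal)
    error_energy = sum(e * e for e in error)
    return 10.0 * math.log10(signal_energy / error_energy)

# Toy frame: a decaying exponential and the error of s_hat[n] = 0.9 * s[n-1]
s = [0.9 ** n for n in range(1, 41)]
e = [s[0]] + [s[n] - 0.9 * s[n - 1] for n in range(1, len(s))]
pg = prediction_gain_db(s, e)
```

Here only the first sample is unpredictable, so the gain stays modest. A predictor that removes more of the frame energy would raise it.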
Linear Prediction: Example
[Figure; labels: M=4, M=10]
Linear Prediction: Example
[Figure; labels: M=2, M=10, M=54]
Linear Prediction: The Idea of Long-Term Linear Prediction
[Figure; labels: M=10, M=50]
Linear Prediction: Long-Term Linear Prediction
[Figure]
LPC10 Vocoder: General Specifications
It is known as LPC10 because 10 linear prediction coefficients are transmitted.
The transmission rate is 2400 bits per second.
Each frame contains 180 samples.
54 bits are transmitted per frame.
The analog input signal is sampled at 8000 Hz and quantized with 16 bits.
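The figures on this slide are mutually consistent, as the following arithmetic sketch shows (the variable names are hypothetical): 180 samples at 8000 Hz is a 22.5 ms frame, and 54 bits per such frame is exactly 2400 bps.

```python
SAMPLE_RATE_HZ = 8000
SAMPLES_PER_FRAME = 180
BITS_PER_FRAME = 54
PCM_BITS_PER_SAMPLE = 16

frame_ms = 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ            # frame duration
bit_rate = BITS_PER_FRAME * SAMPLE_RATE_HZ / SAMPLES_PER_FRAME  # coded bit rate
pcm_rate = PCM_BITS_PER_SAMPLE * SAMPLE_RATE_HZ                 # input PCM bit rate
compression = pcm_rate / bit_rate                               # about 53 to 1
```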
LPC10 Vocoder: Encoder
[Block diagram: the input PCM signal passes through framing and a pre-emphasis filter. The remaining blocks are: voicing detector, prediction-coefficient computation, LPC coefficient encoding, LPC decoding, prediction-error filter, pitch estimation, gain computation, gain encoding and pitch-period encoding. The LPC coefficient index, the gain index and the pitch-period index enter the bit encoder, which produces the transmitted bit stream.]
Pitch Detection
Autocorrelation method:
$$R[l,m] = \sum_{n=m-N+1}^{m} s[n]\,s[n-l]$$
Magnitude difference function method:
$$MDF[l,m] = \sum_{n=m-N+1}^{m} \left| s[n] - s[n-l] \right|$$
YMC method:
$$s[n] = b\,s[n-N] + e[n], \qquad m-N+1 \le n \le m$$
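The first two measures can be evaluated directly from their definitions. The sketch below is an illustration under assumptions: the synthetic signal with a 60-sample period, the window length N = 180, the lag range 20 to 100 and the function names are hypothetical. The pitch lag is taken as the lag that maximizes R[l, m] or minimizes MDF[l, m].

```python
import math

def autocorrelation(s, lag, m, N):
    """R[l, m]: sum over n = m-N+1 .. m of s[n] * s[n-l]."""
    return sum(s[n] * s[n - lag] for n in range(m - N + 1, m + 1))

def magnitude_difference(s, lag, m, N):
    """MDF[l, m]: sum over n = m-N+1 .. m of |s[n] - s[n-l]|."""
    return sum(abs(s[n] - s[n - lag]) for n in range(m - N + 1, m + 1))

# Synthetic periodic signal with a 60-sample period
s = [math.sin(2 * math.pi * n / 60) + 0.3 * math.sin(6 * math.pi * n / 60)
     for n in range(400)]
m, N = 399, 180          # frame ends at sample m and spans N samples
lags = range(20, 101)    # assumed search range

lag_by_r = max(lags, key=lambda l: autocorrelation(s, l, m, N))
lag_by_mdf = min(lags, key=lambda l: magnitude_difference(s, l, m, N))
```

The magnitude difference function needs no multiplications, which is one practical reason to prefer it over the autocorrelation.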
LPC10 Vocoder: Encoder
Voicing detector:
1- Compute the energy (low band)
2- Compute the zero-crossing rate
3- Compute the prediction gain
Pitch estimation:
Compute the MDF
Transmit one of the values:
T=20,21,…,39,40,42,…,80,84,…,154
LPC10 Vocoder: Encoder
Quantization of the LPC coefficients:
1- Solve the normal equations by the Levinson-Durbin method
2- Compute the reflection coefficients (RC)
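The two steps listed above can be sketched with a textbook Levinson-Durbin recursion, which yields the reflection coefficients as a by-product. This is a generic illustration, not the LPC-10 reference code. The predictor convention s_hat[n] = sum of a[i] * s[n-i], the sign convention of the reflection coefficients and the toy autocorrelation sequence are assumptions.

```python
def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r[0..order].

    Returns (a, k, err): predictor coefficients a[1..order] for
    s_hat[n] = sum_i a[i] * s[n-i], reflection coefficients k[1..order]
    (sign convention assumed), and the final prediction-error energy.
    """
    a = [0.0] * (order + 1)
    k = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = acc / err
        new_a = a[:]
        new_a[i] = k[i]
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]   # order-update of the predictor
        a = new_a
        err *= (1.0 - k[i] * k[i])              # error energy shrinks each order
    return a[1:], k[1:], err

# Autocorrelation of a first-order process, r[l] = 0.8**l (toy input)
r = [0.8 ** lag for lag in range(5)]
coeffs, reflection, err = levinson_durbin(r, 4)
```

For this input only the first coefficient is non-zero. Reflection coefficients of a stable model have magnitude below 1, which makes them a convenient set of parameters to quantize.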
LPC10 Vocoder: Encoder
Speech synthesis
[Block diagram: the encoder section takes the original signal and performs the following steps:
• decide whether the frame is voiced or unvoiced
• determine the pitch period, for the voiced case only
• compute the signal gain
In the source model, a V/U switch selects between an impulse train whose period equals the pitch period and random noise. The excitation is scaled by the gain G and passed to the decoder, which outputs the synthesized speech.]
LPC10 Vocoder: Limitations
1- Frames are divided into only two classes, voiced and unvoiced.
2- The excitation uses random noise and a periodic impulse train (an impulse train alone cannot produce all voiced sounds).
3- The phase of the original signal is not preserved.
4- Using an impulse train is a violation of the AR model.
Residual Excited LP Vocoder :
Speech quality can be improved at the
expense of a higher bit rate by computing
and transmitting a residual error, as done
in the case of DPCM.
One method is that the LPC model and
excitation parameters are estimated from
a frame of speech.
Residual Excited LP Vocoder :
The speech is synthesized at the transmitter and
subtracted from the original speech signal to
form the residual error.
The residual error is quantized, coded, and
transmitted to the receiver
At the receiver the signal is synthesized by
adding the residual error to the signal generated
from the model.
Residual Excited LP Vocoder :
The residual signal is low-pass filtered at 1000 Hz in the analyzer to reduce the bit rate.
In the synthesizer, it is rectified and spectrum flattened (using a HPF). The lowpass and highpass signals are summed, and the resulting residual error signal is used to excite the LPC model.
The RELP vocoder provides communication-quality speech at about 9600 bps.
RELP Analyzer (type 1):
[Block diagram: S(n) enters a buffer and window to give the frame f(n; m). LP analysis produces the LP parameters {â(i; m)} and the excitation parameters (pitch estimate P, V/U decision, gain estimate Θ0), which drive an LP synthesis model. The model output is subtracted from f(n; m) at the summer (∑) to form the residual error e(n; m). The LP parameters, the excitation parameters and the residual error go to the encoder and then to the channel.]
RELP Analyzer (type 2):
[Block diagram: S(n) → buffer and window → f(n; m) → inverse filter Â(z; m) → prediction residual → lowpass filter → decimator → DFT → encoder → to channel. LP analysis of the frame supplies the LP parameters {â(i; m)}.]
Synthesizer for a RELP vocoder
[Block diagram: the signal from the channel goes to a decoder and then to a buffer and controller. The residual is interpolated. A rectifier followed by a highpass filter regenerates the high band, which is summed (∑) with the interpolated residual to form the excitation of the LP synthesizer. The buffer and controller also supplies the LP model parameter updates.]
Multipulse LPC Vocoder
RELP needs to regenerate the high-frequency components at the decoder, which yields only a crude approximation of the high frequencies.
The multipulse LPC is a time-domain analysis-by-synthesis method that results in a better excitation signal for the LPC vocal system filter.
Multipulse LPC Vocoder
The information concerning the excitation sequence includes:
the locations of the pulses
an overall scale factor corresponding to the largest pulse amplitude
the pulse amplitudes relative to the overall scale factor
The scale factor is logarithmically quantized into 6 bits.
The amplitudes are linearly quantized into 4 bits.
The pulse locations are encoded using a differential coding scheme.
The excitation parameters are updated every 5 msec.
The LPC vocal-tract parameters and the pitch period are updated every 20 msec.
The bit rate is 9600 bps.
Analysis-by-synthesis coder
A stored sequence from a Gaussian excitation codebook is scaled and used to excite the cascade of a pitch synthesis filter and the LPC synthesis filter
The synthetic speech is compared with the original speech
Residual error signal is weighted perceptually by a filter
$$W(z) = \frac{\hat{A}(z)}{\hat{A}(z/c)} = \frac{\hat{\Theta}(z/c)}{\hat{\Theta}(z)}$$
Obtaining the multipulse excitation: (Analysis by synthesis method)
[Block diagram: the input speech s(n) enters a buffer and LP analysis, giving the frame f(n; m). A multipulse excitation generator drives the pitch synthesis filter Θp(z) (pitch period P) followed by the LP synthesis filter, producing the synthetic frame f̂(n; m). The difference (∑, + and −) between f(n; m) and f̂(n; m) passes through the perceptual weighting filter W(z), and an error minimization block adjusts the excitation generator.]
Code Excited LP :
CELP is an analysis-by-synthesis method in which the excitation sequence is selected from a codebook of zero-mean Gaussian sequences.
The bit rate of the CELP is 4800 bps.
CELP (analysis-by-synthesis coder) :
[Block diagram: a sequence from the Gaussian excitation codebook is scaled by the gain and passed through the pitch synthesis filter and the spectral envelope (LP) synthesis filter. The result is subtracted (∑) from the speech samples, the error is filtered by the perceptual weighting filter W(z), and its energy is computed (square and sum) to select the index of the excitation sequence. A buffer and LP analysis block supplies the LP parameters, which are sent with the gain as side information.]
Analysis-by-synthesis coder
This weighted error is squared and
summed over a subframe block to give the
error energy
By performing an exhaustive search
through the codebook we find the
excitation sequence that minimizes the
error energy
Analysis-by-synthesis coder
The gain factor for scaling the excitation
sequence is determined for each
codeword in the codebook by minimizing
the error energy for the block of samples
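The exhaustive search with a per-codeword gain can be sketched as follows. This is a simplified illustration under assumptions: the codewords are treated as if they had already been passed through the synthesis and weighting filters, the codebook has 64 entries instead of a realistic size, and all names are hypothetical. For a filtered codeword y and target x, the gain that minimizes the squared error is the inner product of x and y divided by the energy of y.

```python
import random

def best_gain_and_error(target, filtered):
    """Least-squares gain and resulting error energy for one filtered codeword."""
    cross = sum(t * y for t, y in zip(target, filtered))
    energy = sum(y * y for y in filtered)
    gain = cross / energy
    error = sum((t - gain * y) ** 2 for t, y in zip(target, filtered))
    return gain, error

def search_codebook(target, filtered_codebook):
    """Exhaustive search: return (index, gain, error) with the smallest error."""
    best = None
    for index, filtered in enumerate(filtered_codebook):
        gain, error = best_gain_and_error(target, filtered)
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best

# Toy setup: zero-mean Gaussian codewords, 40 samples each (one 5-msec block at 8 kHz)
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(64)]
target = [2.5 * y for y in codebook[17]]   # target built from a known entry
index, gain, error = search_codebook(target, codebook)
```

Because the target is a scaled copy of entry 17, the search recovers that index and the scale factor. With real speech the minimum error is non-zero and the same loop simply picks the closest match.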
CELP (synthesizer) :
[Block diagram: the signal from the channel decoder enters a buffer and controller, which selects a sequence from the Gaussian excitation codebook and supplies the LP parameter, gain and pitch estimate updates. The excitation passes through the pitch synthesis filter and then the LP synthesis filter.]
CELP synthesizer
Cascade of two all-pole filters with coefficients that are updated periodically
The first filter is a long-delay pitch filter used to generate the pitch periodicity in voiced speech
This filter has the form
$$\Theta_p(z) = \frac{1}{1 - b z^{-p}}$$
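A long-delay filter of the form 1 / (1 − b z^(−p)) corresponds to the difference equation y[n] = x[n] + b·y[n − p]. The sketch below applies it to a single impulse. The values b = 0.5 and p = 40 and the function name are assumptions chosen for illustration.

```python
def pitch_synthesis_filter(excitation, b, p):
    """Apply y[n] = x[n] + b * y[n - p], i.e. the all-pole filter 1 / (1 - b z^-p)."""
    y = []
    for n, x in enumerate(excitation):
        past = y[n - p] if n >= p else 0.0   # zero initial state
        y.append(x + b * past)
    return y

# A single impulse becomes a decaying pulse train with period p
impulse = [1.0] + [0.0] * 159
out = pitch_synthesis_filter(impulse, b=0.5, p=40)
```

The output repeats every p samples with amplitude scaled by b each time, which is how the filter imposes pitch periodicity on an otherwise noise-like excitation.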
CELP
Parameters of the filter can be determined by minimizing the prediction error energy, after pitch estimation, over a frame duration of 5 msec
The second filter is a short-delay all-pole (vocal-tract) filter and has 10-12 coefficients that are determined every 10-20 msec
Example:
The sampling frequency is 8 kHz
Pitch estimation and excitation sequence selection are performed on subframe blocks of 5 msec
We have 40 samples per 5-msec subframe
The excitation sequence consists of 40 samples
Example:
A codebook of 1024 sequences gives good-quality speech
For such a codebook size, we require 10 bits to send the codebook index
Hence the bit rate for the excitation is 10/40 = 1/4 bit per sample, a factor of 4 below one bit per sample
The transmission of the pitch predictor parameters and the spectral predictor parameters brings the bit rate to about 4800 bps
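The numbers in this example can be reproduced with a short calculation. The variable names are hypothetical, and the split of the remaining rate is only what the quoted total implies, not a breakdown given on the slides.

```python
FS = 8000              # sampling frequency in Hz
SUBFRAME_MS = 5        # subframe block duration
CODEBOOK_SIZE = 1024
TOTAL_BPS = 4800       # approximate total quoted on the slide

samples_per_subframe = FS * SUBFRAME_MS // 1000           # samples in one block
index_bits = (CODEBOOK_SIZE - 1).bit_length()             # bits to address the codebook
bits_per_sample = index_bits / samples_per_subframe       # excitation bits per sample
excitation_bps = index_bits * 1000 // SUBFRAME_MS         # bit rate of the index alone
remaining_bps = TOTAL_BPS - excitation_bps                # left for the other parameters
```

The codebook index accounts for 2000 bps, so roughly 2800 bps remain for the gain, pitch predictor and spectral predictor parameters if the total is about 4800 bps.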
Low-delay CELP coder
CELP has been used to achieve toll-quality speech at 16000 bps with low delay.
Although other types of vocoders produce high-quality speech at 16000 bps, these vocoders buffer 10-20 msec of speech samples
Low-delay CELP coder
As a result, their one-way delay is of the order of 20-40 msec
With modification of CELP, it is possible to reduce the one-way delay to about 2 ms
Low-delay CELP is achieved by using a backward-adaptive predictor with a gain parameter and an excitation vector size as small as 5 samples
Low-delay CELP coder
[Block diagram: the input speech s(n) enters a buffer and window, giving f(n; m). A codevector from the excitation vector quantizer codebook is scaled by a gain (set by the gain adaptation block) and drives the high-order LP synthesis filter (set by the predictor adaptation block), giving f̂(n; m). The difference (∑, + and −) is filtered by the perceptual weighting filter W(z) and fed to the error minimization block, which selects the codevector.]
Low-delay CELP coder
The pitch predictor used in the conventional forward-adaptive coder is eliminated
In order to compensate for the loss in pitch information, the LPC predictor order is increased significantly, to an order of 50
Low-delay CELP coder
LPC coefficients are updated more
frequently, every 2.5 ms
5-sample excitation vector corresponds to
an excitation block duration of 0.625 msec
at 8-kHz sampling rate
Low-delay CELP coder
The logarithm of the excitation gain is
adapted every subframe excitation block
by employing a 10th-order adaptive linear
predictor in the logarithmic scale
The coefficients of the logarithmic-gain
predictor are updated every four blocks by
performing an LPC analysis of previously
quantized excitation signal blocks
Low-delay CELP coder
The perceptual weighting filter is also 10th
order and is updated once every four
blocks by employing an LPC analysis on
frames of the input speech signal of
duration 2.5 msec
The excitation codebook in the low-delay
CELP is also modified compared to
conventional CELP
10-bit excitation codebook is employed
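The low-delay figures fit together arithmetically, as this sketch shows (the names are hypothetical). It assumes that the 10-bit codebook index is the only information transmitted per excitation vector, which is consistent with the backward-adaptive predictor and gain described earlier but is not stated explicitly on the slides.

```python
FS = 8000             # sampling rate in Hz
VECTOR_SAMPLES = 5    # excitation vector length
CODEBOOK_BITS = 10    # bits per excitation vector (assumed to be all that is sent)
BLOCKS_PER_UPDATE = 4

block_ms = 1000 * VECTOR_SAMPLES / FS               # duration of one excitation block
bit_rate = CODEBOOK_BITS * FS // VECTOR_SAMPLES     # bits per second for the index
update_ms = BLOCKS_PER_UPDATE * block_ms            # interval between filter updates
```

Ten bits every 0.625 msec gives 16000 bps, and four blocks span the 2.5 msec update interval.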
Vector Sum Excited LP :
The VSELP coder and decoder basically differ from CELP in the method by which the excitation sequence is formed
In the next block diagram of the VSELP, there are three excitation sources
One excitation is obtained from the pitch period state
The other two excitation sources are obtained from two codebooks
![Page 60: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/60.jpg)
60
VSELP Decoder :
[Block diagram: VSELP decoder. Blocks shown: long-term filter state, codebook 1, codebook 2, a summer (∑), pitch synthesis filter, spectral envelope (LP) synthesis filter and spectral post filter; the output is synthetic speech.]
![Page 61: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/61.jpg)
61
VSELP Decoder
The LPC synthesis filter is implemented as a 10-pole filter, and its coefficients are coded and transmitted every 20 ms.
The coefficients are updated in each 5-ms frame by interpolation.
The excitation parameters are also updated every 5 ms.
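The per-subframe coefficient update can be sketched as follows. Linear interpolation between the previous and the current 20-ms coefficient set, and the weights `(k + 1) / subframes`, are assumptions; the slide only states that interpolation is used.

```python
def interpolate_lpc(prev, curr, subframes=4):
    """Return one coefficient set per 5-ms subframe of a 20-ms frame.

    prev, curr: coefficient lists decoded for the previous and current frame.
    The linear weighting is an illustrative assumption.
    """
    if len(prev) != len(curr):
        raise ValueError("coefficient sets must have equal length")
    sets = []
    for k in range(subframes):
        w = (k + 1) / subframes  # weight given to the current frame
        sets.append([(1.0 - w) * p + w * c for p, c in zip(prev, curr)])
    return sets
```

Interpolating direct-form coefficients does not by itself guarantee a stable synthesis filter, so a practical coder would typically add a stability check or interpolate in a different parameter domain.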
![Page 62: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/62.jpg)
62
VSELP Decoder
Each of the two codebooks contains 128 codewords.
The codewords of each codebook are constructed from its own set of seven basis codewords by forming linear combinations of those seven basis codewords.
The long-term filter state is also a codebook with 128 codeword sequences.
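A sketch of how seven basis vectors can generate 128 codewords. It assumes the usual VSELP construction, in which each bit of the 7-bit index selects a weight of +1 or -1 for one basis vector; the slide itself only says "linear combinations". The function name is hypothetical.

```python
def vselp_codebook(basis):
    """Build 2**M codewords from M basis vectors using +/-1 weights.

    Bit m of the codeword index selects the sign of basis vector m
    (1 -> +1, 0 -> -1). The sign convention is an assumption.
    """
    m_count = len(basis)
    length = len(basis[0])
    codebook = []
    for index in range(2 ** m_count):
        signs = [1.0 if (index >> m) & 1 else -1.0 for m in range(m_count)]
        codebook.append([
            sum(s * vec[n] for s, vec in zip(signs, basis))
            for n in range(length)
        ])
    return codebook
```

With seven basis vectors this yields 2^7 = 128 codewords. Complementing every index bit negates the codeword, so only the basis vectors need to be stored and filtered rather than all 128 codewords.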
![Page 63: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/63.jpg)
63
VSELP Decoder
In each 5-msec frame, the codewords from this codebook are filtered through the speech system filter $\hat{\Theta}(z)$ and correlated with the input speech sequence.
The filtered codeword is used to update the history and the lag is transmitted to the decoder.
![Page 64: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/64.jpg)
64
VSELP Decoder
Thus the update occurs by appending the
best-filtered codeword to the history
codebook
The oldest sample in the history array is
discarded
The result is that the long-term state
becomes an adaptive codebook
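The history update described above behaves like a fixed-length buffer. A minimal sketch, assuming the history is stored as a sample array and a codeword is read by stepping back `lag` samples; the helper names and the buffer length are hypothetical.

```python
from collections import deque


def make_history(length):
    """Create a fixed-length excitation history filled with zeros."""
    return deque([0.0] * length, maxlen=length)


def update_history(history, new_samples):
    """Append new samples; the deque discards the oldest ones automatically."""
    history.extend(new_samples)
    return history


def codeword_at_lag(history, lag, length):
    """Read `length` samples starting `lag` samples back (needs lag >= length)."""
    if not length <= lag <= len(history):
        raise ValueError("lag must satisfy length <= lag <= len(history)")
    start = len(history) - lag
    return list(history)[start:start + length]
```

Because the buffer contents change every frame, the set of sequences addressable by the lag changes too, which is what makes this codebook adaptive.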
![Page 65: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/65.jpg)
65
VSELP Decoder
The three excitation sequences are
selected sequentially from each of three
codebooks
Each codebook search attempts to find the
codeword that minimizes the total energy
of the perceptually weighted error
Once the codewords have been selected
the three gain parameters are optimized
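A single codebook search of this kind can be sketched as below. The sketch assumes the target and the candidate codewords have already been passed through the synthesis and perceptual weighting filters, and it covers only one codebook with its own optimal gain; the names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def search_codebook(target, filtered_codewords):
    """Return (index, gain, error_energy) for the best single codeword.

    For a codeword y the optimal gain is <t, y> / <y, y>, which leaves an
    error energy of <t, t> - <t, y>**2 / <y, y>. Minimising the error is
    therefore the same as maximising <t, y>**2 / <y, y>.
    """
    best = None
    for i, y in enumerate(filtered_codewords):
        energy = dot(y, y)
        if energy == 0.0:
            continue  # an all-zero codeword cannot reduce the error
        corr = dot(target, y)
        score = corr * corr / energy
        if best is None or score > best[1]:
            best = (i, score, corr / energy)
    if best is None:
        raise ValueError("no codeword with non-zero energy")
    index, score, gain = best
    return index, gain, dot(target, target) - score
```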
![Page 66: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/66.jpg)
66
VSELP Decoder
Joint gain optimization is accomplished sequentially by orthogonalizing each weighted codeword vector with respect to the previously selected weighted excitation vectors prior to the codebook search.
The gain parameters are vector quantized to one of 256 vectors (an eight-bit index) and transmitted in every 5-ms frame.
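The orthogonalization step can be illustrated with a Gram-Schmidt projection. This is a sketch under the assumption that the previously selected weighted vectors are already mutually orthogonal; the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(vector, selected):
    """Remove from `vector` its components along each vector in `selected`.

    `selected` is assumed to hold mutually orthogonal vectors.
    """
    result = list(vector)
    for s in selected:
        energy = dot(s, s)
        if energy > 0.0:
            scale = dot(result, s) / energy
            result = [r - scale * x for r, x in zip(result, s)]
    return result
```

Once the candidate vectors are orthogonal to those already chosen, the least-squares gain of each vector no longer depends on the others, which is why a sequential search can stand in for a joint optimization.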
![Page 67: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/67.jpg)
67
Vector Sum Excited LP :
The bit rate of the VSELP is about 8000 bps.
Bit allocations for 8000-bps VSELP

| Parameters | Bits/5-ms Frame | Bits/20 ms |
|---|---|---|
| 10 LPC coefficients | - | 38 |
| Average speech energy | - | 5 |
| Excitation codewords from two VSELP codebooks | 14 | 56 |
| Gain parameters | 8 | 32 |
| Lag of pitch filter | 7 | 28 |
| Total | 29 | 159 |
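The totals in the allocation can be checked with a few lines of arithmetic. The dictionary keys are illustrative labels; the bit counts are the ones listed in the allocation.

```python
# Bits sent once per 20-ms frame and bits sent in every 5-ms subframe.
PER_20MS_ONLY = {"lpc_coefficients": 38, "average_speech_energy": 5}
PER_5MS = {"excitation_codewords": 14, "gain_parameters": 8, "pitch_lag": 7}

SUBFRAMES_PER_FRAME = 4   # 20 ms / 5 ms
FRAMES_PER_SECOND = 50    # 1 s / 20 ms

bits_per_subframe = sum(PER_5MS.values())
bits_per_frame = sum(PER_20MS_ONLY.values()) + SUBFRAMES_PER_FRAME * bits_per_subframe
bit_rate = bits_per_frame * FRAMES_PER_SECOND

print(bits_per_subframe, bits_per_frame, bit_rate)  # 29 159 7950
```

The result, 7950 bps, is the exact figure behind the rounded "about 8000 bps".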
![Page 68: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/68.jpg)
68
VSELP Decoder
Finally, an adaptive spectral post filter is employed in VSELP following the LPC synthesis filter; this post filter is a pole-zero filter of the form

$$W(z) = \frac{\hat{\Theta}(z/c)}{\hat{\Theta}(z)} = \frac{\hat{A}(z)}{\hat{A}(z/c)}$$
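A sketch of how a pole-zero filter built from the LPC polynomial and its bandwidth-expanded version can be applied. It assumes the filter has zeros from A(z) and poles from A(z/c), with A(z) given as a coefficient list `[1, a1, ..., ap]` in powers of z^-1; the sign convention, the value of `c` and the function names are assumptions.

```python
def bandwidth_expand(a, c):
    """Coefficients of A(z/c): coefficient k of A(z) is scaled by c**k."""
    return [coef * (c ** k) for k, coef in enumerate(a)]


def pole_zero_filter(x, num, den):
    """Direct-form filter with numerator `num` and denominator `den`."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc / den[0])
    return y


def postfilter(x, a, c):
    """Apply A(z) / A(z/c) to the sample list x."""
    return pole_zero_filter(x, a, bandwidth_expand(a, c))
```

With `c = 1` numerator and denominator cancel and the signal passes unchanged; with `c = 0` the denominator reduces to 1 and only the zeros of A(z) remain. Values in between trade off the two.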
![Page 69: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/69.jpg)
69
DEMO
Speech codec samples, each presented for a male speaker, a female speaker and music:
Original Speech/Music (16-bit, sampled at 8 kHz)
FS-1015 (LPC-10e, 2.4 kb/s)
FS-1016 (CELP, 4.8 kb/s)
IS-54 (VSELP, 7.95 kb/s)
G.721 (ADPCM, 32 kb/s)
![Page 70: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/70.jpg)
Standard Voice Algorithms
G.711
The most widely used digital representation of voice signals is G.711, or PCM (Pulse Code Modulation).
This codec represents a voice signal band-limited to 4 kHz, sampled at 8 kHz, using 8 bits per sample with A-law or μ-law coding.
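A sketch of μ-law companding using the continuous curve with μ = 255. G.711 itself specifies a segmented (piecewise-linear) approximation with 8-bit codes, so this illustrates the principle rather than the standard's encoder; the function names are hypothetical.

```python
import math

MU = 255.0  # value used for 8-bit mu-law telephony


def mu_law_compress(x):
    """Map x in [-1, 1] to [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


BIT_RATE = 8000 * 8  # 8 kHz sampling x 8 bits per sample = 64000 bit/s
```

The logarithmic curve spends more of the 8-bit code range on small amplitudes, where most speech energy lies.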
G.726
The G.726 codec encodes a 64 kbps A-law or μ-law PCM signal at one of four bit rates, ranging from 2 bits per sample to 5 bits per sample.
The algorithm is based on Adaptive Differential Pulse Code Modulation (ADPCM) with a one-sample backward prediction scheme.
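The four G.726 rates follow directly from the bits per sample, assuming the same 8 kHz sampling rate as the 64 kbps PCM input:

```python
SAMPLE_RATE_HZ = 8000  # sampling rate of the 64 kbps PCM input (8 bits x 8 kHz)

# 2 to 5 bits per sample, as stated above.
g726_rates_kbps = {bits: bits * SAMPLE_RATE_HZ // 1000 for bits in range(2, 6)}

print(g726_rates_kbps)  # {2: 16, 3: 24, 4: 32, 5: 40}
```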
70
![Page 71: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/71.jpg)
G.728
The G.728 algorithm compresses PCM-coded voice signals to a bit rate of 16 kbps.
This algorithm is based on a strong backward prediction scheme and is considered one of the most complex voice algorithms produced by the ITU standards organization.
G.729
For compression of voice signals at 8 kbps, the G.729 algorithm offers toll quality with a built-in algorithmic delay of about 15 msec.
Additional features described in the G.729 Annex provide VAD (voice activity detection) and Comfort Noise Generation functionality to enhance the quality and reduce the overall bit rate.
G.723.1
The most widely used algorithm for bandwidth-limited channels, such as VoIP and video conferencing, is G.723.1.
The algorithm has two operating bit rates, 6.3 kbps and 5.3 kbps.
Although its delay is not as low as that of the other ITU standards, its quality is near toll quality for the given low bit rates, making it very efficient in bit usage.
71
![Page 72: Speech Processing (Vocoders)ce.sharif.edu/courses/97-98/2/ce967-1/resources/root...Various LPC-type speech analysis and synthesis methods differ primarily in the type of excitation](https://reader035.vdocuments.net/reader035/viewer/2022070818/5f161909b363bb293f199d39/html5/thumbnails/72.jpg)
GSM-AMR
The latest GSM standard is the Adaptive Multi-Rate (AMR) codec, based on Algebraic Code Excited Linear Prediction (ACELP), which provides compression in the range of 4.75 to 12.2 kbps.
In total the codec provides eight bit rates that cover the half-rate to full-rate channel capacity.
GSM-FR
The first digital codec used in a mobile environment is the GSM Full Rate vocoder.
The codec compresses 13-bit PCM samples to a rate of 13 kbps.
The algorithm is based on a very simple Regular Pulse Excited Linear Predictive Coding technique.
GSM-HR
To increase capacity, the GSM committee decided on a lower bit rate of 5.6 kbps for the voice channel.
The algorithm is based on Vector Sum Excited Linear Prediction (VSELP) and is computationally as complex as other low bit rate algorithms.
72