POLITECNICO DI TORINO
ANALOG AND TELECOMMUNICATION ELECTRONICS
MINIPROJECT:
SPEECH CODING: TECHNIQUES, STANDARDS AND APPLICATIONS
PROFESSOR: DANTE DEL CORSO
STUDENT: NADIA PERRECA, ID: 211012
ACADEMIC YEAR: 2013-2014
INDEX
INTRODUCTION
CHAPTER I: SPEECH CODING
I.1 Speech signal
I.2 Speech processing
I.3 Speech coding
I.4 Speech coding standards
I.5 Parametric representations
I.6 Waveform representations
I.7 Methods of comparison of speech coding techniques
CHAPTER II: PULSE CODE MODULATION TECHNIQUES
II.1 Pulse Code Modulation
II.2 Linear PCM
II.3 Logarithmic PCM
II.3.1 A and μ conversion laws
II.4 Differential PCM
II.5 Adaptive Differential PCM
II.6 Time division multiplexing
APPENDIX
BIBLIOGRAPHY
INTRODUCTION
Speech signals are maybe the most natural and common signals we can image to deal with; that’s
why, in the sphere of Information Technologies, voice communication have always played a very
important role. Speech signals are unpredictable signals, whose values and characteristic vary very
much according to the speaker and the message he wants to transmit, so there’s the need to develop
specific techniques that can simplify their complicated processing. Among them, the speech coding
is the process of obtaining a compact representation of voice signals for efficient transmission over
band-limited wired and wireless channels, storage or many other applications. “Compact” is not a
simple adjective but a key word: the goal of speech coding is to represent speech in digital form
with as few bit per second as possible without losing the intelligibility and "pleasantness" of
speech, which include speaker identity, emotions, intonation, timbre and so on. The need of
compactness is due to the technological transition from analog to digital electronics.
In the past, speech coding techniques were implemented and optimized for
networks dedicated to telephone traffic; today, speech coders have become essential
components in both telecommunication and multimedia infrastructures. Commercial systems that
rely on efficient speech coding include cellular communication, voice over Internet Protocol
(VoIP), videoconferencing, electronic toys and so on.
The aim of this project is to analyze the basic characteristics of speech signals and the most
common speech signal processing techniques, with particular attention to speech coding as a
data compression technique. We'll give some information about the different types of speech signal
representations and the methods we can use to compare them. Then, we'll focus on the different
Pulse Code Modulation techniques and their application to speech coding, in order to point out
the benefits and drawbacks of each technique and the differences among them. In the end, we'll
analyze the Time Division Multiplexing technique, one of the most important current
applications of Pulse Code Modulation.
CHAPTER I: SPEECH CODING
This chapter is an introduction to speech coding techniques, standards and applications.
We'll analyze the basic characteristics of speech signals and the most common speech signal
processing techniques, with particular attention to speech coding as a data compression
technique. We'll give some information about the different types of speech signal representations
and the methods we can use to compare them.
I.1 SPEECH SIGNAL
A speech signal is created by the vocal cords, travels through the vocal tract, emerges at the
speaker's mouth and reaches the listener's ear as a pressure wave.
From an engineering point of view, we can describe speech production with a source-filter
model: we can see the vocal cords as a source and the vocal tract as a resonant cavity. If you placed a
microphone right above someone's glottis during voicing, you would hear the glottal source by
itself as a buzzing sound. The vocal tract filters the sound energy by suppressing some components
of the glottal wave and amplifying the ones that are close to the resonance frequencies of the vocal
tract, which depend on its shape and length. In this way, it changes the sound quality of the
complex wave produced by the sound source. That means that when we talk about the speech signal,
we mean a sort of filtered version of the sound actually emitted.
In Figure 1, we can see a representation of a speech signal.
Figure 1: Speech signal.
We can divide speech sounds into voiced and unvoiced: voiced sounds are produced when the
vocal cords vibrate during the pronunciation of a phoneme, while unvoiced sounds do not involve the
vocal cords. Voiced signals tend to be louder, like the vowels; unvoiced signals, on the other hand,
tend to be more abrupt, like the stop consonants. The production of voiced and unvoiced
speech is separated by silence regions: during a silence region, no excitation is supplied to the
vocal tract and hence there is no speech output. However, silence is an integral part of the speech signal:
even if it is unimportant from an energy point of view, its duration is essential for intelligible speech and it
helps to recognize certain categories of sounds. Without silence regions between
voiced and unvoiced speech, the speech would not be intelligible.
A first distinction among voiced, unvoiced and silence regions can be made by looking at the signal
amplitude: if it is low or negligible, the segment can be marked as silence; if it exceeds a
threshold level, usually chosen by the user according to the known characteristics of the sound
under study, it is declared voiced; otherwise it is classified as unvoiced. In Figure 2, this
classification is illustrated.
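This amplitude-based classification can be sketched as follows; the threshold values are hypothetical and would in practice be tuned to the recording level of the material under study:

```python
import numpy as np

def classify_frame(frame, silence_thresh=0.01, voiced_thresh=0.3):
    """Classify a speech frame by its RMS amplitude.

    The two thresholds are illustrative placeholders, not values
    taken from any standard.
    """
    rms = np.sqrt(np.mean(frame ** 2))
    if rms < silence_thresh:
        return "silence"
    elif rms < voiced_thresh:
        return "unvoiced"
    else:
        return "voiced"

print(classify_frame(np.zeros(160)))        # a silent frame
print(classify_frame(np.full(160, 0.1)))    # a low-level (unvoiced-like) frame
print(classify_frame(np.full(160, 0.5)))    # a high-level (voiced-like) frame
```

Real classifiers also use features such as zero-crossing rate, since amplitude alone cannot separate loud unvoiced fricatives from soft voiced sounds.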
Figure 2: Speech signal: voiced, unvoiced and silence regions.
As we can see from the previous figures, speech is not a predictable signal; from an analytic point of
view, it is a non-stationary signal with a non-uniform probability distribution, even if it is
sometimes approximated with a Gaussian distribution. Its characteristics vary quickly and
depend on the particular sound emitted; this makes the speech signal hard to analyze and model.
A quite common practical solution consists in modeling the speech signal as a slowly varying
function of time: during intervals of 5 to 25 ms, the speech characteristics
do not change too much and we can consider them almost constant. That means that, over small
time intervals, the speech signal can be considered stationary with good approximation. Over these
windows we can analyze the signal spectrum and the power density distribution, and we can distinguish
voiced and unvoiced sounds.
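The segmentation into short, overlapping windows can be sketched as follows; the frame length and hop size are illustrative values for 8 kHz speech:

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split signal x into overlapping frames of frame_len samples,
    advancing by hop samples each time (trailing samples that do not
    fill a whole frame are dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

# At 8 kHz, a 20 ms frame is 160 samples; a 10 ms hop gives 50% overlap.
x = np.arange(8000, dtype=float)     # stand-in for 1 s of signal
frames = frame_signal(x, 160, 80)
print(frames.shape)                  # (number of frames, samples per frame)
```

Each row of `frames` is then short enough to be treated as stationary and analyzed on its own.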
A general block diagram is shown in Figure 3.
Figure 3: Block diagram of the analysis of speech signal by using frames.
Most of the energy of speech signals is concentrated in the band from 300 Hz to 3.4 kHz; their
spectrum has a low-pass trend.
Even if it is difficult to recognize this fact in the time domain, voiced waveforms are
periodic signals, so they have a line spectrum. On the contrary, the unvoiced signal spectrum is
continuous. There may also be regions where the speech is a mixed version of voiced and unvoiced
speech: in mixed speech, the signal looks like unvoiced speech, but some periodic
structures can also be observed. We can see the voiced speech as the useful signal and the unvoiced speech
as a sort of noise; the latter is usually modeled as white Gaussian noise.
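A minimal numerical sketch of the two excitation models, an impulse train for voiced speech and white Gaussian noise for unvoiced speech; the 100 Hz pitch and the noise level are arbitrary choices:

```python
import numpy as np

fs = 8000                        # sampling rate (Hz)
n = np.arange(fs // 10)          # 100 ms worth of samples

# Voiced excitation: impulse train at a 100 Hz pitch (period = 80 samples)
voiced = np.zeros_like(n, dtype=float)
voiced[::fs // 100] = 1.0

# Unvoiced excitation: white Gaussian noise
rng = np.random.default_rng(0)
unvoiced = rng.normal(0.0, 0.1, size=n.size)
```

The periodic train produces a line spectrum, while the noise spectrum is continuous, matching the distinction drawn above.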
In Figure 4 we can see the signal for the spoken word "six"; if we consider a frame during the
pronunciation of the consonant "s" (unvoiced), the signal appears continuous, like a sort of
noise, while if we analyze a frame during the pronunciation of the vowel "i" (voiced), we can
see a more regular signal.
Figure 4: Speech word “six”.
I.2 SPEECH PROCESSING
Speech processing is the study of speech signals and of the methods used to process them. It
involves the techniques we use to deal with speech signals and all the applications
they are suitable for. There are analog and digital speech processing techniques. At the very
beginning, speech processing was implemented with analog electronics; in fact, all 1G systems
use analog speech transmission. Since the 1970s, signal processing has increasingly been
implemented on computers in the digital domain; in fact, all 2G and 3G systems use digital speech
transmission.
Digital speech processing techniques are easier, less expensive, more sophisticated and
faster than the analog ones. They allow an improvement in speech quality, are reliable and very
compact, and can be implemented in Integrated Circuits. Moreover, with regard to the transmission of
voice signals requiring security, the digital representation has a distinct advantage over analog
systems: the information bits can be scrambled in a manner that can ultimately be unscrambled
at the receiver.
Digital signal processing techniques can be applied in many speech communication areas:
- Transmission and storage;
- Speaker verification and identification;
- Speech recognition (conversion from speech to written text);
- Aids to the handicapped;
- Enhancement of signal quality.
A general scheme including the fundamental signal processing steps is shown in Figure 5: a
speech coder converts the analog speech signal into a coded digital representation, which is usually
transmitted in frames over a non-distorting channel. A speech decoder receives the coded frames and
synthesizes reconstructed speech.
Figure 5: Block diagram of a general speech signal processing.
I.3 SPEECH CODING
As we can see in Figure 5, speech coding is a very important step in speech signal
processing. It involves all the techniques used to represent and treat speech signals in
a more convenient form, which allows us to use them for several applications such as acquisition,
manipulation, storage, transfer and so on.
In general, a speech signal coder has two essential characteristics:
- Integrity of the speech: the information contained in the speech signal must be kept intact,
without distortions.
- Quality: the speech signal must be intelligible and pleasant, which means that things such
as speaker identity, emotions, intonation, timbre and so on must still be recognizable.
In addition, it can have several desirable properties as:
- Low bit-rate;
- Low memory requirements;
- Low transmission power required;
- Fast transmission speed;
- Low computational complexity;
- Low coding delay;
- Robustness.
We can easily understand why all these properties are desirable. If we have a low bit-rate coder,
less bandwidth is required for transmission. Moreover, we reduce the amount of transmitted data,
saving memory and transmission power and increasing the transmission speed. If the coder
has low computational complexity, the required power decreases further. The coding delay is the time
that elapses from the moment a speech sample arrives at the encoder input to the moment it
appears at the decoder output, so it is clear that we want a coding delay as low as
possible, in order to minimize interference and interruptions during the communication. A speech
coder is robust if it is suitable for all types of speakers (male, female, children) and for many
different languages. That is a very difficult property to satisfy, because in general we need different
circuits and devices to deal with different types of speech sounds, according to their complexity.
However, we have to bear in mind that there is always a tradeoff between one property and
another, in particular between low bit rate and speech quality. In general, we have to design the
system according to the given specifications.
There are essentially two types of digital speech coding: waveform representations and
parametric representations. As we can see in Figure 6, both of them involve a series of techniques
that we'll analyze in the next sections.
Figure 6: Speech signal coding techniques.
Waveform representations, as the name implies, are concerned with preserving the wave shape of the
analog speech signal through a sampling and quantization process.
Parametric representations, on the other hand, are concerned with representing the speech signal
by means of some of its characteristic parameters.
There are also hybrid representations, which are a fusion of the two coding techniques
illustrated, but we won't analyze them.
In the study of speech signal processing, speech coding is a very important subject, especially in
the telecommunication area.
Speech coding is the process of obtaining a compact representation of the speech signal that can
be efficiently transmitted over band-limited wired and wireless channels or stored on digital media.
"Compact" is not a simple adjective but a key word: the goal of speech coding is to represent
speech in digital form with as few bits as possible without losing the intelligibility and
"pleasantness" of speech, which include speaker identity, emotions, intonation, timbre and so on.
Other requirements, such as low coding delay, good performance, low complexity and low losses, depend
on the particular application we're dealing with.
In general, we can recognize parametric representations as a form of speech coding, but we'll
refer in more detail to the ways of reducing the bit rate of waveform representations,
because they preserve the quality of the signal. We'll see that the standard bit rate for a waveform
representation is fixed at 64 kb/s: any bit rate below 64 kb/s is treated as compression, and the
output of the source encoder is an encoded speech signal with a bit rate lower than 64 kb/s.
If we compress the speech signal by reducing the number of bits per sample, we obtain a lot of
benefits:
- Reduction of the bandwidth;
- Reduction of the transmitted data (memory occupation);
- Reduction of the transmission power required;
- Increase of the transmission speed;
- Increase of the immunity to noise (some of the saved bits per sample can be used as
error protection bits for the speech parameters).
Usually we can distinguish four levels of quality, according to the bit rate:
- TOLL: perfect quality;
- NEAR TOLL: almost perfect;
- DIGITAL CELLULAR: background noise is introduced, but the speech is still very well
reconstructed;
- LOW BIT RATE: speech is rather noisy, artificial and not natural, but still
understandable.
In Table 1, we can see some examples of the cited speech coding techniques with the relative bit rate and quality.
Table 1: Speech coding bit rate and quality.
Speech coding   Bit rate (kb/s)   Quality
PCM             64/32             TOLL
DPCM            32/16             NEAR TOLL
ADPCM           4                 DIGITAL CELLULAR
Vocoder         4.2               LOW BIT RATE
LPC-10          2.4               LOW BIT RATE
In the past, speech coding techniques were implemented and optimized for networks
dedicated to telephone traffic; however, the growing need for integration between telephony and data
drives the study of new standards, such as voice over IP (Internet
Protocol), able to offer these services over data networks while ensuring quality levels
comparable to those offered by the old telephone network. Satellite communication systems, where
the cost of the channel is very high, mobile systems, where the number of users grows exponentially,
and multimedia systems, whose information content requires considerable mass memory, are all
applications for which it is necessary to introduce speech coding processes.
We can summarize all the areas of application in a single graph, shown in Figure 7.
Figure 7: Speech coding applications.
Clearly, according to the specific area we are interested in, we have to use different coding
techniques and standards.
I.4 SPEECH CODING STANDARDS
Standards for the landline Public Switched Telephone Network (PSTN) are established by the
International Telecommunication Union (ITU), while standards such as those of the Moving Picture
Experts Group (MPEG) are developed by the International Organization for Standardization (ISO).
The ITU has promulgated a number of important speech and waveform coding standards at high bit
rates and with very low delay, including:
- G.711: it standardizes PCM at 64 kb/s, in which a logarithmic (non-uniform) quantization is
used to discretize the amplitudes with 8 bits per sample;
- G.721: it standardizes ADPCM, halving the bit rate to 32 kb/s while maintaining the
same encoding quality;
- G.722: it standardizes a 64 kb/s wideband coder using two sub-band ADPCM coders, one for the
0-4 kHz band (48 kb/s) and the other for the 4-7 kHz band (16 kb/s);
- G.723.1: it provides two operating rates, one at 6.3 kb/s and the other at 5.3 kb/s.
Standards for cellular telephony in Europe are established by the European Telecommunications
Standards Institute (ETSI). The ETSI has standardized speech coding algorithms for digital mobile
communication, published by the Global System for Mobile Communications
(GSM) subcommittee. All speech coding standards for digital cellular telephony are based on
LPC-AS algorithms.
The first GSM standard coder was based on a precursor of CELP called regular-pulse excitation
with long-term prediction. In 1999, a new mobile network with global coverage called
UMTS (Universal Mobile Telecommunication System) was designed; for it, the ETSI proposed a new
coding standard called AMR (Adaptive Multi-Rate), as it uses an encoder that generates
adaptive traffic flows at eight different rates (from 12.2 kb/s down to 4.75 kb/s) as a function
of the operating conditions.
In Table 2 we can see some applications of illustrated standards.
Table 2: Speech coding for several applications.
Application          Bandwidth (kHz)   Bit rate (kb/s)   Standards organization   Standard number   Algorithm   Year
Landline telephone   3.4               64                ITU                      G.711             PCM         1988
Video conferencing   7                 64 (48+16)        ITU                      G.722             ADPCM       1988
Digital cellular     3.4               8                 ITU                      G.729             ACELP       1996
Digital cellular     3.4               12.2              ETSI                     EFR               ACELP       1997
VoIP                 3.4               5.3-6.3           ITU                      G.723.1           CELP        1996
I.5 PARAMETRIC REPRESENTATIONS
Parametric representations are concerned with representing the speech signal by means of some
parameters obtained by analyzing the speech signal spectrum.
The idea is that a sampled speech signal contains a great deal of information that is either
redundant (nonzero mutual information between successive samples) or perceptually
irrelevant (information that is not perceived by human listeners): if we succeed in describing the
signal by means of some of its characteristic parameters, we can transmit these parameters instead of
the signal itself. The parameters of a signal change much more slowly than the signal they describe,
and there are few of them, so we can save bandwidth and increase the transmission speed. In
this way we obtain a lot of benefits in terms of transmitted data, memory saved, transmission power
and speed, and so on.
Clearly, there are drawbacks in terms of quality: the output signal is not a reconstruction of the
input signal based on its samples, but on some parameters that describe it indirectly;
we are not able to recreate the original speech, but only a somewhat dehumanized version of it.
In order to obtain these parameters, we need to describe speech production with a
mathematical model, like the one introduced in Section I.1. This model is called the source-filter model
because we model the vocal cords as a source and the vocal tract as a resonant cavity. The vocal tract
filters the sound energy by suppressing some components of the glottal wave and amplifying others,
the ones close to the resonance frequencies of the vocal tract, which depend on its shape
and length. In this way, it changes the sound quality of the complex wave produced by the
sound source.
Figure 8: Source – Filter model.
The next step consists in representing voiced and unvoiced signals.
If we segment the speech signal into frames of small duration, within which it can be considered
stationary, as seen in Section I.1, we can examine one frame at a time. The duration of a
single frame must be short enough that the properties of the sound do not change significantly
within it, but long enough to allow the computation of the parameters we want to estimate (which is also
useful for reducing the effect of any noise affecting the signal). Moreover, the series of windows
should cover the entire signal, as shown in Figure 9.
Figure 9: Framed speech signal.
We can use different types of window, and this choice clearly influences the quality of the analysis. The
simplest one is the rectangular window, but it can produce large fluctuations of the
parameters we are interested in. For example, if we're measuring the energy of the signal and the
frame shifts, the part of the signal contained in the new frame can assume higher values than
in the previous frame, causing a big difference in the measured signal energy. An
alternative to the rectangular window is the Hanning window: by tapering the ends of the window, we
avoid large effects on the parameters even if the signal changes suddenly. Both windows are
shown in Figure 10.
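A quick way to see the difference between the two windows, using NumPy's built-in Hanning window (the frame length is an arbitrary choice):

```python
import numpy as np

N = 160                         # e.g. a 20 ms frame at 8 kHz
rect = np.ones(N)               # rectangular window: abrupt edges
hann = np.hanning(N)            # Hanning window: tapered ends

# The Hanning window falls smoothly to zero at both edges, so a sample
# near a frame boundary contributes little to the measured parameters.
edge_weight = hann[0]           # weight at the frame edge
center_weight = hann[N // 2]    # weight at the frame center
print(edge_weight, center_weight)
```

With the rectangular window every sample, even one right at the boundary, contributes with full weight, which is exactly what causes the fluctuations described above.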
Figure 10: Different types of windows.
If we look at a single frame, we know that we can model the voiced signal as a periodic signal
and the unvoiced one as white Gaussian noise. In this way, we can model the signal
source (the vocal cords) as two distinct signal sources.
Looking at Figures 9 and 11, we can see that there is an overlap between adjacent frames: this
allows us to predict the trend of the signal in the next frame by studying its trend in the current
frame.
Figure 11: Overlapping frames.
One of the most powerful speech analysis techniques, and one of the most useful methods for
encoding good-quality speech at a low bit rate (only 2 kb/s!), is Linear Predictive Coding (LPC).
It is defined as a method for encoding an analog signal in which the value of the current speech
sample is estimated from its past few sample values. This is possible because successive frames
are usually overlapped.
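The basic LPC prediction step can be sketched as follows; the predictor coefficients here are invented for illustration, not estimated from real speech:

```python
import numpy as np

def lpc_predict(past, a):
    """Predict the current sample as a weighted sum of the p most
    recent samples: s_hat[n] = sum_k a[k] * s[n-k]."""
    # past holds [s[n-p], ..., s[n-1]]; reverse so a[0] weights s[n-1]
    return float(np.dot(a, past[::-1]))

a = np.array([1.3, -0.5])        # order-2 predictor (hypothetical coefficients)
past = np.array([0.8, 1.0])      # the two previous samples s[n-2], s[n-1]
s_hat = lpc_predict(past, a)     # prediction of the current sample s[n]
print(s_hat)
```

In a real coder the coefficients are estimated per frame (e.g. by the autocorrelation method), and only the coefficients plus the excitation description are transmitted.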
In Figure 12 we can see a schematization of an LPC model.
Figure 12: Linear Prediction Coding model.
First of all, we're interested in understanding whether a voiced or an unvoiced signal has been produced:
according to the analysis of the frame, a switch selects the right source. We can't neglect the
gain, which acts as a multiplicative factor on the signal, so the model includes a multiplier. Then, we have
to characterize the filter, which models the effect of the vocal tract; the characteristic parameters of
this filter, such as the gain and the cut-off frequency, depend on the specific frame, change when
we consider different frames, and can be estimated using different methods, such as the one of
interest here, LPC.
The parameters that describe the model, and thus the speech signal, are transmitted to the
receiver. The receiver needs to be set up with the same model configuration to re-synthesize a
version of the original signal spectrum and recreate the speech: it carries out LPC synthesis
using the received parameters and builds a source-filter model that, when provided with the correct
input, reconstructs the original speech signal with good accuracy.
LPC is generally used for speech analysis and resynthesis. Some applications of this technique are:
- Phone companies (GSM standard);
- Vocoders;
- Secure Wireless;
- Audio codecs.
I.6 WAVEFORM REPRESENTATIONS
Waveform representations (also called standard representations) are concerned with preserving the
wave shape of the analog speech signal, in order to transmit a faithful representation of it:
they are characterized by great quality. That implies a lot of data to transmit and a lot of power
required, but also a simple structure based on the well-known sampling and quantization techniques:
that is why they are not specific to speech signals and can be used for any type of signal.
As we can see in Figure 6, there are two different types of waveform coders: time-domain
waveform coders and frequency-domain waveform coders. They are based essentially on the
same idea, representing the speech signal by the set of its samples, but they differ in the
techniques used to implement it. We'll focus on the first class of techniques and in
particular on Pulse Code Modulation and its variants: Linear PCM, Logarithmic PCM,
Differential PCM (DPCM) and Adaptive Differential PCM (ADPCM).
In Figure 13, the general scheme of a waveform representation technique is shown. As
anticipated, this block diagram is quite general and suitable for schematizing many types of
applications; however, for speech processing there is a series of standards that govern
every single step according to the application of interest.
Figure 13: Block diagram of a generic waveform representation technique.
Let’s analyze the sampler and the A/D conversion blocks more in detail, as shown in Figure 14.
Figure 14: Detailed block diagram of sampling and A/D conversion.
The sampler turns a continuous-time signal into a discrete-time signal. In order to keep
the value of the signal constant for the time the following circuits need to convert it, a Sample
and Hold technique is used.
The most used sampling technique is Pulse Amplitude Modulation (PAM). It is an analog,
impulsive modulation technique: the modulating signal is an analog signal and the
carrier is a train of pulses, whose rate depends on the Nyquist criterion and whose duration
depends on the time required for A/D conversion. In Figure 15 the PAM modulation is shown.
Figure 15: Pulse Amplitude Modulation; A) Analog signal (modulating signal), B) Train of pulses (carrier), C) Sampled
signal (modulated signal).
Since audio signals can extend from 20 Hz up to 20 kHz, we should sample them at least at 40
kHz (according to the Nyquist criterion). However, we said that the energy of a speech signal is
concentrated in the first 4 kHz, so we can adapt the sampling rate, for
example, to the bandwidth of telephone channel lines, which goes from 300 Hz to 3.4 kHz. As we well
know, the output signal of a telephone line is a clear and pleasant sound!
In this way we could sample at any frequency greater than 6.8 kHz; international
standards fixed the telephone sampling rate at 8 kHz, so the sampling period lasts 125
μs. It means that the speech signal can be faithfully reconstructed if we take at least 8000
samples per second:

f_s = 8 kHz  →  T_s = 1/f_s = 125 μs
The sampled signal is a discrete-time signal with continuous amplitude. The A/D converter performs
quantization and coding of the amplitude of each pulse of the modulated signal. The number of bits
assigned to the encoding has been fixed by international standards at 8 bits, so, since we
have to transmit 8000 samples per second, we work at 64 kb/s. The type of code used depends on
the nature of the communication channel and on the transmission speed. As anticipated, any bit rate
below 64 kb/s is treated as compression, and the output of the source encoder is an encoded speech
signal with a bit rate lower than 64 kb/s.
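The arithmetic behind these standard figures is simple:

```python
fs = 8000          # samples per second (telephone standard)
bits = 8           # bits per sample (G.711)

sampling_period_us = 1e6 / fs    # sampling period in microseconds
bit_rate = fs * bits             # resulting bit rate in bit/s

print(sampling_period_us)        # 125 microseconds
print(bit_rate)                  # 64000 bit/s, i.e. 64 kb/s
```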
In the end, we apply source and channel encoding, a set of operations that simplify
the transmission of data over the channel. For example, source encoding allows a greater number
of traffic flows to converge simultaneously on a single physical medium.
The decoding step performs more or less the same operations in reverse, in order to obtain a
faithful version of the analog input speech signal at the output.
Waveform coders are most useful in applications that require the successful coding of both voiced
and unvoiced signals. In the Public Switched Telephone Network (PSTN), for example, the successful
transmission of modem and fax signaling tones and of switching signals is nearly as important as the
successful transmission of speech.
I.7 METHODS OF COMPARISON OF SPEECH CODING TECHNIQUES
If we want to compare a parametric technique with a waveform technique (or with a hybrid
technique), we need some indicators of the intelligibility and quality of the speech produced by each
coder. There are two methods for evaluating the quality of a speech signal that has undergone a
compression process:
1) Subjective methods: they are the most significant and reliable methods of comparison, but
they are also very expensive and require long test development times. The most commonly
used parameter is the MOS (Mean Opinion Score), which represents the average opinion
rating of a group of listeners. To establish the MOS of a coder, listeners are asked to
classify the quality of the encoded speech in one of five categories, each characterized by a
numerical value:
1- Bad
2- Poor
3- Fair
4- Good
5- Excellent
We can consider a speech coder a good one if it is characterized by a MOS greater than
3.5-4. In Figure 16 we can see how the MOS of different types of speech coders varies
as the bit rate increases.
Figure 16: Comparison of different types of speech coding techniques.
2) Objective methods: these methods are used in the initial phase of a codec project. They
provide several analytical measurements; the most important one is the Signal-to-Noise
Ratio (SNR), the ratio between the power of the input signal and the power of the coding
error. Objective measures have the advantage of being able to be carried out
automatically (and therefore on very large databases); moreover, they don't depend on the
preferences of the listeners. The main problem with objective measures is that, especially for
coders operating at low bit rates, they are not well correlated with the perceived quality of
the speech signal.
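As a toy illustration of how a MOS is obtained from a listening panel (the ratings below are invented):

```python
# Hypothetical ratings from a listening panel, on the 1 (Bad) to 5 (Excellent) scale
ratings = [4, 4, 3, 5, 4, 3, 4, 4]

mos = sum(ratings) / len(ratings)   # Mean Opinion Score: the panel average
acceptable = mos >= 3.5             # the usual threshold for a "good" coder

print(mos)
```

A real MOS test prescribes controlled listening conditions and many listeners; this only shows the final averaging step.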
CHAPTER II: PULSE CODE MODULATION TECHNIQUES
In this chapter we'll analyze the different Pulse Code Modulation techniques and their application
to speech coding, in order to point out the benefits and drawbacks of each technique and the differences
among them. Then, we'll analyze the Time Division Multiplexing technique, a very
important application based on Pulse Code Modulation.
II.1 PULSE CODE MODULATION
Pulse Code Modulation (PCM) is a method used to digitally represent sampled analog signals;
in other words, it's a quantization technique that is usually applied to PAM signals, as
anticipated in the previous chapter, Section I.6.
The quantization is an operation which, given a continuous amplitude signal, returns a discrete
amplitude signal. The set of discrete amplitudes depends on the range of values assumed by the
input signal and on the number of bits, N, of the quantizer. In the case of interest, in which the input
signal is a PAM signal, quantization affects the amplitude of each sample, by comparing it with the
different levels of quantization and by rounding it to the closest level. The quantization is an
operation which introduces an error called quantization error, due to the rounding or the
truncation of the signal amplitude: it’s the difference between the real analog signal (A) and the
quantized digital value of the same (A’).
e = A − A′
The quantization error can be quantified by evaluating the Signal to Noise Ratio (SNR):

SNR = P_s / P_e

This quantity is usually expressed in dB, so:

SNR|dB = 10 · log10(P_s / P_e)
In order to compute the power of a signal s(t), we need to know its probability density function, ρ(s), because:

P_s = ∫ s² ρ(s) ds (from −∞ to +∞) = σ²

where σ² is the variance of the (zero-mean) signal. As explained in the previous chapter, Section I.1, the speech signal probability density function can be approximated with a Gaussian; taking the full dynamic range S to span ±3σ, we have:

P_s = σ² = S² / 36

And then:

SNR|dB = 10 · log10( S² / (36 · P_e) )
Since this parameter depends on the signal power and, therefore, on the signal amplitude, and since speech
signals are usually low-level signals, we can expect the SNR to be very low. As we can see in Figure 17, the SNR related to speech signals is lower than the one computed for sine or square input signals. We will see that the SNR improves as the number of bits increases; in particular, for each bit we add, we improve the SNR by 6 dB.
Figure 17: SNR for different types of waveforms.
II.2 LINEAR PCM
Linear PCM, or uniform PCM, is the name given to the quantization algorithm in which the
reconstruction levels are uniformly distributed over the PAM range of values [0; S] (we consider a unipolar signal, but nothing changes if we consider a bipolar signal whose dynamic belongs to a
range [−V; V]). It means that we divide the signal dynamic range into a number M = 2^N of intervals having the same amplitude A_Δ; this interval amplitude is also called quantization step and is equal to:

A_Δ = S / 2^N
As we can see in Figure 18, the ideal quantization characteristic is a step function.
Figure 18: Ideal transfer function of a linear quantizer.
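The staircase characteristic can be sketched in a few lines (a minimal mid-tread uniform quantizer over the unipolar range [0, S]; function and variable names are my own):

```python
def uniform_quantize(sample, full_scale, n_bits):
    """Round a sample in [0, full_scale] to the nearest of the 2**n_bits
    uniformly spaced reconstruction levels."""
    levels = 2 ** n_bits
    step = full_scale / levels      # quantization step: S / 2**N
    index = round(sample / step)    # nearest-level code
    index = min(index, levels - 1)  # clamp the top edge into range
    return index * step             # reconstructed (quantized) amplitude

print(uniform_quantize(0.37, 1.0, 3))  # step is 0.125, so 0.37 -> 0.375
```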
As anticipated in the previous section, the quantization introduces an error that's intrinsic to the
process: it can't be deleted, but only reduced. This error is due to the rounding or the truncation of
the signal amplitude and can be expressed as the difference between the real analog signal and the
quantized digital value of the same.
In the case of LPCM, the error has a sawtooth trend (as shown in Figure 19) and its maximum
value is:

e_max = A_Δ / 2
Figure 19: Quantization error.
The advantage of LPCM is that the quantization error affects the whole signal dynamic in the same
way; this property is desirable in many digital audio applications. However, we can easily
understand that the quantization error is much more relevant when the signal has low amplitude
than when it is high: a low-power signal is affected by the quantization error much more than a
high-power one. It means that high-level signals are quantized with good precision, while
low-level signals are quantized poorly: in general, we want to quantize all signal levels with the
same precision.
Let's evaluate the SNR|dB.
Since the quantization error has a sawtooth trend, we can approximate it with a triangular
waveform, whose probability density function is uniform over the support (−A_Δ/2; A_Δ/2], and easily calculate:

P_e = ∫ e² · (1/A_Δ) de (from −A_Δ/2 to A_Δ/2) = A_Δ² / 12 = S² / (12 · 2^(2N))

So, we obtain:

SNR|dB = 10 · log10( (S²/36) · (12 · 2^(2N) / S²) ) = 10 · log10( 2^(2N) / 3 ) ≈ 6N − 4.77 dB
As anticipated in Section II.1, we have a very low SNR|dB (a negative term, −4.77 dB, is present, while
for other input signals, such as sine or square waves, the corresponding term is positive), because of the low level of speech signals.
The SNR grows linearly with the input level until the input signal reaches an amplitude higher than the quantizer dynamic: this condition is called overload and is shown in Figure 20.
Figure 20: SNR for Linear PCM.
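We can check the 6 dB-per-bit rule numerically. The sketch below (using NumPy; the Gaussian test signal and its level are arbitrary choices) quantizes the same low-level signal at two bit depths and compares the measured SNR:

```python
import numpy as np

def measured_snr_db(signal, n_bits, full_scale):
    """Uniformly quantize a bipolar signal over [-full_scale, full_scale]
    and return the measured SNR in dB."""
    step = 2 * full_scale / 2 ** n_bits
    clipped = np.clip(signal, -full_scale, full_scale - step)
    quantized = np.round(clipped / step) * step
    noise = signal - quantized
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
speech_like = rng.normal(0.0, 0.1, 100_000)  # low-level Gaussian "speech"
for bits in (8, 9):
    print(bits, round(measured_snr_db(speech_like, bits, 1.0), 1))
# adjacent bit depths should differ by roughly 6 dB
```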
Plus, we can see that if we increase the number of bits, we improve the SNR by 6 dB per bit: the drawback of this choice is clearly the complexity of a system characterized by a high number
of bits.
II.3 LOGARITHMIC PCM
A clever way to improve the accuracy of the quantization technique consists in realizing a non-
linear (or non-uniform) quantization; it means that we consider a quantization step size which is
not constant over the entire dynamic range of the signal but changes according to the level of the
input signal. In this way, we can quantize a low-level signal using a very small quantization step
size and a high-level signal using a wider one, obtaining an acceptable precision
over all PAM signal levels.
Since, in general, we don't know the signal distribution, a good criterion to follow in order to
realize a non-linear PCM is to make the SNR constant over all PAM signal levels. In this way we
can realize a general technique which doesn't depend on the specific signal amplitude.
We can obtain a non-linear quantization by using an analog or a digital process.
In the analog process, the continuous-amplitude PAM signal passes through an analog
compressor before being converted into a digital signal. The compressor is essentially a logarithmic
amplifier that has the task of amplifying the lowest levels of the PAM signal and compressing the
highest ones.
In the digital process the continuous amplitude PAM signal is converted into a digital signal by
using a linear quantization and subsequently it passes through a compressor which modifies the
digital representation using a different number of bits.
In latest-generation systems, the digital non-linear quantization method is the most widely
adopted, for obvious reasons of cost, performance, simplicity of construction and integration with
the other digital equipment. Anyway, both solutions require that the receiving PCM apparatus
contain a unit complementary to the compressor, known as the expander, which can restore the
original analog levels of the information.
For voice signals, whose values are usually very low, we want narrower intervals close to zero, so
the best type of non-linear PCM to adopt is, intuitively, the logarithmic one.
Figure 21: Ideal transfer function of a logarithmic quantizer.
Actually, we can prove this fact in a more rigorous way.
If we look at the block diagram in Figure 22, we can see that the quantization error is generally
seen as an additive error.
Figure 22: Block diagram of a logarithmic quantizer.
Out of our scheme we will have D, the digital signal, which is:

D = S + e

For low-level input signals, whose amplitude is close to zero, we want a quantization error which is
close to zero, so we can express it as:

e = (K − 1) · S

where K is close to 1.
In this way, we can see the additive quantization error as a multiplicative error:
D = S + (K − 1) · S = K · S

This is a very good thing, because the quantization error is relative to the value of the signal and,
since we can see it as a multiplicative error, it changes the output of the system linearly.
As we can see in Figure 23, we have a constant SNR until the overload condition.
Figure 23: SNR for Logarithmic PCM.
II.3.1 A AND μ CONVERSION LAWS
The logarithmic characteristic does not pass through the origin: this fact can be a problem when we
process signals whose level is very close to zero, and that's quite common when we deal with
speech signals.
Two laws have been enacted to standardize the ways to solve this problem:
- A – Law: provides the linearization of the logarithmic characteristic for values close to the
origin, whose amplitude is fixed by a parameter A, as shown in Figure 24.
Figure 24: A – Law.
Mathematically speaking, we have:

F(x) = sgn(x) · A|x| / (1 + ln A)             for |x| ≤ 1/A
F(x) = sgn(x) · (1 + ln(A|x|)) / (1 + ln A)   for 1/A ≤ |x| ≤ 1

This law is the standard in Europe (with A = 87.6).
- μ – Law: provides a translation of the logarithmic characteristic in order to obtain a passage
through the origin, as shown in Figure 25. The translation must be equal to the intercept of
the curve; the transfer function will be:

F(x) = sgn(x) · ln(1 + μ|x|) / ln(1 + μ)
Figure 25: μ – Law.
This law is the standard in the USA and Japan (with μ = 255).
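Both laws are easy to experiment with in software. Here is a sketch of the μ-law curve and its inverse on normalized samples (the continuous-curve form, not the quantized G.711 table; μ = 255 as in the American standard):

```python
import math

MU = 255.0  # value used by the American standard

def mu_compress(x):
    """Map a sample in [-1, 1] through the mu-law characteristic."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Invert mu_compress: recover the sample from its companded value."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

x = 0.01            # a low-level, speech-like sample
y = mu_compress(x)  # pushed toward mid-range by the compressor
print(round(y, 3), round(mu_expand(y), 6))
```

Note how the low-level sample is boosted toward the middle of the range, so it can exploit many more quantization levels than it would under linear PCM.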
These laws introduce an approximation of the logarithmic characteristic: in both cases, in fact, we
don't have a logarithmic trend close to the origin, and that causes a decrease of the SNR, as shown in Figure 26.
Figure 26: SNR of an approximated logarithmic PCM.
In Figure 27, the SNR trend for both linear and logarithmic PCM is shown.
Figure 27: Comparison of the SNR for different PCM.
In order to simplify the analysis of a logarithmic characteristic, we can introduce a piecewise
approximation: the shape of the function is approximated with a set of line segments, one after the
other. We divide the function into segments, which are in turn divided into levels, as shown in Figure 28.
Figure 28: Piecewise approximation.
We can notice the presence of three numbers: the first one is a sign parameter, a bit which
identifies the polarity of the signal; the second one is a segment parameter, which identifies which
line segment we are considering; and the last one is a level parameter, which identifies which point of the
considered segment we are dealing with.
With logarithmic functions, equal distances along the curve represent equal ratios at every point;
in the piecewise approximation we can imagine that this behavior is satisfied globally, but not
locally. This approximation introduces a real drawback: on each segment we have a linear
approximation of the logarithm, which means that the quantization error is constant only within each
segment. This is a problem because two points near a segment boundary, one just to its right and
one just to its left, have almost the same amplitude but very different quantization errors, even
though the points are close.
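The sign/segment/level splitting can be sketched as follows (a simplified scheme in the spirit of the G.711 piecewise format, with one sign bit, a 3-bit segment field and a 4-bit level field; the exact G.711 bit layout differs):

```python
def segment_encode(sample, n_level_bits=4, full_scale=2048):
    """Split an integer sample into (sign, segment, level): the quantization
    step doubles from one segment to the next, so each segment spans one
    octave of amplitude."""
    sign = 0 if sample >= 0 else 1
    mag = min(abs(sample), full_scale - 1)
    segment = max(mag.bit_length() - n_level_bits, 0)     # which line segment
    level = (mag >> segment) & ((1 << n_level_bits) - 1)  # position inside it
    return sign, segment, level

def segment_decode(sign, segment, level):
    """Rebuild the amplitude to within one (segment-dependent) step."""
    return (-1 if sign else 1) * (level << segment)

print(segment_encode(100), segment_decode(*segment_encode(100)))  # (0, 3, 12) 96
```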
The actual behavior of the signal-to-noise ratio is shown in Figure 29: there are ripples where we
expected a flat behavior; these ripples are 6 dB wide, because of the considerations made
above.
Figure 29: SNR of a piecewise approximated Log-PCM.
II.4 DIFFERENTIAL PCM
The basic problem with the previous types of PCM is that the quantizer works on a fixed dynamic
range, while the speech signal, by its nature, is usually very low. It means that we effectively work
with a number of bits smaller than the number of bits of the quantizer: we don't use the
bits associated with the higher values of the dynamic range. In the following section
we'll see how to solve this problem in a very clever way that allows us to drastically reduce the bit
rate; clearly, this also implies a reduction of the speech signal quality.
These methods are based on the observation that consecutive samples are often correlated. This
allows two considerations:
1) if samples are correlated, we can predict, in a more or less precise way, the value of a sample
from an estimation of previous samples;
2) correlated samples contain redundant information that is not useful, so we can delete it
and obtain a faster transmission.
Differential pulse-code modulation (DPCM) is a signal encoder that uses the baseline of pulse-code
modulation (PCM) but adds some functionalities based on the ideas exposed above. DPCM was invented
by C. Chapin Cutler at Bell Labs in 1950.
This method provides the coding of the difference between an input signal and its predicted
value, estimated by an evaluation of previous input samples; in other words, we code the prediction
error. If the difference (so, the error) is small, it means that the two samples are strongly correlated
and we can remove the redundant information; plus, the number of bits required for transmission is
reduced. In this way, we can obtain compression ratios on the order of 2 to 4 and drastically
reduce the bit rate: clearly, as we know well, this comes with a drawback in terms of quality of the
obtained signal, which will still be clear and understandable, but no longer as pleasant.
It’s a predictive form of coding, because we have to predict current sample value based upon
previous samples.
The general block diagram of a transmitter and receiver system based on DPCM is shown in
Figure 30.
Figure 30: Transmitter and Receiver for a DPCM.
Let’s analyze them.
The Predictor is a unit whose task is to predict the quantized value of the current input sample,
by an estimation which depends on previous samples that were normally quantized,
and a prediction factor. We can intuitively understand that we can do a better estimation if we
consider many samples rather than only one. In order to consider more samples, we need to
consider a framed input signal, so all the samples which occur in a frame are used to do a new
estimation. Let's assume that K is the number of samples in a frame; the choice of the value of K is
critical because:
- If K is low, we have fewer samples and they are more correlated: they contain a lot of
redundant information that we can delete. However, we have few parameters to deal with, so
the complexity of the circuit is reduced, but the precision of the estimation decreases.
- If K is high, we have more samples and they may have very different values, so they are less
correlated. We can do a good estimation, but the complexity of the circuit increases and
its efficiency decreases, because if we have uncorrelated samples, there's no redundant information to
remove.
We have to choose K in such a way that a possible error will weigh little on the estimate and, at the
same time, that we can work with samples which are correlated. A clever idea that's usually
implemented is to multiply the frame by an exponential window that gives more weight to the
closest samples and reduces the weight of the more distant ones.
The predicted value is a function of those K previous samples:

x̂[n] = f( x[n−1]; x[n−2]; …; x[n−K] )
The difference between the input signal and its predicted value is called prediction error and it's the
input of the quantizer:

e[n] = x[n] − x̂[n]

If the prediction was right, so that x[n] = x̂[n], this signal is zero: the redundant information is deleted. Otherwise, it's a very low-level signal and we need few bits to quantize it. In this way, the bit rate is
strongly reduced: if we use, for example, 4 bits per sample at an 8 kHz sampling rate, we work at 32 kb/s.
In the receiver the process is reversed: we need to decode the input signal and then add the
predicted signal.
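The whole loop can be sketched with a fixed first-order predictor and a crude uniform quantizer of the error (all parameter values are illustrative, not taken from any standard):

```python
def dpcm_encode_decode(samples, a=0.9, step=0.05):
    """Code each sample as the quantized difference from a first-order
    prediction, then rebuild the signal the way the receiver would."""
    predicted = 0.0
    reconstructed = []
    for x in samples:
        error = x - a * predicted             # prediction error
        q_error = round(error / step) * step  # coarse quantization of the error
        value = a * predicted + q_error       # receiver-side reconstruction
        reconstructed.append(value)
        predicted = value                     # both ends track the same state
    return reconstructed

signal = [0.0, 0.10, 0.19, 0.27, 0.33]        # slowly varying, hence correlated
print([round(v, 2) for v in dpcm_encode_decode(signal)])
# the reconstruction tracks the input to within half a quantization step
```

Because the predictor runs on the reconstructed (quantized) history, transmitter and receiver stay in lockstep and the quantization error does not accumulate.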
Now we have to understand how the Predictor works.
Usually, the prediction is linear: it means that the predicted value is a weighted linear
combination of previous quantized samples:

x̂[n] = A·x[n−1] + B·x[n−2] + C·x[n−3] + …

where A, B, C, … are real-valued coefficients.
A block diagram for a linear Predictor is shown in Figure 31.
Figure 31: Block diagram of a Linear Predictor.
The values of the coefficients depend on the autocorrelation function of the signal in the frame we're
analyzing, so they are not constant numbers. Their optimum values are the ones which minimize the
prediction error power:

(A, B, C, …) = argmin P_e

We can obtain them by using variable-gain amplifiers. The D units are delay units.
If the prediction error power is low, we can reduce the number of bits we need to quantize the prediction error.
The complete block diagram for a linear Prediction DPCM coder is shown in Figure 32.
Figure 32: Transmitter with Linear Predictor for a DPCM.
II.5 ADAPTIVE DIFFERENTIAL PCM
Adaptive differential pulse-code modulation (ADPCM) is a variant of DPCM that introduces some
improvements in order to obtain a further reduction of the bit rate. Actually, this technique can be
applied not only to DPCM but also to standard PCM, LPCM, Log-PCM and so on. It was
developed in the early 1970s at Bell Labs for voice coding, by P. Cummiskey, N. S. Jayant
and James L. Flanagan.
The basic idea of ADPCM is to adapt the quantization step to the effective dynamic of the
signal we want to deal with. If we consider a DPCM, we can say that if the difference input signal
is low, ADPCM decreases the quantization step, so it can quantize this small value with better
precision. Otherwise, if the difference signal is high, ADPCM increases the quantization step, in
order to cover the entire dynamic.
With this technique we can use as little as one bit per sample, succeeding in working at bit rates as low as 8
kb/s! However, ADPCM cannot produce satisfactory quality when the bit rate is lower than 16 kb/s.
We can go down to 4 kb/s, but we lose the quality of the signal and, in particular, the possibility
to recognize the speaker.
To further reduce the bit rate, we need to use speech signal parametric representations.
To further reduce the bit rate, we need to use speech signal parametric representations.
The basic block diagram for an ADPCM transmitter is shown in Figure 33: it's very similar to the
block diagram of a DPCM shown in Figure 30, except for the presence of the interconnection
between the Predictor and the Quantizer, due to the adaptation of the quantization step.
Figure 33: Transmitter for an ADPCM.
In order to adapt the quantization step to the dynamic of the signal, we can use a multiplier that
changes it according to the Predictor's output. There are two basic and conflicting requirements
in the design of the step-size multiplier: the need for a fast response and the prevention of
excessive step-size alterations in a stationary or steady-state situation (in which no step change is
required).
There are two types of ADPCM configuration:
- Adaptive Quantization Forward: the prediction is estimated on samples which haven't
yet been quantized;
- Adaptive Quantization Backward: the prediction is estimated on samples which have
already been quantized.
In the adaptive quantization forward technique, input samples are memorized in a buffer and
sent to the Prediction unit; then, the quantization step is changed. This technique can be
implemented with a very simple structure but has two limits:
1) it introduces a delay related to the memorization of samples in the buffer and an additional
amount of side information to send, so it makes the data rate higher;
2) there's no possibility to recover the analog signal directly, so we have problems in the
realization of the receiver.
Figure 34: Adaptive Quantization Forward.
These problems are solved by the adaptive quantization backward technique, because it is
implemented by using a feedback configuration. The buffer is placed outside the quantizer, so
it doesn't introduce any delay; plus, the predictor and quantizer information does not need to be
transmitted: the "side information" data rate is lower! The receiver can be realized by just inverting the
process. However, this technique is less precise, because we adapt the quantization step according to
past frames.
Figure 35: Adaptive Quantization Backward.
II.6 TIME DIVISION MULTIPLEXING
Nowadays, PCM is the standard form of digital audio in computers, CDs, digital telephony and other
digital audio applications. This technique was proposed around the 1930s-1940s, when there was the
need to increase the number of long-distance telephone connections. This requirement,
however, conflicted with the difficulties and costs associated with large conductor bundles,
which were very bulky and difficult to connect. So it was decided to multiplex a large number of
telephone connections on a single coaxial cable. This gave rise to the Time Division Multiplexing
(TDM) technique, a very modern and efficient digital technique based on PCM.
A general scheme of TDM technique is shown in Figure 36.
Figure 36: Time Division Multiplexing.
We have n channels, a switch that selects one of them and a single coaxial cable through which
the information is transmitted to the receiver. The demultiplexing unit selects the channel that
corresponds to the source channel and then the transmission is complete.
The idea of TDM is based on the ability to sample a speech signal and, at the same time,
during a sampling period, transmit other speech signals on other channels. Consider the same
sampling period (T = 125 μs) for each channel; it is divided into n time intervals called time slots; in each time slot, the system transmits the sample generated by one of the n channels. All
channels are served cyclically: it means that each channel transmits samples with a period T = 125 μs. In each sampling period, a word of n samples is sent to the receiver.
We can better understand this process looking at Figure 37.
Figure 37: Time Division Multiplexing; transmission.
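The round-robin interleaving can be sketched as a toy frame assembler (channel contents invented for the example):

```python
def tdm_frames(channels):
    """Interleave equal-length channel sample streams into TDM frames:
    frame k carries sample k of every channel, one sample per time slot."""
    assert len({len(c) for c in channels}) == 1, "streams must be equal length"
    return [[c[k] for c in channels] for k in range(len(channels[0]))]

ch_a = ["a0", "a1", "a2"]
ch_b = ["b0", "b1", "b2"]
ch_c = ["c0", "c1", "c2"]
print(tdm_frames([ch_a, ch_b, ch_c]))
# [['a0', 'b0', 'c0'], ['a1', 'b1', 'c1'], ['a2', 'b2', 'c2']]
```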
The receiver receives a continuous flow of information: in order to allow a correct decoding
and association of each sample to its channel, we need to send the receiver information
about the time duration of the sampling period (the length of each word associated with each period)
and the channel associated with each received sample. That's why the information transmitted over
the coaxial cable contains, besides the serial data lines (DATA), a reference frequency signal
(CLOCK) and a frame synchronization signal (FRAME).
Actually, the clock information may or may not be present according to the type of TDM; there
are, in fact, two different types of TDM: synchronous and asynchronous. In synchronous TDM, the
clock provides the synchronism of bits, while the phase provides a time reference to identify the slot
in which each device is enabled to transmit or receive information.
The number of channels and the rates are established by international standards.
The European TDM allows 32 channels to pass simultaneously on a single coaxial cable without,
of course, interfering with each other. Of the 32 multiplexed channels, 30 are voice channels (calls)
and 2 are service channels: channel no. 0 is used to send the clock information to the receiver and
channel no. 16 carries the phase information. It applies the G.711 standard: uniform or logarithmic (A-law) PCM at 64 kb/s (sampling rate of 8 kHz and 8 bits per code). That means that we obtain a rate of

R = (32 · 8 bits) / 125 μs = 2.048 Mb/s
The American TDM allows 24 channels plus a single service bit: all channels are
dedicated to calls. It applies logarithmic (μ-law) PCM at 64 kb/s (sampling rate of 8 kHz and 8
bits per code), so we obtain a rate of

R = (24 · 8 + 1) bits / 125 μs = 1.544 Mb/s
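Both frame rates are easy to double-check with integer arithmetic:

```python
FRAME_PERIOD_US = 125  # one sampling period T, in microseconds

def tdm_bit_rate(channels, bits_per_sample, extra_bits=0):
    """Bits carried in one 125 us frame, converted to bits per second."""
    bits_per_frame = channels * bits_per_sample + extra_bits
    return bits_per_frame * 1_000_000 // FRAME_PERIOD_US

print(tdm_bit_rate(32, 8))     # European: 2048000 b/s = 2.048 Mb/s
print(tdm_bit_rate(24, 8, 1))  # American: 1544000 b/s = 1.544 Mb/s
```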
TDM can also be used within Time Division Multiple Access (TDMA), where stations sharing the
same frequency channel can communicate with one another.
An example of application that utilizes both TDM and TDMA is GSM.
APPENDIX
In the following you can see some real speech signals.
The images were obtained in LED 2, II floor of "Cittadella", Politecnico di Torino.
Instruments: Hameg 1004-3 analog oscilloscope, microphone, power supply.
In Figure 38, two signals are shown in order to point out the difference between a voiced and an
unvoiced sound.
Figure 38: Difference between voiced signals and unvoiced signals.
In Figure 39, an amplitude modulated signal is shown: it’s obtained by varying the tone of the
vowel “a”.
Figure 39: Amplitude modulation of the tone “a”.
In Figure 40, we can see four different signals; they all correspond to the word "Hello", spoken
by four different people.
Figure 40: “Hello” word spoken by four different people.
BIBLIOGRAPHY
Texts:
L.R. Rabiner & R. W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, 1978. ISBN
0-13213603-1.
D. Del Corso, Elettronica per Telecomunicazioni, McGraw-Hill, 2002. ISBN 88-386-0832-6.
Jerry D. Gibson, Digital Compression for Multimedia: Principles and Standards, Elsevier
Science (USA), 1998. ISBN 1-55860-369-7.
(Partial version available on
http://books.google.it/books?hl=it&lr=&id=aqQ2Ry6spu0C&oi=fnd&pg=PR13&dq=Speech+C
oding+Methods,+Standards,+and+Applications+Jerry+D.+Gibson&ots=vJ8yfLOEV3&sig=ovrz
OwYvkCLDU7kgBgusxljWeP0#v=onepage&q=Speech%20Coding%20Methods%2C%20Stand
ards%2C%20and%20Applications%20Jerry%20D.%20Gibson&f=false )
Wiley Encyclopedia of Telecommunications, John Wiley & Sons, 2003. ISBN 978-0-471-36972-1.
ITU-T Recommendations (extract from the Blue Book), ITU, 1988, 1993.
Articles and slides downloaded from web (in May 2014):
M. Hasegawa-Johnson & A. Alwan, Speech Coding: Fundamentals and Applications, University
of Illinois at Urbana-Champaign.
(http://www.seas.ucla.edu/spapl/paper/mark_eot156.pdf )
J. D. Gibson, Speech Coding Methods, Standards, Applications, University of California at Santa
Barbara.
(http://vivonets.ece.ucsb.edu/casmagarticlefinal.pdf )
D. Tipper, Digital Speech Processing, University of Pittsburgh
( www.pitt.edu/~dtipper/2720/2720_Slides7.pdf )
D. P. W. Ellis, An introduction to signal processing for speech, Columbia University, 2008.
It's a chapter of The Handbook of Phonetic Sciences, edited by William J. Hardcastle et al.,
published by Wiley-Blackwell.
(http://academiccommons.columbia.edu/catalog/ac%3A144483 )
P. Cummiskey, Adaptive Quantization in DPCM Coding of Speech, The Bell System Technical
Journal (volume 52, issue 7, pages 1105-1118), 1973.
(http://www.alcatel.hu/bstj/vol52-1973/articles/bstj52-7-1105.pdf )
Websites:
http://en.wikipedia.org/wiki/Speech_processing
http://en.wikipedia.org/wiki/Pcm
http://www.itu.int/rec/T-REC-G.711/_page.print