
Ch.5 Fixed-Point vs. Floating Point

5.1 Q-format Number Representation on Fixed-Point DSPs

• 2’s Complement Number

– $B = b_{N-1} \dots b_1 b_0$

– Decimal value: $D = -b_{N-1} 2^{N-1} + \dots + b_1 2^{1} + b_0 2^{0}$

– There is a dynamic range limitation.

• The Q-format can be used to help prevent overflow in multiplication.
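
The two's-complement weighting and the dynamic range limit above can be checked numerically. The following is a minimal sketch, assuming a 16-bit word that wraps around on overflow; the helper names `twos_complement_value` and `wrap16` are hypothetical and chosen only for this illustration.

```python
def twos_complement_value(bits: str) -> int:
    """Decimal value D of a two's-complement bit string b_{N-1} ... b_1 b_0."""
    n = len(bits)
    # The sign bit carries the negative weight -2^(N-1).
    value = -int(bits[0]) * (1 << (n - 1))
    # The remaining bits carry the positive weights 2^(N-2) ... 2^0.
    for i, b in enumerate(bits[1:], start=1):
        value += int(b) * (1 << (n - 1 - i))
    return value


def wrap16(x: int) -> int:
    """Wrap an integer to 16 bits, as a non-saturating 16-bit register would (assumed behavior)."""
    return ((x + 0x8000) & 0xFFFF) - 0x8000


print(twos_complement_value("0111"))  # 7, the largest 4-bit value
print(twos_complement_value("1000"))  # -8, the smallest 4-bit value
print(wrap16(30000 + 10000))          # -25536: the true sum 40000 does not fit in 16 bits
```

The last line shows the dynamic range limitation: the sum of two valid 16-bit integers wraps to a negative number.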

5.1 Q-format

• Q-format or fractional representation

– Implied binary point is moved to the left.

– $F(B) = -b_{N-1} 2^{0} + b_{N-2} 2^{-1} + \dots + b_1 2^{-(N-2)} + b_0 2^{-(N-1)}$

– The programmer keeps track of the binary point.

• Example: Q-15

– 16-bit numbers: 1 sign bit and 15 fractional bits.

– Multiplication of two such numbers gives a Q-30 number.

– The result can be truncated to keep the most significant 15 fractional bits, dropping the extended sign bit (see Fig. 5.2).
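
The Q-15 multiplication described above can be sketched as follows. This is an illustrative model using Python integers rather than DSP code; the helper names (`to_q15`, `from_q15`, `q15_mul`) are hypothetical, and the corner case of -1 times -1, whose result does not fit in Q-15, is deliberately left unhandled.

```python
Q15_ONE = 1 << 15  # 2^15: scale factor between a fraction and its Q-15 integer


def to_q15(x: float) -> int:
    """Quantize x in [-1, 1) to a Q-15 integer (round to nearest, then saturate)."""
    v = int(round(x * Q15_ONE))
    return max(-Q15_ONE, min(Q15_ONE - 1, v))


def from_q15(v: int) -> float:
    """Interpret a Q-15 integer as a fraction."""
    return v / Q15_ONE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q-15 numbers and return a Q-15 result."""
    product_q30 = a * b       # 32-bit product: extended sign bit + sign bit + 30 fractional bits
    return product_q30 >> 15  # keep the 15 most significant fractional bits (truncation)


a = to_q15(0.5)    # 16384
b = to_q15(0.25)   # 8192
p = q15_mul(a, b)  # 4096
print(p, from_q15(p))  # 4096 0.125
```

Because both operands have magnitude below 1, the product also has magnitude below 1, which is why fractional multiplication cannot overflow except in the -1 times -1 case noted above.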

Problems with Q Format

• There can be precision loss with the Q-format—Figure 5-5 illustrates the concept with the Q-12 example.

• Addition and subtraction can still overflow; scaling the operands can be used to help.
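
Both problems can be illustrated with a few lines of arithmetic. The sketch below is not a reproduction of Figure 5-5; it assumes 16-bit words, compares the rounding error of the same value stored in Q-15 and in Q-12, and then shows a Q-15 addition that overflows unless both operands are first scaled down by one bit. The helper names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def quantize(x: float, frac_bits: int) -> int:
    """Round x to an integer with the given number of fractional bits."""
    return int(round(x * (1 << frac_bits)))


def fits_int16(v: int) -> bool:
    return INT16_MIN <= v <= INT16_MAX


# Precision loss: fewer fractional bits give a coarser grid.
x = 0.1
err_q15 = abs(quantize(x, 15) / 2**15 - x)
err_q12 = abs(quantize(x, 12) / 2**12 - x)
print(err_q15 < err_q12)  # True: Q-12 is less precise than Q-15

# Addition overflow: 0.75 + 0.5 = 1.25 is outside the Q-15 range [-1, 1).
a = quantize(0.75, 15)            # 24576
b = quantize(0.5, 15)             # 16384
raw_sum = a + b                   # 40960, does not fit in 16 bits
scaled_sum = (a >> 1) + (b >> 1)  # 20480, represents (0.75 + 0.5) / 2
print(fits_int16(raw_sum), fits_int16(scaled_sum))  # False True
```

The scaled sum represents half of the true result, so the factor of 2 has to be tracked by the programmer and restored later.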

5.2 Finite Word Length Effects on Fixed-Point DSPs

• Coefficients in digital filters will be saved in fixed-point formats in fixed-point DSP implementations.

• The finite word length quantization effect is similar to input data quantization introduced by an A/D converter.

5.2 Finite Word Length Effects (p.2)

• In IIR filters, the fixed-point representation of the coefficients can cause the poles to shift in the z-plane.

• The amount of shift due to the quantization of a single coefficient is influenced by the positions of all the other poles.

• To reduce this effect, IIR filters are often implemented as a cascade of 2nd order systems.
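
The pole shift can be made concrete for a single 2nd order section with denominator $z^2 + a_1 z + a_2$. The sketch below assumes a pole pair at radius 0.95 and angle π/20 and a deliberately coarse 6 fractional bits so that the shift is easy to see; these values and the helper names are assumptions for illustration only. It does not compare cascade and direct-form structures.

```python
import cmath
import math


def quantize(x: float, frac_bits: int) -> float:
    """Round x to the nearest multiple of 2^-frac_bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def poles(a1: float, a2: float):
    """Roots of z^2 + a1*z + a2 (poles of a 2nd order section)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2


# Ideal coefficients for a complex pole pair r * exp(+/- j*theta).
r, theta = 0.95, math.pi / 20
a1 = -2 * r * math.cos(theta)
a2 = r * r

ideal = poles(a1, a2)
quant = poles(quantize(a1, 6), quantize(a2, 6))

shift = abs(quant[0] - ideal[0])
print("ideal pole:    ", ideal[0])
print("quantized pole:", quant[0])
print("shift:", shift, "still stable:", abs(quant[0]) < 1)
```

In a 2nd order section the pole radius depends only on $a_2$ and the angle on $a_1$ and $a_2$, so each section's poles are affected only by its own two coefficients.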

5.2 Finite Word Length Effects (p.3)

• The frequency response of the implemented system is also affected by the quantization of coefficients in the difference equation.

• Finally, coefficient quantization can also lead to limit cycles in IIR filters: with no further input applied, the response of a stable system to a unit impulse can settle into undamped oscillations.

5.3 Floating-Point Number Representation

• The C67x processor supports single-precision and double-precision floating-point representations.

• The formats are shown in Figures 5.6 and 5.7.
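
The figures are not reproduced here. As a sketch, assuming the two formats are the standard IEEE 754 layouts (1 sign bit, 8 exponent bits, 23 fraction bits for single precision; 1, 11, and 52 for double precision), the fields of a value can be extracted with Python's `struct` module. The function names are hypothetical.

```python
import struct


def float32_fields(x: float):
    """Return (sign, biased exponent, fraction) of x as a 32-bit IEEE 754 value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def float64_fields(x: float):
    """Return (sign, biased exponent, fraction) of x as a 64-bit IEEE 754 value."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


# -6.5 = -1.625 * 2^2, so the biased exponents are 2 + 127 and 2 + 1023.
print(float32_fields(-6.5))  # (1, 129, 5242880)
print(float64_fields(-6.5))
```

The exponent field is what gives floating point its large dynamic range compared with the fixed binary point of the Q-format.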

5.4 Overflow and Scaling

• Scaling is the simplest correction method for overflows in fixed-point implementations.

• This can be implemented in most filtering and transform applications.

• The input is scaled down for processing and the output is then scaled back up.

• Right shifting (each one-bit shift divides by 2) is an easy way to implement scaling.

• The shifting can occur until the overflows disappear from the computations.
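
The procedure above can be sketched as a loop that right-shifts the input one more bit each time an overflow is detected. This is a simplified model, assuming Q-15 data, an FIR filter, and a 16-bit result word; the function names are hypothetical, and the scaled-back output is held in Python integers, which stand in for a wider word on a real device.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def fir_q15(x, h):
    """FIR filter on Q-15 samples and coefficients; returns (outputs, overflowed)."""
    out, overflowed = [], False
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += (hk * x[n - k]) >> 15  # Q-30 product truncated to Q-15
        if not INT16_MIN <= acc <= INT16_MAX:
            overflowed = True  # the result would not fit in a 16-bit word
        out.append(acc)
    return out, overflowed


def filter_with_scaling(x, h, max_shift=15):
    """Right-shift the input until no overflow occurs, then scale the output back up."""
    shift = 0
    while True:
        scaled = [s >> shift for s in x]  # scale the input down by 2^shift
        y, overflowed = fir_q15(scaled, h)
        if not overflowed or shift == max_shift:
            break
        shift += 1
    return [v << shift for v in y], shift  # scale the output back up


h = [16384, 16384, 16384]  # three taps of 0.5 in Q-15
x = [30000] * 4            # input close to full scale
y, shift = filter_with_scaling(x, h)
print(shift, y)  # 1 [15000, 30000, 45000, 45000]
```

Each right shift discards one least significant bit of the input, so this kind of scaling trades precision for headroom.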

5.4 Overflow and Scaling (p.2)

• Scaling of filter coefficients can also be used to avoid overflows.

• It can be shown that the condition to prevent overflow is

– $\sum_{k=0}^{N} |h[k]| \le 1$

• For IIR filters, N is taken large enough that the remaining values of h[k] are negligible.
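
The condition can be applied by dividing the coefficients by the sum of their magnitudes whenever that sum exceeds 1. The sketch below is a minimal illustration with made-up coefficient values and hypothetical function names; for the IIR case it assumes a first-order filter with impulse response $h[k] = a^k$, $|a| < 1$, and truncates the sum once the terms fall below a tolerance.

```python
def scale_coefficients(h):
    """Scale h so that sum(|h[k]|) <= 1; returns (scaled coefficients, scale factor)."""
    gain = sum(abs(c) for c in h)
    if gain <= 1.0:
        return list(h), 1.0  # already satisfies the condition
    return [c / gain for c in h], gain


def iir_abs_sum(a: float, tol: float = 1e-9) -> float:
    """Sum of |h[k]| for h[k] = a^k, truncated once the terms are negligible (needs |a| < 1)."""
    total, term = 0.0, 1.0
    while abs(term) > tol:
        total += abs(term)
        term *= a
    return total


h = [0.5, 0.75, 0.5, -0.25]  # example FIR coefficients, sum of magnitudes = 2.0
scaled, gain = scale_coefficients(h)
print(scaled, gain)      # [0.25, 0.375, 0.25, -0.125] 2.0
print(iir_abs_sum(0.5))  # close to 1 / (1 - 0.5) = 2
```

The returned scale factor has to be kept so that the output can be scaled back up after filtering.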
