Ch.5 Fixed-Point vs. Floating Point
5.1 Q-format Number Representation on Fixed-Point DSPs
• 2’s Complement Number
– $B = b_{N-1} \dots b_1 b_0$
– Decimal value: $D = -b_{N-1} 2^{N-1} + \dots + b_1 2^{1} + b_0 2^{0}$
– The dynamic range is limited: an N-bit word covers only the integers from $-2^{N-1}$ to $2^{N-1} - 1$.
• The Q-format can be used to help prevent overflow in multiplication.
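As a small illustration of the decimal-value formula, the sketch below evaluates a two's-complement bit string term by term. The function name `twos_complement_value` and the choice of a string input are assumptions made for this example only.

```python
def twos_complement_value(bits: str) -> int:
    """Decimal value of a two's-complement bit string, MSB first."""
    n = len(bits)
    b = [int(c) for c in bits]        # b[0] is b_{N-1}, b[-1] is b_0
    value = -b[0] * 2 ** (n - 1)      # the sign bit carries a negative weight
    for i, bit in enumerate(b[1:], start=1):
        value += bit * 2 ** (n - 1 - i)
    return value

print(twos_complement_value("0111"))  # 7, the largest 4-bit value
print(twos_complement_value("1000"))  # -8, the smallest 4-bit value
print(twos_complement_value("1111"))  # -1
```

The 4-bit cases show the range limit directly: nothing above 7 or below -8 can be stored.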
5.1 Q-format
• Q-format or fractional representation
– The implied binary point is moved to the left.
– $F(B) = -b_{N-1} 2^{0} + b_{N-2} 2^{-1} + \dots + b_1 2^{-(N-2)} + b_0 2^{-(N-1)}$
– The programmer keeps track of the binary point.
• Example: Q-15
– 16-bit numbers: 1 sign bit and 15 fractional bits.
– Multiplication of two such numbers gives a 32-bit Q-30 number with an extended (duplicate) sign bit.
– The result can be truncated by keeping the most significant 15 fractional bits and dropping the extended sign bit (see Fig. 5.2).
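The Q-15 multiplication described above can be sketched with plain Python integers standing in for 16-bit and 32-bit registers. The helper names (`to_q15`, `from_q15`, `q15_mul`) are hypothetical, and the saturation of the single corner case (-1 times -1) is an added assumption, not something the slides specify.

```python
Q15_ONE = 1 << 15  # scale factor 2**15

def to_q15(x: float) -> int:
    """Quantize x in [-1, 1) to a Q-15 integer (round to nearest, saturate)."""
    v = int(round(x * Q15_ONE))
    return max(-32768, min(32767, v))

def from_q15(v: int) -> float:
    """Convert a Q-15 integer back to a real value."""
    return v / Q15_ONE

def q15_mul(a: int, b: int) -> int:
    """Multiply two Q-15 values and return a Q-15 result."""
    product_q30 = a * b            # Q-30: extended sign bit + 30 fractional bits
    result = product_q30 >> 15     # truncate the 15 least significant bits
    # Assumption: saturate the one case that does not fit, (-1) * (-1) = +1.
    return min(32767, result)

half, quarter = to_q15(0.5), to_q15(0.25)
print(from_q15(q15_mul(half, quarter)))        # 0.125
print(from_q15(q15_mul(to_q15(-0.5), half)))   # -0.25
```

Because both operands have magnitude below 1, the product also stays below 1 in magnitude, which is why the fractional format avoids overflow in multiplication.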
Problems with Q Format
• There can be precision loss with the Q-format—Figure 5-5 illustrates the concept with the Q-12 example.
• Addition and subtraction can still be a problem—scaling can be used to help.
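Both problems can be reproduced in a few lines. This is a sketch only: it assumes Q-12 and Q-15 mean 12 and 15 fractional bits in a 16-bit word, and it does not reproduce the book's Figure 5-5. The helper `quantize` is a name introduced for this example.

```python
def quantize(x: float, frac_bits: int) -> int:
    """Round x to an integer holding frac_bits fractional bits (no saturation)."""
    return round(x * (1 << frac_bits))

# Precision loss: fewer fractional bits give a coarser grid.
x = 0.1
err_q12 = abs(quantize(x, 12) / 2**12 - x)
err_q15 = abs(quantize(x, 15) / 2**15 - x)
print(err_q12, err_q15)

# Addition overflow: 0.75 + 0.5 = 1.25 does not fit in Q-15.
a, b = quantize(0.75, 15), quantize(0.5, 15)   # 24576 and 16384
raw_sum = a + b                                 # 40960, above the 16-bit maximum
overflows = raw_sum > 32767

# Scaling both operands by 1/2 first keeps the sum in range.
scaled_sum = (a >> 1) + (b >> 1)                # 20480, i.e. 0.625 in Q-15
print(overflows, scaled_sum / 2**15)
```

The scaled sum represents half of the true result, so the caller has to remember the extra factor of 2, in the same way the programmer tracks the binary point.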
5.2 Finite Word Length Effects on Fixed-Point DSPs
• Coefficients in digital filters will be saved in fixed-point formats in fixed-point DSP implementations.
• The finite word length quantization effect is similar to input data quantization introduced by an A/D converter.
5.2 Finite Word Length Effects (p.2)
• In IIR filters, the fixed-point representation of the coefficients can cause the poles to shift in the z-plane.
• The amount of shift due to the quantization of a single coefficient is influenced by the positions of all the other poles.
• To reduce this effect, IIR filters are often implemented as a cascade of 2nd order systems.
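The pole shift can be observed on a single 2nd-order section. The sketch below is an illustration under stated assumptions: the section has denominator $1 + a_1 z^{-1} + a_2 z^{-2}$ with ideal poles at $r e^{\pm j\theta}$, the coefficients are rounded to a given number of fractional bits, and the format is assumed to have enough integer bits to hold $a_1 \approx -1.95$. All function names are hypothetical.

```python
import cmath
import math

def quantize(c: float, frac_bits: int) -> float:
    """Round a coefficient to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return round(c * scale) / scale

def upper_pole(a1: float, a2: float) -> complex:
    """Pole of 1 / (1 + a1 z^-1 + a2 z^-2) with non-negative imaginary part."""
    return (-a1 + cmath.sqrt(a1 * a1 - 4.0 * a2)) / 2.0

def pole_shift(r: float, theta: float, frac_bits: int) -> float:
    """Distance between the ideal pole and the pole after coefficient rounding."""
    a1, a2 = -2.0 * r * math.cos(theta), r * r
    ideal = upper_pole(a1, a2)
    actual = upper_pole(quantize(a1, frac_bits), quantize(a2, frac_bits))
    return abs(actual - ideal)

shift_8 = pole_shift(0.98, 0.1, 8)     # coarse coefficients
shift_15 = pole_shift(0.98, 0.1, 15)   # finer coefficients
print(shift_8, shift_15)
```

For this pole pair the shift shrinks as fractional bits are added. In a high-order direct-form filter the same coefficient error would move every pole, which is the motivation for keeping each section at 2nd order.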
5.2 Finite Word Length Effects (p.3)
• The frequency response of the implemented system is also affected by the quantization of coefficients in the difference equation.
• Finally, quantization in a fixed-point implementation can also lead to limit cycles in IIR filters: once the input has returned to zero (for example, after a unit impulse), the output of a stable system can settle into undamped oscillations instead of decaying to zero.
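A limit cycle is easy to reproduce with a first-order recursion. The sketch below assumes a filter $y[n] = a\,y[n-1] + x[n]$ with $a \approx -0.9$ stored in Q-15, integer signal values, and the product rounded back to an integer at every step. These choices are illustrative and are not taken from the slides.

```python
A_Q15 = round(-0.9 * (1 << 15))   # coefficient -0.9 stored in Q-15 (-29491)

def run_fixed(n_steps: int, impulse: int = 10) -> list:
    """y[n] = round(a * y[n-1]) + x[n], with x an impulse of the given height."""
    y, out = 0, []
    for n in range(n_steps):
        x = impulse if n == 0 else 0
        # Add half an LSB, then shift: the product is rounded to an integer.
        y = ((A_Q15 * y + (1 << 14)) >> 15) + x
        out.append(y)
    return out

def run_ideal(n_steps: int, impulse: float = 10.0) -> list:
    """Same recursion in floating point, without rounding the product."""
    y, out = 0.0, []
    for n in range(n_steps):
        y = -0.9 * y + (impulse if n == 0 else 0.0)
        out.append(y)
    return out

fixed, ideal = run_fixed(50), run_ideal(50)
print(fixed[:10])            # [10, -9, 8, -7, 6, -5, 4, -4, 4, -4]
print(abs(ideal[-1]) < 0.5)  # True: the unquantized response has died away
```

The fixed-point output keeps alternating between 4 and -4 indefinitely, while the unquantized response decays toward zero.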
5.3 Floating-Point Number Representation
• The C67x processor supports single-precision and double-precision floating-point representations.
• The formats are shown in Figures 5.6 and 5.7.
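Since the figures are not reproduced here, the sketch below assumes the standard IEEE 754 layouts: 1 sign bit, 8 exponent bits and 23 fraction bits for single precision, and 1, 11 and 52 bits for double precision. The function names are hypothetical; `struct` is Python's standard module for packing binary data.

```python
import struct

def decode_single(x: float):
    """Split a 32-bit float into (sign, biased exponent, fraction) fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def decode_double(x: float):
    """Split a 64-bit float into (sign, biased exponent, fraction) fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

print(decode_single(1.0))        # (0, 127, 0)
print(decode_single(-0.15625))   # (1, 124, 2097152): -1.25 * 2**(124 - 127)
print(decode_double(1.0))        # (0, 1023, 0)
```

For normalized numbers the value is $(-1)^{s} \cdot 1.f \cdot 2^{e - \text{bias}}$, with a bias of 127 for single precision and 1023 for double precision.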
5.4 Overflow and Scaling
• Scaling is the simplest correction method for overflows in fixed-point implementations.
• This can be implemented in most filtering and transform applications.
• The input is scaled down for processing and the output is then scaled back up.
• Right shifting (each one-bit shift divides by 2) is an easy way to implement scaling.
• The shift count can be increased until the overflows disappear from the computations.
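The procedure in these bullets can be sketched as a loop that increases the input right shift until an FIR filter no longer overflows a 16-bit output. This is a minimal sketch under assumptions: Q-15 coefficients, 16-bit integer samples, and overflow detected on the truncated output. The function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def fir_q15(x, h):
    """Direct-form FIR with Q-15 coefficients; returns (outputs, overflow flag)."""
    out, overflow = [], False
    for n in range(len(x)):
        acc = 0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y = acc >> 15                      # back to the 16-bit sample format
        if not INT16_MIN <= y <= INT16_MAX:
            overflow = True
        out.append(y)
    return out, overflow

def scale_until_no_overflow(x, h, max_shift=15):
    """Return the smallest input right shift that avoids overflow, and the output."""
    for shift in range(max_shift + 1):
        y, overflow = fir_q15([v >> shift for v in x], h)
        if not overflow:
            return shift, y                # y is scaled down by 2**shift
    raise ValueError("overflow persists at the maximum shift")

h = [16384, 16384, 16384]                  # three taps of 0.5 in Q-15
x = [30000, 30000, 30000]
shift, y = scale_until_no_overflow(x, h)
print(shift, y)                            # 1 [7500, 15000, 22500]
```

The returned output is smaller than the true output by a factor of `2**shift`; scaling it back up has to happen where a wider word or a smaller signal makes that safe.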
5.4 Overflow and Scaling (p.2)
• Scaling of filter coefficients can also be used to avoid overflows.
• It can be shown that the condition to prevent overflow is
– $\sum_{k=0}^{N} |h[k]| \le 1$
• For IIR filters, N is taken large enough that the remaining values of h[k] are negligible.
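The condition follows from the bound $|y[n]| \le \max|x| \cdot \sum_k |h[k]|$: if the input magnitude is at most 1 and the sum is at most 1, the output magnitude cannot exceed 1. The sketch below checks the sum and rescales the coefficients when needed. The function names, and the first-order IIR example with pole 0.9, are assumptions for illustration.

```python
def l1_norm(h):
    """Sum of coefficient magnitudes, the quantity in the overflow condition."""
    return sum(abs(c) for c in h)

def scale_for_no_overflow(h):
    """Divide the coefficients by their L1 norm if the norm exceeds 1."""
    s = l1_norm(h)
    return list(h) if s <= 1.0 else [c / s for c in h]

# FIR example: three taps of 0.5 sum to 1.5, so they are scaled by 1/1.5.
fir = [0.5, 0.5, 0.5]
fir_scaled = scale_for_no_overflow(fir)
print(l1_norm(fir), l1_norm(fir_scaled))

# IIR example: impulse response 0.9**k truncated at N = 200, where the
# remaining terms are negligible; the sum approaches 1 / (1 - 0.9) = 10.
iir_truncated = [0.9 ** k for k in range(200)]
print(l1_norm(iir_truncated))
```

Scaling by the L1 norm is conservative: it guarantees no overflow for any bounded input, at the cost of reduced signal level and therefore reduced precision.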