IEEE Floating Point


Upload: jason-thomas

Post on 12-Jul-2015


IEEE 754 Standard for Floating Point Representation of Real Numbers.

Four pieces of information must be represented:
1. Sign of the number. (Always the high-order bit; 0 = positive, 1 = negative.)
2. Magnitude of the number. (Stored in binary with the leading "1" understood. See below.)
3. Sign of the exponent. (Handled by storing the exponent with an offset, or "bias", added to its value. See below.)
4. Magnitude of the exponent. (Stored as unsigned binary, with the bias added.)

I. Float. (32 bits)
Binary template: SEEE EEEE EMMM MMMM ... MMMM
where S = sign bit, E = exponent bits (8 of these), M = mantissa bits (23 of these).
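The template can be read back out of a stored value with shifts and masks. The sketch below is only an illustration: the function name split_float_bits and the sample pattern are assumptions made for this example, not part of the handout.

```python
def split_float_bits(bits):
    """Split a 32-bit pattern into its S, E and M fields (returned as integers)."""
    sign = (bits >> 31) & 0x1        # S: the single high-order bit
    exponent = (bits >> 23) & 0xFF   # E: the next 8 bits
    mantissa = bits & 0x7FFFFF       # M: the low 23 bits
    return sign, exponent, mantissa


# An arbitrary 32-bit pattern, chosen only to exercise the three fields.
sign, exponent, mantissa = split_float_bits(0xC1280000)
print(sign, format(exponent, "08b"), format(mantissa, "023b"))
```

The masks simply count out the template: 1 + 8 + 23 = 32 bits.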

Process to represent:
A. Convert the decimal number to binary.
B. Move the radix point to get the form 1.xxx... * 2^exp.
C. Now, do the work:
i. Assign the sign bit S: positive --> 0, negative --> 1.

ii. Assign the mantissa. This is the fractional part of the 1.xxx... value from step B.
--Ignore the leading "1". It is always 1 in this binary scientific notation (for any nonzero number), so there is no need to store it. The system re-inserts it when the number is used.
--Form the 23 MMM...MMM bits from the first 23 of the remaining "xxx..." bits above. If there are fewer than 23 of them, pad with trailing zeros.

iii. Assign the exponent. The exponent "exp" may be positive or negative. Float stores it in excess-127 notation: you ADD 127 to the exponent and store the result as pure (unsigned) binary. When the number is re-created for use later, 127 is subtracted from the stored value. This avoids the need for 2's complement to handle negative exponents.

iv. Combine the pieces in SEEEEEEEEMMMM...MMM order.
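Steps A through C can be sketched in a few lines of Python. This is a minimal illustration, not a full IEEE 754 encoder. The name encode_float32 is hypothetical, and the sketch assumes a nonzero finite input whose exponent falls between -126 and +127 (the normal range for float). It keeps the first 23 fraction bits exactly as step ii describes, whereas real hardware normally rounds to nearest.

```python
import math


def encode_float32(value):
    """Return the 32-character bit string for value, following steps A-C."""
    if value == 0 or math.isinf(value) or math.isnan(value):
        raise ValueError("this sketch handles only nonzero finite values")

    # Step C.i: sign bit.
    sign = "1" if value < 0 else "0"

    # Steps A and B: frexp gives abs(value) == m * 2**e with 0.5 <= m < 1,
    # so doubling m and lowering e by one yields the 1.xxx * 2^exp form.
    m, e = math.frexp(abs(value))
    significand, exp = m * 2, e - 1
    if not -126 <= exp <= 127:
        raise ValueError("exponent outside the range this sketch supports")

    # Step C.ii: drop the leading 1 and keep the first 23 fraction bits.
    mantissa = int((significand - 1.0) * 2**23)   # int() truncates

    # Step C.iii: excess-127, stored as 8 unsigned bits.
    exponent_bits = format(exp + 127, "08b")

    # Step C.iv: combine as S, then E, then M.
    return sign + exponent_bits + format(mantissa, "023b")


bits = encode_float32(0.15625)          # 0.15625 = 1.01 * 2^(-3) in binary
print(bits)
print(format(int(bits, 2), "08X"))      # prints 3E200000
```

Using math.frexp stands in for the hand conversion to binary; the remaining lines map one-to-one onto steps i through iv.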

Example: Find the representation of decimal -10.5 in float IEEE-754 representation.
A. Convert the number: -10.5 decimal --> -1010.1 binary
B. Move the radix point to get scientific notation: -1010.1 binary --> -1.0101 * 2^(+3) (binary)
C. Now, do the work:
i. Sign bit will be 1 since the number is negative.

ii. Mantissa. The bits of 1.0101 are 10101000...0. Ignore the leading one: 0101000...0. Use the first 23 bits: 01010000000000000000000 --> This is the mantissa.

iii. Exponent is +3. ADD excess-127 offset: 127 + (+3) = 130 decimal = 10000010 binary. These are the 8 bits for the exponent: 10000010.

iv. Combine in S, E (8 bits), M (23 bits) order:

11000001001010000000000000000000

To make it easier to read, group in 4's and convert each group to hex:

original: 11000001001010000000000000000000   binary
grouped:  1100 0001 0010 1000 0000 0000 0000 0000   binary
hex:      C    1    2    8    0    0    0    0      hex
or, C128 0000 hex
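The hand-worked result can be cross-checked against what a machine actually stores. The sketch below uses Python's standard struct module; ">f" requests a big-endian 32-bit float so the bytes come out in the same high-to-low order as the template. The variable names are chosen just for this example.

```python
import struct

packed = struct.pack(">f", -10.5)       # 4 bytes, high-order byte first
print(packed.hex().upper())             # prints C1280000

# Rebuild the 32-bit string and group it in 4's, as in the table above.
bits = format(int.from_bytes(packed, "big"), "032b")
grouped = " ".join(bits[i:i + 4] for i in range(0, 32, 4))
print(grouped)                          # prints 1100 0001 0010 1000 0000 0000 0000 0000
```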

II. Double. (64 bits)
The process to convert to "double" is similar to "float":
1. A total of 64 bits are used to represent the number.
2. The sign bit is still one bit, always the high-order bit.
3. The mantissa is stored with 52 bits instead of 23. This gives doubles greater precision.
4. The exponent is stored with 11 bits instead of 8, in excess-1023 notation instead of excess-127. This gives doubles a wider range of representable numbers.

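Example: Decimal +10.5 = +1.0101 * 2^(+3) (binary). The sign bit is 0; the exponent is 1023 + (+3) = 1026 decimal = 10000000010 binary; the mantissa is 0101 followed by 48 zeros.
Result: 0100 0000 0010 0101 0000 ... 0000 binary (64 bits) = 4025 0000 0000 0000 hex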
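The double example can be checked the same way. This sketch again relies on the standard struct module (">d" is a big-endian 64-bit double) and pulls the three fields apart with shifts and masks. The bias of 1023 used below is the value for the 64-bit format; the variable names are illustrative only.

```python
import struct

packed = struct.pack(">d", 10.5)             # 8 bytes, high-order byte first
print(packed.hex().upper())                  # prints 4025000000000000

bits = int.from_bytes(packed, "big")
sign = bits >> 63                            # 1 sign bit
exponent = (bits >> 52) & 0x7FF              # 11 exponent bits
mantissa = bits & ((1 << 52) - 1)            # 52 mantissa bits

# Subtracting the bias recovers the true exponent, +3.
print(sign, exponent - 1023, format(mantissa, "052b")[:8] + "...")
```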
