floating point number


Upload: khan-raqib-mahmud

Post on 08-Dec-2015


DESCRIPTION

floating point number lectures

TRANSCRIPT

Page 1: Floating Point Number

Floating Point Numbers

In the decimal system, a decimal point (radix point) separates the whole-number part from the fractional part.

Examples:

37.25 (whole = 37, fraction = .25)

123.567

10.12345678

Page 2: Floating Point Number

Floating Point Numbers

For example, 37.25 can be analyzed as:

Power of ten:   10^1    10^0    10^-1    10^-2
Column name:    Tens    Units   Tenths   Hundredths
Digit:          3       7       2        5

37.25 = 3 x 10 + 7 x 1 + 2 x 1/10 + 5 x 1/100
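The positional breakdown above can be checked with a short sketch. The function name `positional_value` is an assumption made for illustration; it uses exact fractions so that no rounding hides the arithmetic, and it accepts a `base` argument so the same idea carries over to binary.

```python
from fractions import Fraction

def positional_value(digits, base=10):
    """Sum digit * base**position for a digit string such as '37.25'."""
    whole, _, frac = digits.partition(".")
    total = Fraction(0)
    # Whole-part digits carry exponents len(whole)-1 down to 0.
    for i, d in enumerate(whole):
        total += int(d, base) * Fraction(base) ** (len(whole) - 1 - i)
    # Fractional digits carry exponents -1, -2, ...
    for i, d in enumerate(frac, start=1):
        total += int(d, base) * Fraction(base) ** -i
    return total

print(float(positional_value("37.25")))              # 3*10 + 7*1 + 2/10 + 5/100
print(float(positional_value("100101.01", base=2)))  # same value read in base 2
```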

Page 3: Floating Point Number

Binary Equivalent

The binary equivalent of a floating point number can be found by computing the binary representation of each part separately.

Whole part: subtraction or division method
Fractional part: subtraction or multiplication method

Page 4: Floating Point Number

Binary Equivalent

In the binary representation of a floating point number, the column values are as follows:

… 2^6  2^5  2^4  2^3  2^2  2^1  2^0 . 2^-1  2^-2  2^-3   2^-4 …
… 64   32   16   8    4    2    1   . 1/2   1/4   1/8    1/16 …
… 64   32   16   8    4    2    1   . .5    .25   .125   .0625 …

 

Page 5: Floating Point Number

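Finding Binary Equivalent of fraction part

Converting .25 using the multiplication method.

Step 1: Multiply the fraction by 2 until the fraction becomes 0, carrying only the fractional part into the next multiplication.

.25 x 2 = 0.5   (whole part 0)
.5  x 2 = 1.0   (whole part 1)

Step 2: Collect the whole parts, in the order they were produced, and place them after the radix point.

Column values:    .5   .25   .125   .0625
Bits:           . 0    1

.25_10 = .01_2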
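The multiplication method can be sketched in a few lines of Python. The name `fraction_to_binary_mul` and the `max_bits` cutoff are assumptions added here; the cutoff is needed because some fractions (0.1, for example) never reach 0.

```python
from fractions import Fraction

def fraction_to_binary_mul(frac, max_bits=32):
    """Binary digits of 0 <= frac < 1 by repeated multiplication by 2."""
    frac = Fraction(frac)            # exact arithmetic, no rounding
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        whole = int(frac)            # the whole part (0 or 1) is the next bit
        bits.append(str(whole))
        frac -= whole                # keep only the fractional part
    return "." + "".join(bits)

print(fraction_to_binary_mul("0.25"))     # .01
print(fraction_to_binary_mul("0.15625"))  # .00101
```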

Page 6: Floating Point Number

Finding Binary Equivalent of fraction part

Converting .25 using the subtraction method.

Step 1: Write the positional powers of two and the column values for the fractional part.

. 2^-1   2^-2   2^-3   2^-4    2^-5
. 1/2    1/4    1/8    1/16    1/32
. .5     .25    .125   .0625   .03125

Page 7: Floating Point Number

Finding Binary Equivalent of fraction part

Converting .25 using the subtraction method.

Step 2: Subtract the column values from left to right. Place a 1 if the column value can be subtracted from the remaining fraction, or a 0 if it cannot, until the fraction becomes .0.

Column values:    .5   .25   .125   .0625
Bits:           . 0    1

.5 cannot be subtracted from .25, so the first bit is 0.
.25 - .25 = .0, so the second bit is 1.
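A minimal sketch of the subtraction method follows, for comparison with the multiplication method. The name `fraction_to_binary_sub` and the `max_bits` limit are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction

def fraction_to_binary_sub(frac, max_bits=32):
    """Binary digits of 0 <= frac < 1 by subtracting column values 1/2, 1/4, ..."""
    frac = Fraction(frac)
    column = Fraction(1, 2)          # leftmost column value, 2^-1
    bits = []
    while frac != 0 and len(bits) < max_bits:
        if column <= frac:           # the column value can be subtracted
            bits.append("1")
            frac -= column
        else:                        # it cannot, so this column holds a 0
            bits.append("0")
        column /= 2                  # move one column to the right
    return "." + "".join(bits)

print(fraction_to_binary_sub("0.25"))  # .01
```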

Page 8: Floating Point Number

Binary Equivalent of FP number

Given 37.25, convert 37 and .25 using the subtraction method.

Powers of two:  2^6  2^5  2^4  2^3  2^2  2^1  2^0 . 2^-1  2^-2  2^-3  2^-4
Column values:  64   32   16   8    4    2    1   . .5    .25   .125  .0625
Bits:                1    0    0    1    0    1   . 0     1

Whole part:        Fractional part:
  37                 .25
- 32               - .25
   5                 .0
-  4
   1
-  1
   0

37.25_10 = 100101.01_2
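As a cross-check on the worked example, here is a sketch that converts both parts of a non-negative number. It uses the division method for the whole part and the multiplication method for the fraction, rather than the subtraction method shown on the slide. The name `to_binary` and the `max_frac_bits` limit are assumptions.

```python
from fractions import Fraction

def to_binary(value, max_frac_bits=32):
    """Binary string for a non-negative value such as '37.25'."""
    value = Fraction(value)
    whole = int(value)
    frac = value - whole
    # Whole part: divide by 2 repeatedly; remainders are the bits, last one first.
    whole_bits = ""
    while whole > 0:
        whole, remainder = divmod(whole, 2)
        whole_bits = str(remainder) + whole_bits
    # Fractional part: multiply by 2 repeatedly; whole parts are the bits.
    frac_bits = ""
    while frac != 0 and len(frac_bits) < max_frac_bits:
        frac *= 2
        bit = int(frac)
        frac_bits += str(bit)
        frac -= bit
    return (whole_bits or "0") + "." + (frac_bits or "0")

print(to_binary("37.25"))   # 100101.01
print(to_binary("7.625"))   # 111.101
print(to_binary("0.3125"))  # 0.0101
```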

Page 9: Floating Point Number

So what is the Problem?

Given the following binary representations:

37.25_10 = 100101.01_2

7.625_10 = 111.101_2

0.3125_10 = 0.0101_2

How can we represent the whole and fractional parts of the binary representation in 4 bytes?

Page 10: Floating Point Number

Solution is Normalization

Every binary number, except the one corresponding to the number zero, can be normalized by choosing the exponent so that the radix point falls to the right of the leftmost 1 bit.

37.25_10 = 100101.01_2 = 1.0010101 x 2^5

7.625_10 = 111.101_2 = 1.11101 x 2^2

0.3125_10 = 0.0101_2 = 1.01 x 2^-2
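Normalization amounts to locating the leftmost 1 and counting how far the radix point has to move. The sketch below works directly on binary strings; the function name `normalize` and its string-based interface are assumptions chosen to mirror the notation on the slide.

```python
def normalize(binary):
    """Return (significand, exponent) for a positive binary string like '100101.01'."""
    whole, _, frac = binary.partition(".")
    digits = whole + frac
    first_one = digits.index("1")             # raises ValueError for zero
    # Places the radix point moves: positive means it moved left.
    exponent = len(whole) - 1 - first_one
    mantissa = digits[first_one + 1:].rstrip("0")
    return "1." + (mantissa or "0"), exponent

print(normalize("100101.01"))  # ('1.0010101', 5)
print(normalize("111.101"))    # ('1.11101', 2)
print(normalize("0.0101"))     # ('1.01', -2)
```

Zero has no leftmost 1, which is why the slide excludes it; the sketch signals that case with the `ValueError` raised by `str.index`.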

Page 11: Floating Point Number

So what Happened?

After normalizing, the numbers have different mantissas and exponents.

37.25_10 = 100101.01_2 = 1.0010101 x 2^5

7.625_10 = 111.101_2 = 1.11101 x 2^2

0.3125_10 = 0.0101_2 = 1.01 x 2^-2

Page 12: Floating Point Number

IEEE Floating Point Representation

Floating point numbers can be represented by binary codes by dividing them into three parts: the sign, the exponent, and the mantissa.

Bit 1        Bits 2-9     Bits 10-32
Sign         Exponent     Mantissa
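To see the three fields of a real 32-bit value, the standard `struct` module can reinterpret a float's bytes as an unsigned integer. This is only a sketch; the helper name `float_to_fields` is an assumption.

```python
import struct

def float_to_fields(x):
    """Split the single-precision encoding of x into sign, exponent, mantissa strings."""
    # Pack as a big-endian 32-bit float, then read the same bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = f"{raw:032b}"
    return bits[0], bits[1:9], bits[9:]   # bit 1, bits 2-9, bits 10-32

sign, exponent, mantissa = float_to_fields(37.25)
print(sign, exponent, mantissa)
```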

Page 13: Floating Point Number

IEEE Floating Point Representation

The first, or leftmost, field of our floating point representation is the sign bit: 0 for a positive number, 1 for a negative number.

Page 14: Floating Point Number

IEEE Floating Point Representation

The second field of the floating point number is the exponent. Since we must be able to represent both positive and negative exponents, we use a convention in which a value known as the bias, here 127, is added to the exponent before it is stored. An exponent of 5 is therefore stored as 127 + 5 = 132; an exponent of -5 is stored as 127 + (-5) = 122.

The biased exponent, the value actually stored, ranges from 0 through 255. This is the range of values that can be represented by 8-bit unsigned binary numbers. In the IEEE standard the stored values 0 and 255 are reserved for special cases (zero, denormalized numbers, infinity, and NaN), so normalized numbers use 1 through 254.
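The biasing rule is simple enough to express as two small helpers. The names `bias_exponent` and `unbias_exponent` are hypothetical, and the range check only enforces the 8-bit limit described above.

```python
BIAS = 127  # bias for the 8-bit exponent field

def bias_exponent(exponent):
    """Return the 8-bit stored field for a true exponent."""
    stored = exponent + BIAS
    if not 0 <= stored <= 255:
        raise ValueError("exponent does not fit in 8 bits")
    return f"{stored:08b}"

def unbias_exponent(field):
    """Return the true exponent for an 8-bit stored field."""
    return int(field, 2) - BIAS

print(bias_exponent(5))             # 10000100  (132)
print(bias_exponent(-5))            # 01111010  (122)
print(unbias_exponent("01111101"))  # -2
```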

Page 15: Floating Point Number

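IEEE Floating Point Representation

The mantissa is the set of 0's and 1's to the right of the radix point of the normalized binary number (normalized means the digit to the left of the radix point is 1). Ex: in 1.00101 x 2^3, the mantissa is 00101.

The mantissa is stored in a 23-bit field; the leading 1 is not stored, and unused positions on the right are filled with 0's.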

Page 16: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Example: Find the IEEE FP representation of 40.15625.

Step 1. Compute the binary equivalent of the whole part and the fractional part (convert 40 and .15625 to their binary equivalents).

 

Page 17: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Whole part:          Fractional part:
  40                   .15625
- 32                 - .12500
   8                   .03125
-  8                 - .03125
   0                   .0
Result: 101000       Result: .00101

So: 40.15625_10 = 101000.00101_2

 

Page 18: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Step 2. Normalize the number by moving the radix point to the right of the leftmost 1.

101000.00101 = 1.0100000101 x 2^5

 

Page 19: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Step 3. Convert the exponent to a biased exponent.

127 + 5 = 132

==> 132_10 = 10000100_2

 

Page 20: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Step 4. Store the results from above.

Sign    Exponent (from step 3)    Mantissa (from step 2)
0       10000100                  01000001010..0
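One way to check the hand-built result is to assemble the three fields into a 32-bit word and let the machine reinterpret it as a float. This is a sketch using Python's `struct` module; the variable names are illustrative.

```python
import struct

sign, exponent, mantissa = "0", "10000100", "0100000101"
mantissa = mantissa.ljust(23, "0")        # right-pad the mantissa to 23 bits
word = int(sign + exponent + mantissa, 2)

# Reinterpret the 32-bit word as a single-precision float.
(value,) = struct.unpack(">f", struct.pack(">I", word))
print(hex(word), value)                   # 0x4220a000 40.15625
```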

 

Page 21: Floating Point Number

Convert 40.15625 to IEEE 32-bit format

Step 1a. Find the binary value of the whole number.

  40 - 32 - 8 = 0
  Column values:  32  16  8  4  2  1
  Bits:           1   0   1  0  0  0

Step 1b. Find the binary value of the fraction.

  .15625 - .125 - .03125 = 0
  Column values:  .5  .25  .125  .0625  .03125
  Bits:           0   0    1     0      1

  Result: 101000.00101

Step 2a. Move the radix point to the right of the leftmost 1. The number of places it moves (5) is the exponent.

Step 2b. Normalize by writing the number as 1.mantissa x 2^exponent.

  101000.00101 = 1.0100000101 x 2^5

Step 3a. Convert the exponent to a biased exponent by adding the bias of 127.

  127 + 5 = 132_10

Step 3b. Convert the biased exponent to its binary value.

  132 - 128 - 4 = 0
  Column values:  128  64  32  16  8  4  2  1
  Bits:           1    0   0   0   0  1  0  0

Step 4a. Piece together the values.

  Sign    Exponent    Mantissa
  0       10000100    0100000101

Step 4b. Move the digits after the radix point into the mantissa field, without the leading 1.

Step 4c. Left-pad the exponent to 8 bits if necessary and right-pad the mantissa to 23 bits to obtain the IEEE binary equivalent.

  Sign    Exponent    Mantissa
  0       10000100    01000001010…0

Page 22: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Ex: Find the IEEE FP representation of -24.75.

Step 1. Compute the binary equivalent of the whole part and the fractional part.

Whole part:         Fractional part:
  24                  .75
- 16                - .50
   8                  .25
-  8                - .25
   0                  .0
Result: 11000       Result: .11

So: -24.75_10 = -11000.11_2

 

Page 23: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Step 2. Normalize the number by moving the radix point to the right of the leftmost 1.

-11000.11 = -1.100011 x 2^4

 

Page 24: Floating Point Number

Converting decimal floating point values to stored IEEE standard values.

Step 3. Convert the exponent to a biased exponent.

127 + 4 = 131

==> 131_10 = 10000011_2

Step 4. Store the results from above.

Sign    Exponent    Mantissa
1       10000011    1000110..0
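The four steps can be combined into one routine. The sketch below is an illustration under stated assumptions, not a full IEEE encoder: the name `encode_single` is hypothetical, the value must be nonzero and within the normalized range, and extra mantissa bits are truncated rather than rounded. The last lines compare the result with the machine's own encoding via `struct`.

```python
from fractions import Fraction
import struct

def encode_single(value):
    """Sign, exponent, mantissa bit strings for a nonzero value (truncating, no rounding)."""
    value = Fraction(value)
    if value == 0:
        raise ValueError("zero cannot be normalized")
    sign = "1" if value < 0 else "0"
    magnitude = abs(value)
    exponent = 0
    while magnitude >= 2:            # normalize: scale into the range [1, 2)
        magnitude /= 2
        exponent += 1
    while magnitude < 1:
        magnitude *= 2
        exponent -= 1
    frac = magnitude - 1             # drop the leading 1, which is not stored
    mantissa = ""
    for _ in range(23):              # multiplication method, 23 bits
        frac *= 2
        bit = int(frac)
        mantissa += str(bit)
        frac -= bit
    return sign, f"{exponent + 127:08b}", mantissa

fields = encode_single("-24.75")
(word,) = struct.unpack(">I", struct.pack(">f", -24.75))
print(fields)
print("".join(fields) == f"{word:032b}")   # True
```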

Page 25: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Do the steps in reverse order.

In reversing the normalization step, move the radix point the number of digits equal to the exponent: to the right if the exponent is positive, to the left if it is negative.

Page 26: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Ex: Convert the following 32-bit binary numbers to their decimal floating point equivalents.

a.  Sign    Exponent    Mantissa
    1       01111101    010..0

Page 27: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Step 1. Extract the exponent (unbias the exponent).

biased exponent = 01111101_2 = 125

exponent: 125 - 127 = -2

Page 28: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Step 2. Write the normalized number in the form (sign) 1.mantissa x 2^exponent.

-1.01 x 2^-2

Here the mantissa is 01 and the exponent is -2.

Page 29: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Step 3: Write the binary number (denormalize the value from step 2).

-0.0101_2

Step 4: Convert the binary number to its FP equivalent (add the column values).

-0.0101_2 = -(0.25 + 0.0625) = -0.3125_10
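The reverse conversion can be sketched numerically. The function name `decode_fields` is an assumption, and the sketch handles normalized numbers only (it ignores the reserved exponent values). Trailing zeros of the mantissa may be omitted because they do not change the value.

```python
def decode_fields(sign, exponent, mantissa):
    """Value of a normalized single-precision number from its three bit strings."""
    e = int(exponent, 2) - 127                                # Step 1: unbias the exponent
    significand = 1 + int(mantissa, 2) / 2 ** len(mantissa)   # Step 2: 1.mantissa
    value = significand * 2.0 ** e                            # Steps 3-4: apply the exponent
    return -value if sign == "1" else value

print(decode_fields("1", "01111101", "01"))   # -0.3125
```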

Page 30: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Ex: Convert the following 32-bit binary number to its decimal floating point equivalent.

Sign    Exponent    Mantissa
0       10000011    1101010..0

 

Page 31: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Step 1. Extract the exponent (unbias the exponent).

biased exponent = 10000011_2 = 131

exponent: 131 - 127 = 4

Page 32: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Step 2. Write the normalized number in the form (sign) 1.mantissa x 2^exponent.

1.110101 x 2^4

Here the mantissa is 110101 and the exponent is 4.

Page 33: Floating Point Number

Converting from IEEE format to the decimal floating point values.

Step 3. Write the binary number (denormalize the value from step 2).

11101.01_2

Step 4. Convert the binary number to its FP equivalent (add the column values).

11101.01_2 = 16 + 8 + 4 + 1 + 0.25 = 29.25_10
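The denormalizing step in both examples is a shift of the radix point within the digit string. A possible sketch follows; the name `denormalize` and its string interface are assumptions. It pads with zeros on either side when the point moves past the available digits.

```python
def denormalize(significand, exponent):
    """Shift the radix point of a string like '1.110101' by `exponent` places."""
    whole, _, frac = significand.partition(".")
    digits = whole + frac
    point = len(whole) + exponent             # new index of the radix point
    if point <= 0:
        # Point moves left past the first digit: pad with zeros, keep a leading 0.
        digits = "0" * (1 - point) + digits
        point = 1
    elif point > len(digits):
        # Point moves right past the last digit: pad with zeros on the right.
        digits = digits.ljust(point, "0")
    return digits[:point] + "." + (digits[point:] or "0")

print(denormalize("1.110101", 4))   # 11101.01
print(denormalize("1.01", -2))      # 0.0101
```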

Page 34: Floating Point Number

Proof your work

Convert 0 10000100 01000001010…0 back from IEEE format to decimal.

Sign    Exponent    Mantissa
0       10000100    01000001010…0

Step 1a. Determine the decimal value of the binary exponent field.

  Column values:  128  64  32  16  8  4  2  1
  Bits:           1    0   0   0   0  1  0  0
  128 + 4 = 132

Step 1b. Unbias the number by subtracting 127 from the decimal value to determine the exponent.

  132 - 127 = 5

Step 2. Set up the normalized format 1.mantissa x 2^exponent.

  1.0100000101 x 2^5

Step 3a. Denormalize by moving the radix point 5 places to the right, as given by the exponent.

  101000.00101_2

Step 3b. Convert the binary number to its FP equivalent (steps 4a to 4c).

Step 4a. Find the value of the whole-number part.

  Column values:  32  16  8  4  2  1
  Bits:           1   0   1  0  0  0
  32 + 8 = 40

Step 4b. Find the value of the fractional part.

  Column values:  .5  .25  .125  .0625  .03125
  Bits:           0   0    1     0      1
  .125 + .03125 = .15625

Step 4c. Add the values together. Make sure to include the sign if the value is negative.

  40 + .15625 = 40.15625