data representation - college of engineering - oregon...

Data Representation

January 9–14, 2013

1 / 40

Quick logistical notes

In-class exercises
Bring paper and pencil (or a laptop) to each lecture!

Goals:
• break up lectures, keep you engaged
• chance to work through problems in class
• ask questions!

First homework will be posted before Friday’s lecture!

2 / 40

Outline

Internal vs. external representations

Representing the natural numbers
• Binary number system
• Binary arithmetic
• Hexadecimal and base-N number systems

Fixed-size integer representations
• Representing negative numbers
• Big endian vs. little endian

3 / 40


Internal representation
How the data is actually represented in the computer hardware

External representation
How we interpret or conceptualize the internal representation

4 / 40

Internal representations

Usually two states, which we interpret as 0 and 1

Volatile representations:
• Capacitor (DRAM)
  • charged or not
• Flip-flop circuit (SRAM)
  • one of two output signals is high

Non-volatile representations:
• Region of a magnetized surface (hard disks, tape)
  • positive or negative
• Floating gate transistor (flash)
  • change in voltage
  • one cell can represent more than two states!
  • e.g. one 16-level cell ≈ four flip-flops

5 / 40

Interacting with the internal representation

Architecture provides an interface
• can interact with the internal representation
• using the abstraction of the external representation

Advantages:
• Don’t have to think about internal representation
• Architecture can be implemented by different hardware

6 / 40

Organization of the internal representation

Usually can’t refer to individual bits

• Internal representation organized into groups
• Through ISA, can read/write a group by an address

Addressable groups in MIPS
• byte = 8 bits
• word = 4 bytes = 32 bits
• (also halfword = 2 bytes = 16 bits)

7 / 40

External representations

Conceptually, view data as a sequence of 0s and 1s

The same data can be interpreted in different ways:

Example: 1111 0110

ö      extended ASCII character
246    unsigned integer
−10    signed 8-bit integer
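As a rough illustration, the Python sketch below reads the same byte in the three ways listed above. Using Latin-1 as the "extended ASCII" encoding is an assumption; the slide does not name a specific code page.

```python
# The same 8 bits (1111 0110), interpreted three ways.
raw = bytes([0b11110110])

as_char = raw.decode("latin-1")                      # assumed encoding
as_unsigned = int.from_bytes(raw, "big", signed=False)
as_signed = int.from_bytes(raw, "big", signed=True)  # two's complement

print(as_unsigned, as_signed)
```

The bits never change; only the interpretation applied to them does.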

8 / 40


Decimal number system (base 10)

How it works (positional number system):
• 10 digits, used in sequence
• each position corresponds to a power of 10
• sum of each digit multiplied by position value

Example: 2037

. . .   10^5      10^4     10^3   10^2   10^1   10^0
. . .   100,000   10,000   1000   100    10     1
        (0)       (0)      2      0      3      7

2 · 1000 + 0 · 100 + 3 · 10 + 7 · 1
= 2000 + 0 + 30 + 7 = 2037

10 / 40

Binary number system (base 2)

Works the same way!
• 2 digits (0 and 1), used in sequence; a binary digit is called a bit
• each position corresponds to a power of 2
• sum of each bit multiplied by position value

Example: 110101

. . .   2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
. . .   128   64    32    16    8     4     2     1
        (0)   (0)   1     1     0     1     0     1

1 · 32 + 1 · 16 + 0 · 8 + 1 · 4 + 0 · 2 + 1 · 1
= 32 + 16 + 0 + 4 + 0 + 1 = 53

11 / 40

Converting from binary to decimal

Very easy:
• Since binary is just 0s and 1s, no need to multiply
• Just add up the position values of the 1 bits

Example: 1011 0010

. . .   2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
. . .   128   64    32    16    8     4     2     1
        1     0     1     1     0     0     1     0

128 + 32 + 16 + 2 = 178
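The rule above can be sketched in a few lines of Python. The function name `binary_to_decimal` is made up for this illustration.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the position values of the 1 bits (spaces are ignored)."""
    bits = bits.replace(" ", "")
    total = 0
    # Position 0 is the rightmost bit, so walk the string in reverse.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total

print(binary_to_decimal("1011 0010"))
```

Python's built-in `int("10110010", 2)` does the same job; the loop is spelled out only to mirror the slide.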

12 / 40

Converting from decimal to binary

Method 1: Subtracting powers of 2
For each position p from left to right
• If 2^p ≤ n, subtract 2^p from n and write 1
• Otherwise, write 0

Example: 157
157 − 128 = 29    1 for 128’s position
 29 −  16 = 13    0 for 64, 0 for 32, 1 for 16
 13 −   8 =  5    1 for 8
  5 −   4 =  1    1 for 4
  1 −   1 =  0    0 for 2, 1 for 1

Result: 1001 1101
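A minimal Python sketch of Method 1 follows. The function name and the `width` parameter are assumptions for illustration, and the input is assumed to satisfy 0 ≤ n < 2^width.

```python
def to_binary_by_subtraction(n: int, width: int = 8) -> str:
    """Walk positions left to right, subtracting each power of 2 that fits."""
    bits = []
    for p in range(width - 1, -1, -1):
        if 2 ** p <= n:
            n -= 2 ** p
            bits.append("1")
        else:
            bits.append("0")
    return "".join(bits)

print(to_binary_by_subtraction(157))
```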

13 / 40

Converting from decimal to binary

Method 2: Successive division by 2
• Divide by 2 until you reach 0, keeping track of remainders
• Write the remainders, from last to first

Example: 157
157 ÷ 2 = 78 R 1
 78 ÷ 2 = 39 R 0
 39 ÷ 2 = 19 R 1
 19 ÷ 2 =  9 R 1
  9 ÷ 2 =  4 R 1
  4 ÷ 2 =  2 R 0
  2 ÷ 2 =  1 R 0
  1 ÷ 2 =  0 R 1

Result: 1001 1101
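Method 2 might look like this in Python. The function name is hypothetical and the input is assumed to be a non-negative integer.

```python
def to_binary_by_division(n: int) -> str:
    """Divide by 2 until reaching 0, then read the remainders last to first."""
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)
        remainders.append(str(r))
    # An empty list means n was 0 to begin with.
    return "".join(reversed(remainders)) or "0"

print(to_binary_by_division(157))
```

Unlike the subtraction method, this version needs no fixed width up front; it yields exactly as many bits as the number requires.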

14 / 40

In class exercises

Convert from binary to decimal:
• 0010 1010
• 1001 0101

Convert from decimal to binary:
• 169, by subtracting powers of 2
• 84, by successive division by 2

15 / 40

Binary addition

Just like adding decimal numbers!

To add two binary numbers
• Pairwise add each bit, starting from the right
• 0 + 0 = 0 and 0 + 1 = 1
• On 1 + 1, write 0 and carry a 1 to the left

Example: 0110 + 0011

    11       (carries)
    0110
  + 0011
  ------
    1001

Example: 0011 + 0011

     11      (carries)
    0011
  + 0011
  ------
    0110
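The column-by-column procedure can be sketched as follows. This is an illustrative helper working on bit strings, not a claim about how hardware adders are built.

```python
def add_binary(a: str, b: str) -> str:
    """Add two bit strings column by column, starting from the right."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))   # bit written in this column
        carry = total // 2              # bit carried to the left
    if carry:
        result.append("1")              # arbitrary precision: keep the carry
    return "".join(reversed(result))

print(add_binary("0110", "0011"))
```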

16 / 40

Binary multiplication

Same algorithm as decimal (only easier)

To multiply two binary numbers A and B
1. For each bit b in B:
   • Multiply b × A, aligning the result with b
     (since b is 0 or 1, each step yields 0 or A!)
2. Sum the results

Example: 1101 × 1101

1. Partial products

         1101
       × 1101
       ------
         1101
        0000
       1101
      1101

2. Sum

    11111       (carries)
        1101
       00000
      110100
   + 1101000
   ---------
    10101001

Often easiest to add results two at a time
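A small sketch of the shift-and-add idea above, assuming bit-string inputs. The name `multiply_binary` is hypothetical.

```python
def multiply_binary(a: str, b: str) -> str:
    """For each 1 bit in b, add a copy of a shifted to align with that bit."""
    a_value = int(a, 2)
    total = 0
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            total += a_value << shift   # partial product: A aligned with b
        # a 0 bit contributes nothing
    return format(total, "b")

print(multiply_binary("1101", "1101"))
```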

17 / 40

Special case: multiplying by a power of 2

Super easy, just like multiplying by powers of 10 in decimal

To multiply a binary number by 2^p
• Add p 0s on the right

Examples
• 100 × 1101 = 110100
• 1010 × 1000 = 1010000
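In code, appending p zeros corresponds to a left shift by p. A brief sketch with a made-up helper name:

```python
def times_power_of_two(bits: str, p: int) -> str:
    """Multiply a bit string by 2**p by appending p zeros."""
    return bits + "0" * p

# Appending zeros agrees with the integer left-shift operator.
assert int(times_power_of_two("1101", 2), 2) == 0b1101 << 2
print(times_power_of_two("1101", 2))
```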

18 / 40

Hexadecimal number system (base 16)

Very useful for representing binary data concisely!

• 16 digits: 0–9, A, B, C, D, E, F
• each position corresponds to a power of 16
• usually prefixed with 0x

Each hex digit corresponds to 4 bits

0 0000    4 0100    8 1000    C 1100
1 0001    5 0101    9 1001    D 1101
2 0010    6 0110    A 1010    E 1110
3 0011    7 0111    B 1011    F 1111

One byte = 2 hex digits

19 / 40

Converting hexadecimal⇔ binary

Each hex digit corresponds to 4 bits

0 0000    4 0100    8 1000    C 1100
1 0001    5 0101    9 1001    D 1101
2 0010    6 0110    A 1010    E 1110
3 0011    7 0111    B 1011    F 1111

Examples
• 0xA4F7 = 1010 0100 1111 0111
• 0x0B60 = 0000 1011 0110 0000

We will be doing this a lot this quarter. :)
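For practice, the digit-by-digit conversion can be checked with a short Python sketch. Both function names are invented here, and the binary input is assumed to be written in space-separated groups of 4 bits.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Expand each hex digit into its 4-bit group."""
    return " ".join(format(int(d, 16), "04b") for d in hex_digits)

def binary_to_hex(bits: str) -> str:
    """Collapse each space-separated 4-bit group into one hex digit."""
    return "".join(format(int(group, 2), "X") for group in bits.split())

print(hex_to_binary("A4F7"))
print(binary_to_hex("0000 1011 0110 0000"))
```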

20 / 40

Converting hexadecimal⇔ decimal

Two strategies:
• Convert directly
• Convert hexadecimal ⇔ binary ⇔ decimal

Example: 0xB6A4 (direct conversion)

. . .   16^4     16^3    16^2   16^1   16^0
. . .   65,536   4,096   256    16     1
        (0)      B       6      A      4

11 · 4096 + 6 · 256 + 10 · 16 + 4 · 1
= 45,056 + 1,536 + 160 + 4 = 46,756
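The direct conversion is the same positional sum used for decimal and binary, just with a different base. A sketch with a hypothetical helper that handles bases up to 16:

```python
DIGITS = "0123456789ABCDEF"

def to_decimal(digits: str, base: int) -> int:
    """Sum each digit's value times the value of its position."""
    total = 0
    for position, digit in enumerate(reversed(digits.upper())):
        total += DIGITS.index(digit) * base ** position
    return total

print(to_decimal("B6A4", 16))
```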

21 / 40

Representation in other bases

In general, we can represent numbers in any base

Some other significant bases:

• Base 8 (octal)
  • each octal digit is equivalent to three bits
    (000 = 0_8, 001 = 1_8, 010 = 2_8, . . . , 111 = 7_8)
  • useful in old architectures with 12, 24, 36 bit words
  • support in C and many assembly languages
    (071 = 71_8 = 57_10)

• Base 64 (0–9, A–Z, a–z, +, /)
  • each base-64 digit is equivalent to six bits
  • used in MIME to transmit binary data in plain ASCII text

22 / 40

In class exercises

Add in binary:
• 100 1100 + 1110 1111

Multiply in binary:
• 1011 × 101

Add in hexadecimal:
• 0x28 + 0x4A

0 0000    4 0100    8 1000    C 1100
1 0001    5 0101    9 1001    D 1101
2 0010    6 0110    A 1010    E 1110
3 0011    7 0111    B 1011    F 1111

23 / 40


Arbitrary vs. fixed precision

So far, we have been assuming arbitrary precision
• to represent a bigger number, just add more bits/digits!

In practice, integers have a fixed size
• commonly 32 or 64 bits
• based on register size of the architecture

This is significant for two reasons:
• risk of overflow
• representation of negative numbers

25 / 40

Representing negative numbers

Must first specify the fixed size of the integer!
• With n bits, we can represent 2^n different values
• Idea: split space so half the values represent negatives

Sign and magnitude representation
• First bit represents the sign (0 positive, 1 negative)
• Rest of bits represent the magnitude, that is |x|

Suppose 4-bit integers
• Examples: −1 = 1001    −4 = 1100    −7 = 1111

This is exactly the representation you’re used to in decimal!

26 / 40

Problems with sign and magnitude representation

This turns out to not be a very good representation . . . why?

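Issue 1: Multiple zeros
• Both 0000 and 1000 represent the same value
• This is strange and requires extra effort

Issue 2: Complicated arithmetic
Simple binary addition doesn’t work

  0 010        1 010        0 010
+ 0 011      + 1 011      + 1 011
-------      -------      -------
  0 101 ✓      1 101 ✓      1 001 ✗

(✓: adding the magnitudes gives the right result. ✗: the correct result, 1 001 = −1, cannot be obtained by adding the magnitudes, since 010 + 011 = 101.)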

27 / 40

One’s complement representation

One’s complement
To represent a negative number x:
1. start with the fixed-size binary representation of |x|
2. invert every bit

Features:
• Binary addition is simple (wrap-around carry)
• Still two zeros (all 0s and all 1s)

Examples
• −2    1. 0010    2. 1101
• −3    1. 0011    2. 1100
• −5    1. 0101    2. 1010

28 / 40

One’s complement addition

Overflow carries “wrap around” (added on the right)

Example: −2 + −3 = −5

   11        (carries)
    1101     (−2)
  + 1100     (−3)
  ------
    1001     (with a carry out of 1)
  +    1     (carry wrapped around)
  ------
    1010     (−5)
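The wrap-around carry can be sketched as below. The function name and the default 4-bit width are assumptions for illustration; the inputs are bit patterns given as non-negative integers.

```python
def ones_complement_add(a: int, b: int, width: int = 4) -> int:
    """Add two one's complement bit patterns with wrap-around carry."""
    mask = (1 << width) - 1
    total = a + b
    if total > mask:                 # a carry left the top bit
        total = (total & mask) + 1   # add it back in on the right
    return total & mask

print(format(ones_complement_add(0b1101, 0b1100), "04b"))
```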

29 / 40

Two’s complement representation

Two’s complement
To represent a negative number x:
1. start with the fixed-size binary representation of |x|
2. invert every bit
3. add 1 to the result

Suppose 4-bit integers:

Examples
• −1    1. 0001    2. 1110    3. 1111
• −4    1. 0100    2. 1011    3. 1100
• −7    1. 0111    2. 1000    3. 1001
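The three steps translate almost line for line into Python. This is a sketch with a hypothetical function name, assuming x is negative and fits in the given width.

```python
def twos_complement(x: int, width: int = 4) -> str:
    """Bit pattern of a negative x, following the three steps above."""
    mask = (1 << width) - 1
    magnitude = abs(x)                # 1. fixed-size representation of |x|
    inverted = ~magnitude & mask      # 2. invert every bit
    result = (inverted + 1) & mask    # 3. add 1 to the result
    return format(result, f"0{width}b")

for value in (-1, -4, -7):
    print(value, twos_complement(value))
```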

30 / 40

Features of two’s complement representation

Range of expressible values with n bits
• max: 2^(n−1) − 1    (0 followed by all 1s)
• min: −2^(n−1)       (1 followed by all 0s)

Fixes the issues with sign and magnitude:
• Only one zero! (all 0s)
• Binary arithmetic “just works” (discard carry out)

Examples:

2 + 3 = 5      −2 + −3 = −5    2 + −3 = −1

  0010           1110            0010
+ 0011         + 1101          + 1101
------         ------          ------
  0101 ✓         1011 ✓          1111 ✓
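To check examples like these by machine, one possible sketch adds the bit patterns, discards the carry out, and decodes the result. Both helper names are invented for this illustration.

```python
def add_twos_complement(a: int, b: int, width: int = 4) -> int:
    """Plain binary addition; masking discards any carry out."""
    mask = (1 << width) - 1
    return (a + b) & mask

def decode_twos_complement(pattern: int, width: int = 4) -> int:
    """Interpret a bit pattern as a signed two's complement value."""
    if pattern & (1 << (width - 1)):      # leftmost bit set: negative
        return pattern - (1 << width)
    return pattern

result = add_twos_complement(0b1110, 0b1101)   # -2 + -3
print(format(result, "04b"), decode_twos_complement(result))
```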

31 / 40

Sign extension

Change the size of an integer without changing its value
• if positive (left-most bit 0), pad left with 0s
• if negative (left-most bit 1), pad left with 1s

Works with both one’s and two’s complement representation

Example: Extending from 8 bits to 16 bits
• 1001 0110 ⇒ 1111 1111 1001 0110
• 0001 0011 ⇒ 0000 0000 0001 0011
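Sign extension amounts to repeating the leftmost bit. A minimal sketch on bit strings, with a made-up function name:

```python
def sign_extend(bits: str, new_width: int) -> str:
    """Pad on the left with copies of the sign (leftmost) bit."""
    bits = bits.replace(" ", "")
    return bits[0] * (new_width - len(bits)) + bits

print(sign_extend("1001 0110", 16))
print(sign_extend("0001 0011", 16))
```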

32 / 40

Carry out vs. overflow

Carry out: carry after most significant bit ⇒ discard, no error
Overflow: result is out of representable range ⇒ error!

Carry out ≠ overflow!

Carry out is a normal part of signed integer addition

Will get a carry out when adding:
• two negative numbers
• a negative and a positive, when the result is positive or zero

Just ignore it!

33 / 40

Two’s complement overflow detection

Overflow: result is out of representable range⇒ error!

When adding . . .
• two numbers with different signs
  • overflow can never occur!
• two numbers with the same sign
  • overflow occurs if the sign of the result differs from the sign of the operands
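The sign rule can be sketched as a check on the leftmost bits. The function name and 4-bit default are assumptions; real hardware typically derives the same flag from carry bits.

```python
def add_with_overflow_check(a: int, b: int, width: int = 4):
    """Return (result pattern, overflow flag) for two's complement addition."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    result = (a + b) & mask                       # discard the carry out
    same_sign = (a & sign_bit) == (b & sign_bit)
    # Overflow only if the operands share a sign and the result does not.
    overflow = same_sign and (result & sign_bit) != (a & sign_bit)
    return result, overflow

print(add_with_overflow_check(0b0111, 0b0001))   # 7 + 1 does not fit in 4 bits
```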

34 / 40

Trade-offs between representations of negatives

In modern architectures, two’s complement is used

• Simple arithmetic operations
• Only one zero
• Hard to read

35 / 40

Unsigned vs. signed integers

Can interpret the same n-bit data as either unsigned or signed

Unsigned integer
• Interpret as a non-negative number
• Range: 0 to 2^n − 1

Signed integer
• Interpret as two’s complement
• Range: −2^(n−1) to 2^(n−1) − 1

Only different when the leftmost bit is a 1!

36 / 40

Big endian vs. little endian

Order of the addressable components in a larger data type
• Usually, the order of bytes within a word

Big endian
Bytes ordered from most significant (left) to least (right)
• Example: 256 as a 16-bit halfword: 0x0100 (bytes 01 00)

Little endian
Bytes ordered from least significant (left) to most (right)
• Example: 256 as a 16-bit halfword: 0x0001 (bytes 00 01)

Big-endian is what we’ve been assuming so far!

37 / 40

Endian conversion

Converting from big endian to little endian
1. Separate the data into addressable components (bytes)
2. Write the components (not the bits!) in reverse order

Examples
• 0x12345678    1. 12 34 56 78    2. 0x78563412
• 0xE5AD5CCA    1. E5 AD 5C CA    2. 0xCA5CADE5

Same algorithm for converting from little to big!
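In Python, the byte reversal can be sketched with the standard `int.to_bytes` and `int.from_bytes` methods. The helper name and the 4-byte default are assumptions for illustration.

```python
def swap_endianness(value: int, num_bytes: int = 4) -> int:
    """Split into bytes and reread them in the opposite order."""
    as_bytes = value.to_bytes(num_bytes, "big")    # 1. separate into bytes
    return int.from_bytes(as_bytes, "little")      # 2. reverse their order

print(hex(swap_endianness(0x12345678)))
```

Applying the function twice returns the original value, which matches the remark that the same algorithm works in both directions.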

38 / 40

Which architectures are what endian?

Little-endian:
• x86, Atmel
• MIPS (MARS simulator)

Big-endian:
• Motorola 6800 and 68k

Bi-endian (configurable to be big or little):
• ARM, SPARC, PowerPC
• MIPS (specification)

http://en.wikipedia.org/wiki/Endianness

39 / 40


In class exercises

Assume 8-bit integers, addressable in 2-bit chunks

For each of the following numbers:
1. write in two’s complement binary form
2. convert to little endian

Numbers:
• −50
• −100

40 / 40
