data representation - college of engineering - oregon...
Data Representation
January 9–14, 2013
Quick logistical notes
In-class exercises
Bring paper and pencil (or a laptop) to each lecture!
Goals:
• break up lectures, keep you engaged
• chance to work through problems in class
• ask questions!
First homework will be posted before Friday’s lecture!
Outline
Internal vs. external representations
Representing the natural numbers
• Binary number system
• Binary arithmetic
• Hexadecimal and base-N number systems
Fixed-size integer representations
• Representing negative numbers
• Big endian vs. little endian
Internal vs. external representations
Internal representation: how the data is actually represented in the computer hardware
External representation: how we interpret or conceptualize the internal representation
Internal representations
Usually two states, which we interpret as 0 and 1
Volatile representations:
• Capacitor (DRAM)
  • charged or not
• Flip-flop circuit (SRAM)
  • one of two output signals is high
Non-volatile representations:
• Region of a magnetized surface (hard disks, tape)
  • positive or negative
• Floating gate transistor (flash)
  • change in voltage
  • one cell can represent more than two states!
  • e.g. one 16-level cell ≈ four flip-flops
Interacting with the internal representation
Architecture provides an interface
• can interact with the internal representation
• using the abstraction of the external representation
Advantages:
• Don’t have to think about internal representation
• Architecture can be implemented by different hardware
Organization of the internal representation
Usually can’t refer to individual bits
• Internal representation organized into groups
• Through ISA, can read/write a group by an address
Addressable groups in MIPS
• byte = 8 bits
• word = 4 bytes = 32 bits
• (also halfword = 2 bytes = 16 bits)
External representations
Conceptually, view data as a sequence of 0s and 1s
The same data can be interpreted in different ways:
Example: 1111 0110
• ö: extended ASCII character
• 246: unsigned integer
• −10: signed 8-bit integer
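The three readings of 1111 0110 can be checked with a minimal Python sketch. It assumes that "extended ASCII" here means the Latin-1 code page (an assumption, since the slide does not name one); the variable names are illustrative only.

```python
# One byte, three external interpretations of the same bits.
raw = bytes([0b11110110])  # the bit pattern 1111 0110

as_unsigned = int.from_bytes(raw, byteorder="big", signed=False)
as_signed = int.from_bytes(raw, byteorder="big", signed=True)  # two's complement
as_char = raw.decode("latin-1")  # assumption: Latin-1 as the "extended ASCII" table

print(as_unsigned, as_signed, hex(ord(as_char)))  # 246 -10 0xf6
```

The bits never change; only the decoding rule applied to them does.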
Decimal number system (base 10)
How it works (positional number system):
• 10 digits, used in sequence
• each position corresponds to a power of 10
• sum of each digit multiplied by position value
Example: 2037
Position:  ...  10^5     10^4    10^3  10^2  10^1  10^0
Value:     ...  100,000  10,000  1000  100   10    1
Digit:          (0)      (0)     2     0     3     7
2 · 1000 + 0 · 100 + 3 · 10 + 7 · 1
= 2000 + 0 + 30 + 7 = 2037
Binary number system (base 2)
Works the same way!
• 2 digits (0 and 1), used in sequence; a binary digit is called a bit
• each position corresponds to a power of 2
• sum of each bit multiplied by position value
Example: 110101
Position:  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
Value:     ...  128  64   32   16   8    4    2    1
Bit:            (0)  (0)  1    1    0    1    0    1
1 · 32 + 1 · 16 + 0 · 8 + 1 · 4 + 0 · 2 + 1 · 1
= 32 + 16 + 0 + 4 + 0 + 1 = 53
Converting from binary to decimal
Very easy:
• Since binary is just 0s and 1s, no need to multiply
• Just add up the position values of the 1 bits
Example: 1011 0010
Position:  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
Value:     ...  128  64   32   16   8    4    2    1
Bit:            1    0    1    1    0    0    1    0
128 + 32 + 16 + 2 = 178
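The "add up the position values of the 1 bits" rule can be sketched in a few lines of Python. The function name `binary_to_decimal` is made up for this illustration.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the position values of the 1 bits (spaces are ignored)."""
    digits = bits.replace(" ", "")
    total = 0
    # Walk from the rightmost bit, whose position value is 2**0.
    for position, bit in enumerate(reversed(digits)):
        if bit == "1":
            total += 2 ** position
    return total

print(binary_to_decimal("1011 0010"))  # 178
```

Python's built-in `int("10110010", 2)` gives the same result; the loop is spelled out only to mirror the manual method.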
Converting from decimal to binary
Method 1: Subtracting powers of 2
For each position p from left to right:
• If 2^p ≤ n, subtract 2^p from n and write 1
• Otherwise, write 0
Example: 157
157 − 128 = 29    1 for 128’s position
29 − 16 = 13      0 for 64, 0 for 32, 1 for 16
13 − 8 = 5        1 for 8
5 − 4 = 1         1 for 4
1 − 1 = 0         0 for 2, 1 for 1
Result: 1001 1101
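Method 1 can be sketched in Python as follows. The function name and the fixed `width` parameter are assumptions added for the example; the slide does not fix a width.

```python
def to_binary_by_subtraction(n: int, width: int = 8) -> str:
    """Convert n to binary by subtracting powers of 2, left to right."""
    if not 0 <= n < 2 ** width:
        raise ValueError("n does not fit in the requested width")
    bits = []
    for p in range(width - 1, -1, -1):  # positions from left to right
        if 2 ** p <= n:
            n -= 2 ** p
            bits.append("1")
        else:
            bits.append("0")
    return "".join(bits)

print(to_binary_by_subtraction(157))  # 10011101
```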
Converting from decimal to binary
Method 2: Successive division by 2
• Divide by 2 until you reach 0, keeping track of remainders
• Write the remainders, from last to first
Example: 157
157 ÷ 2 = 78 R 1
78 ÷ 2 = 39 R 0
39 ÷ 2 = 19 R 1
19 ÷ 2 = 9 R 1
9 ÷ 2 = 4 R 1
4 ÷ 2 = 2 R 0
2 ÷ 2 = 1 R 0
1 ÷ 2 = 0 R 1
Result: 1001 1101
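Method 2 maps directly onto a loop around `divmod`. This is a minimal sketch with a made-up function name; it assumes n is a non-negative integer.

```python
def to_binary_by_division(n: int) -> str:
    """Convert n to binary by successive division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)  # quotient and remainder in one step
        remainders.append(str(r))
    # The remainders are written from last to first.
    return "".join(reversed(remainders))

print(to_binary_by_division(157))  # 10011101
```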
In class exercises
Convert from binary to decimal:
• 0010 1010
• 1001 0101
Convert from decimal to binary:
• 169, by subtracting powers of 2
• 84, by successive division by 2
Binary addition
Just like adding decimal numbers!
To add two binary numbers:
• Pairwise add each bit, starting from the right
• 0 + 0 = 0 and 0 + 1 = 1
• On 1 + 1, write 0 and carry a 1 to the left
Example: 0110 + 0011
    11        (carries)
    0110
  + 0011
  ------
    1001
Example: 0011 + 0011
     11       (carries)
    0011
  + 0011
  ------
    0110
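The bit-by-bit procedure with carries can be sketched in Python on bit strings. `add_binary` is a name chosen for this illustration, and the sketch assumes arbitrary precision, so a final carry simply becomes a new leftmost bit.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings bit by bit, carrying to the left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):  # start from the right
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))  # bit written in this column
        carry = total // 2             # bit carried to the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("0110", "0011"))  # 1001
print(add_binary("0011", "0011"))  # 0110
```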
Binary multiplication
Same algorithm as decimal (only easier)
To multiply two binary numbers A and B:
1. For each bit b in B:
   • Multiply b × A, aligning the result with b
     (since b is 0 or 1, each step yields 0 or A!)
2. Sum the results
Example: 1101 × 1101
1. Partial products, one per bit of B from right to left:
        1101
      × 1101
      ------
        1101
       0
      1101
     1101
2. Sum (padding each partial product with 0s on the right):
         1101
        00000
       110100
    + 1101000
    ---------
     10101001
Often easiest to add results two at a time
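The two steps (shifted partial products, then a sum) can be sketched in Python. The function name is hypothetical, and the shift is done with `<<` on integers instead of string padding.

```python
def multiply_binary(a: str, b: str) -> str:
    """Multiply two binary strings by summing shifted copies of a."""
    a_value = int(a, 2)
    total = 0
    # For each bit of b, from the right: the partial product is either
    # 0 or a shifted left to line up with that bit.
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            total += a_value << shift
    return bin(total)[2:]

print(multiply_binary("1101", "1101"))  # 10101001
```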
Special case: multiplying by a power of 2
Super easy, just like multiplying by powers of 10 in decimal
To multiply a binary number by 2^p:
• Add p 0s on the right
Examples
• 100 × 1101 = 110100
• 1010 × 1000 = 1010000
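In code, appending p zeros is a left shift by p. A small Python sketch (the helper name is invented for the example) shows that the string view and the shift agree:

```python
def times_power_of_two(bits: str, p: int) -> str:
    """Multiply a binary string by 2**p by appending p zeros."""
    return bits + "0" * p

# Appending zeros to the string matches shifting the integer left.
print(times_power_of_two("1101", 2))  # 110100
print(bin(0b1101 << 2)[2:])           # 110100
```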
Hexadecimal number system (base 16)
Very useful for representing binary data concisely!
• 16 digits: 0–9, A, B, C, D, E, F
• each position corresponds to a power of 16
• usually prefixed with 0x
Each hex digit corresponds to 4 bits
0 0000    4 0100    8 1000    C 1100
1 0001    5 0101    9 1001    D 1101
2 0010    6 0110    A 1010    E 1110
3 0011    7 0111    B 1011    F 1111
One byte = 2 hex digits
Converting hexadecimal ⇔ binary
Each hex digit corresponds to 4 bits
Examples
• 0xA4F7 = 1010 0100 1111 0111
• 0x0B60 = 0000 1011 0110 0000
We will be doing this a lot this quarter. :)
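Because each hex digit corresponds to exactly 4 bits, both directions can be done one digit at a time. The Python sketch below uses two illustrative helper names and assumes the binary input is already written in space-separated groups of 4 bits.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Expand each hex digit into its 4-bit group."""
    return " ".join(format(int(d, 16), "04b") for d in hex_digits)

def binary_to_hex(bits: str) -> str:
    """Collapse space-separated 4-bit groups into hex digits."""
    return "".join(format(int(group, 2), "X") for group in bits.split())

print(hex_to_binary("A4F7"))                 # 1010 0100 1111 0111
print(binary_to_hex("0000 1011 0110 0000"))  # 0B60
```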
Converting hexadecimal ⇔ decimal
Two strategies:
• Convert directly
• Convert hexadecimal ⇔ binary ⇔ decimal
Example: 0xB6A4 (direct conversion)
Position:  ...  16^4    16^3   16^2  16^1  16^0
Value:     ...  65,536  4,096  256   16    1
Digit:          (0)     B      6     A     4
11 · 4096 + 6 · 256 + 10 · 16 + 4 · 1
= 45056 + 1536 + 160 + 4 = 46,756
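The direct conversion is the same positional sum as in decimal and binary, only with powers of 16. A minimal Python sketch, with a function name invented for the example:

```python
def hex_to_decimal(hex_digits: str) -> int:
    """Sum each hex digit multiplied by its position value."""
    total = 0
    for position, digit in enumerate(reversed(hex_digits)):
        total += int(digit, 16) * 16 ** position
    return total

print(hex_to_decimal("B6A4"))  # 46756
```

The built-in `int("B6A4", 16)` returns the same value.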
Representation in other bases
In general, we can represent numbers in any base
Some other significant bases:
• Base 8 (octal)
  • each octal digit is equivalent to three bits
    (000 = 0_8, 001 = 1_8, 010 = 2_8, ..., 111 = 7_8)
  • useful in old architectures with 12-, 24-, or 36-bit words
  • supported in C and many assembly languages
    (071 = 71_8 = 57_10)
• Base 64 (digits drawn from 0–9, A–Z, a–z, +, /)
  • each base-64 digit is equivalent to six bits
  • used in MIME to transmit binary data in plain ASCII text
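Both bases can be tried out in Python as a rough illustration. Python writes octal literals as `0o71` where C writes `071`, and the standard `base64` module orders its 64 digits as A–Z, a–z, 0–9, +, /. The three sample bytes are arbitrary.

```python
import base64

# Octal: 7 * 8 + 1 = 57.
octal_value = 0o71
print(octal_value, int("71", 8))  # 57 57

# Base 64: three bytes (24 bits) become four six-bit digits.
data = bytes([0xF6, 0x00, 0x2A])
encoded = base64.b64encode(data)
print(encoded)                            # b'9gAq'
print(base64.b64decode(encoded) == data)  # True
```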
In class exercises
Add in binary:
• 100 1100 + 1110 1111
Multiply in binary:
• 1011 × 101
Add in hexadecimal:
• 0x28 + 0x4A
Arbitrary vs. fixed precision
So far, we have been assuming arbitrary precision
• to represent a bigger number, just add more bits/digits!
In practice, integers have a fixed size
• commonly 32 or 64 bits
• based on register size of the architecture
This is significant for two reasons:
• risk of overflow
• representation of negative numbers
Representing negative numbers
Must first specify the fixed size of the integer!
• With n bits, we can represent 2^n different values
• Idea: split space so half the values represent negatives
Sign and magnitude representation
• First bit represents the sign (0 positive, 1 negative)
• Rest of bits represent the magnitude, that is |x|
Suppose 4-bit integers
• Examples: −1 = 1001, −4 = 1100, −7 = 1111
This is exactly the representation you’re used to in decimal!
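Sign and magnitude encoding is easy to sketch in Python: one sign bit followed by the magnitude in the remaining n − 1 bits. The function name and the default width of 4 are choices made for this example.

```python
def to_sign_magnitude(x: int, n: int = 4) -> str:
    """Encode x as a sign bit followed by an (n - 1)-bit magnitude."""
    magnitude = abs(x)
    if magnitude >= 2 ** (n - 1):
        raise ValueError("magnitude does not fit in n - 1 bits")
    sign = "1" if x < 0 else "0"
    return sign + format(magnitude, f"0{n - 1}b")

print(to_sign_magnitude(-1), to_sign_magnitude(-4), to_sign_magnitude(-7))
# 1001 1100 1111
```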
Problems with sign and magnitude representation
This turns out to not be a very good representation . . . why?
Issue 1: Multiple zeros
• Both 0000 and 1000 represent the same value
• This is strange and requires extra effort
Issue 2: Complicated arithmetic
Simple binary addition doesn’t work
  2 + 3 = 5:      0 010 + 0 011 = 0 101  ✓
  −2 + −3 = −5:   1 010 + 1 011 = 1 101  ✓
  2 + −3 = −1:    0 010 + 1 011 → 1 001  ✗  (adding the bits gives 1101, which reads as −5)
One’s complement representation
One’s complement
To represent a negative number x:
1. start with the fixed-size binary representation of |x|
2. invert every bit
Features:
• Binary addition is simple (wrap-around carry)
• Still two zeros (all 0s and all 1s)
Examples
• −2  1. 0010  2. 1101
• −3  1. 0011  2. 1100
• −5  1. 0101  2. 1010
One’s complement addition
Overflow carries “wrap around” (added on the right)
Example: −2 + −3 = −5
      1101      (−2)
    + 1100      (−3)
    ------
    1 1001      carry out of the leftmost bit
    +    1      wrap the carry around
    ------
      1010      (−5)
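The wrap-around carry can be sketched in Python on n-bit patterns stored as integers. Both helper names are invented for this illustration, and the default width of 4 matches the slide's examples.

```python
def ones_complement(x: int, n: int = 4) -> int:
    """Return the n-bit one's complement pattern for x."""
    mask = (1 << n) - 1
    return x if x >= 0 else abs(x) ^ mask  # invert every bit of |x|

def add_ones_complement(a: int, b: int, n: int = 4) -> int:
    """Add two n-bit patterns, wrapping a carry out back to the right."""
    mask = (1 << n) - 1
    total = a + b
    if total > mask:                 # there was a carry out
        total = (total & mask) + 1   # add it back on the right
    return total & mask

result = add_ones_complement(ones_complement(-2), ones_complement(-3))
print(format(result, "04b"))  # 1010
```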
Two’s complement representation
Two’s complement
To represent a negative number x:
1. start with the fixed-size binary representation of |x|
2. invert every bit
3. add 1 to the result
Suppose 4-bit integers:
Examples
• −1  1. 0001  2. 1110  3. 1111
• −4  1. 0100  2. 1011  3. 1100
• −7  1. 0111  2. 1000  3. 1001
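The three steps translate almost literally into Python. This is a sketch with a hypothetical function name; the range check reflects what an n-bit two's complement integer can hold.

```python
def twos_complement(x: int, n: int = 4) -> str:
    """Encode x as an n-bit two's complement bit string."""
    if not -(2 ** (n - 1)) <= x < 2 ** (n - 1):
        raise ValueError("x does not fit in n bits")
    if x >= 0:
        return format(x, f"0{n}b")
    mask = (1 << n) - 1
    inverted = abs(x) ^ mask                         # steps 1 and 2: |x| with every bit inverted
    return format((inverted + 1) & mask, f"0{n}b")   # step 3: add 1

print(twos_complement(-1), twos_complement(-4), twos_complement(-7))
# 1111 1100 1001
```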
Features of two’s complement representation
Range of expressible values with n bits:
• max: 2^(n−1) − 1 (0 followed by all 1s)
• min: −2^(n−1) (1 followed by all 0s)
Fixes the issues with sign and magnitude:
• Only one zero! (all 0s)
• Binary arithmetic “just works” (discard carry out)
Examples:
  2 + 3 = 5:      0010 + 0011 = 0101  ✓
  −2 + −3 = −5:   1110 + 1101 = 1011  ✓  (carry out discarded)
  2 + −3 = −1:    0010 + 1101 = 1111  ✓
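The "just works" claim can be checked with a short Python sketch: add the raw bit patterns, keep only the low n bits, and decode the result. The two function names are made up for the example.

```python
def add_twos_complement(a: str, b: str) -> str:
    """Add two equal-width bit strings, discarding any carry out."""
    n = len(a)
    total = (int(a, 2) + int(b, 2)) & ((1 << n) - 1)  # keep the low n bits
    return format(total, f"0{n}b")

def decode_twos_complement(bits: str) -> int:
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value

for a, b in [("0010", "0011"), ("1110", "1101"), ("0010", "1101")]:
    total = add_twos_complement(a, b)
    print(a, "+", b, "=", total, "->", decode_twos_complement(total))
```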
Sign extension
Change the size of an integer without changing its value
• if positive (left-most bit 0), pad left with 0s
• if negative (left-most bit 1), pad left with 1s
Works with both one’s and two’s complement representation
Example: Extending from 8 bits to 16 bits
• 1001 0110 ⇒ 1111 1111 1001 0110
• 0001 0011 ⇒ 0000 0000 0001 0011
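On bit strings, sign extension amounts to repeating the leftmost bit. A minimal Python sketch (hypothetical function name, spaces in the input are ignored):

```python
def sign_extend(bits: str, new_width: int) -> str:
    """Pad on the left with copies of the sign bit."""
    digits = bits.replace(" ", "")
    if new_width < len(digits):
        raise ValueError("new width is smaller than the current width")
    return digits[0] * (new_width - len(digits)) + digits

print(sign_extend("1001 0110", 16))  # 1111111110010110
print(sign_extend("0001 0011", 16))  # 0000000000010011
```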
Carry out vs. overflow
Carry out: carry after the most significant bit ⇒ discard, no error
Overflow: result is out of representable range ⇒ error!
Carry out ≠ overflow!
Carry out is a normal part of signed integer addition
Will get a carry out when adding:
• two negative numbers
• a negative and a positive, when the result is positive or zero
Just ignore it!
Two’s complement overflow detection
Overflow: result is out of representable range⇒ error!
When adding . . .
• two numbers with different signs
  • overflow can never occur!
• two numbers with the same sign
  • overflow occurs if the sign changes
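The sign rule can be sketched as a check on the leftmost bits. The function below is an illustration with an invented name; it works on equal-width two's complement bit strings.

```python
def add_with_overflow_check(a: str, b: str) -> tuple[str, bool]:
    """Add two equal-width two's complement strings and flag overflow."""
    n = len(a)
    result = format((int(a, 2) + int(b, 2)) & ((1 << n) - 1), f"0{n}b")
    # Overflow: the operands share a sign but the result's sign differs.
    overflow = a[0] == b[0] and result[0] != a[0]
    return result, overflow

print(add_with_overflow_check("0111", "0001"))  # ('1000', True)   7 + 1
print(add_with_overflow_check("1110", "1101"))  # ('1011', False)  -2 + -3
print(add_with_overflow_check("0010", "1101"))  # ('1111', False)  2 + -3
```

The second case has a carry out but no overflow, which is the distinction drawn between the two terms.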
Trade-offs between representations of negatives
In modern architectures, two’s complement is used
• Simple arithmetic operations
• Only one zero
• Hard to read
Unsigned vs. signed integers
Can interpret the same n-bit data as either unsigned or signed
Unsigned integer
• Interpret as a non-negative number
• Range: 0 to 2^n − 1
Signed integer
• Interpret as two’s complement
• Range: −2^(n−1) to 2^(n−1) − 1
Only different when the leftmost bit is a 1!
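As a quick check in Python: with n bits, unsigned values run from 0 to 2^n − 1 and signed values from −2^(n−1) to 2^(n−1) − 1, and the two readings agree whenever the leftmost bit is 0. The helper names are assumptions for this sketch.

```python
def unsigned_range(n: int) -> tuple[int, int]:
    return 0, 2 ** n - 1

def signed_range(n: int) -> tuple[int, int]:
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1

def as_signed(value: int, n: int) -> int:
    """Reinterpret an unsigned n-bit value as two's complement."""
    return value - 2 ** n if value >= 2 ** (n - 1) else value

print(unsigned_range(8), signed_range(8))    # (0, 255) (-128, 127)
print(as_signed(100, 8), as_signed(246, 8))  # 100 -10
```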
Big endian vs. little endian
Order of the addressable components in a larger data type
• Usually, the order of bytes within a word
Big endian: bytes ordered from most significant (left) to least (right)
• Example: 256 as a 16-bit halfword: 0x0100
Little endian: bytes ordered from least significant (left) to most (right)
• Example: 256 as a 16-bit halfword: 0x0001
Big-endian is what we’ve been assuming so far!
Endian conversion
Converting from big endian to little endian:
1. Separate the data into addressable components (bytes)
2. Write the components (not the bits!) in reverse order
Examples
• 0x12345678  1. 12 34 56 78  2. 0x78563412
• 0xE5AD5CCA  1. E5 AD 5C CA  2. 0xCA5CADE5
Same algorithm for converting from little to big!
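The byte-reversal steps can be sketched in Python with `bytes.fromhex`. The function name is made up for the example, and the input is assumed to be an even number of hex digits without the `0x` prefix.

```python
def swap_endianness(hex_digits: str) -> str:
    """Reverse the order of the bytes (not the bits) in a hex string."""
    raw = bytes.fromhex(hex_digits)  # step 1: split into bytes
    return raw[::-1].hex().upper()   # step 2: write them in reverse order

print(swap_endianness("12345678"))  # 78563412
print(swap_endianness("E5AD5CCA"))  # CA5CADE5

# The same value stored in each byte order:
print((256).to_bytes(2, "big").hex(), (256).to_bytes(2, "little").hex())  # 0100 0001
```

Applying the function twice returns the original string, which matches the remark that the same algorithm converts in both directions.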
Which architectures are what endian?
Little-endian:
• x86, Atmel
• MIPS (MARS simulator)
Big-endian:
• Motorola 6800 and 68k
Bi-endian (configurable to be big or little):
• ARM, SPARC, PowerPC
• MIPS (specification)
http://en.wikipedia.org/wiki/Endianness
In class exercises
Assume 8-bit integers, addressable in 2-bit chunks
For each of the following numbers:
1. write in two’s complement binary form
2. convert to little endian
Numbers:
• −50
• −100