the design of c: a rational reconstruction

The Design of C:

A Rational Reconstruction

•1

Goals of this Lecture

• Help you learn about: – The decisions that were available to the designers of C – The decisions that were made by the designers of C … and thereby… – C !

• Why? – Learning the design rationale of the C language provides a

richer understanding of C itself • … and might be more interesting than simply learning the language

itself !!!

– A power programmer knows both the programming language and its design rationale

• But first a preliminary topic…

•2

Preliminary Topic

Number Systems

•3

Why Bits (Binary Digits)?

• Computers are built using digital circuits – Inputs and outputs can have only two values – True (high voltage) or false (low voltage) – Represented as 1 and 0

• Can represent many kinds of information – Boolean (true or false) – Numbers (23, 79, …) – Characters („a‟, „z‟, …) – Pixels, sounds – Internet addresses

• Can manipulate in many ways – Read and write – Logical operations – Arithmetic

•4

Base 10 and Base 2

• Decimal (base 10) – Each digit represents a power of 10

– 4173 = 4 x 103 + 1 x 102 + 7 x 101 + 3 x 100

• Binary (base 2) – Each bit represents a power of 2

– 10110 = 1 x 24 + 0 x 23 + 1 x 22 + 1 x 21 + 0 x 20 = 22

•5

Decimal to binary conversion:

Divide repeatedly by 2 and keep remainders 12/2 = 6 R = 0

6/2 = 3 R = 0

3/2 = 1 R = 1

1/2 = 0 R = 1

Result = 1100

Writing Bits is Tedious for People • Octal (base 8)

– Digits 0, 1, …, 7

• Hexadecimal (base 16) – Digits 0, 1, …, 9, A, B, C, D, E, F

•6

0000 = 0 1000 = 8

0001 = 1 1001 = 9

0010 = 2 1010 = A

0011 = 3 1011 = B

0100 = 4 1100 = C

0101 = 5 1101 = D

0110 = 6 1110 = E

0111 = 7 1111 = F

Thus the 16-bit binary number

1011 0010 1010 1001

converted to hex is

B2A9

Representing Colors: RGB

• Three primary colors – Red – Green – Blue

• Strength – 8-bit number for each color (e.g., two hex digits) – So, 24 bits to specify a color

• In HTML, e.g. Web page – Red: <span style="color:#FF0000">De-Comment Assignment

Due</span> – Blue: <span style="color:#0000FF">Reading Period</span>

• Same thing in digital cameras – Each pixel is a mixture of red, green, and blue

•7

Finite Representation of Integers

• Fixed number of bits in memory – Usually 8, 16, or 32 bits – (1, 2, or 4 bytes)

• Unsigned integer – No sign bit – Always 0 or a positive number – All arithmetic is modulo 2n

• Examples of unsigned integers – 00000001 1 – 00001111 15 – 00010000 16 – 00100001 33 – 11111111 255

•8

Adding Two Integers

• From right to left, we add each pair of digits

• We write the sum, and add the carry to the next column

•9

1 9 8

+ 2 6 4

Sum

Carry

0 1 1

+ 0 0 1

Sum

Carry

2

1

6

1

4

0

0

1

0

1

1

0

Base 10 Base 2

Binary Sums and Carries

a b Sum a b Carry

0 0 0 0 0 0

0 1 1 0 1 0

1 0 1 1 0 0

1 1 0 1 1 1

•10

XOR (“exclusive OR”)

AND

0100 0101

+ 0110 0111

1010 1100

69

103

172

Modulo Arithmetic

• Consider only numbers in a range – E.g., five-digit car odometer: 0, 1, …, 99999 – E.g., eight-bit numbers 0, 1, …, 255

• Roll-over when you run out of space – E.g., car odometer goes from 99999 to 0, 1, … – E.g., eight-bit number goes from 255 to 0, 1, …

• Adding 2n doesn‟t change the answer – For eight-bit number, n=8 and 2n=256 – E.g., (37 + 256) mod 256 is simply 37

• This can help us do subtraction… – Suppose you want to compute a – b – Note that this equals a + (256 -1 - b) + 1

•11

Ones’ and Two’s Complement

• Ones‟ complement: flip every bit – E.g., b is 01000101 (i.e., 69 in decimal) – One‟s complement is 10111010 – That‟s simply 255-69

• Subtracting from 11111111 is easy (no carry needed!)

• Two‟s complement – Add 1 to the one‟s complement – E.g., (255 – 69) + 1 1011 1011

•12

- 0100 0101 1111 1111

1011 1010

b

one’s complement

Putting it All Together

• Computing “a – b” – Same as “a + 256 – b” – Same as “a + (255 – b) + 1” – Same as “a + onesComplement(b) + 1” – Same as “a + twosComplement(b)”

• Example: 172 – 69 – The original number 69: 0100 0101 – One‟s complement of 69: 1011 1010 – Two‟s complement of 69: 1011 1011 – Add to the number 172: 1010 1100 – The sum comes to: 0110 0111 – Equals: 103 in decimal

•13

1010 1100 + 1011 1011

1 0110 0111

Signed Integers

• Sign-magnitude representation – Use one bit to store the sign

• Zero for positive number • One for negative number

– Examples • E.g., 0010 1100 44 • E.g., 1010 1100 -44

– Hard to do arithmetic this way, so it is rarely used • Complement representation

– Ones‟ complement • Flip every bit • E.g., 1101 0011 -44

– Two‟s complement • Flip every bit, then add 1 • E.g., 1101 0100 -44

•14

Overflow: Running Out of Room

• Adding two large integers together – Sum might be too large to store in the number of

bits available – What happens?

• Unsigned integers – All arithmetic is “modulo” arithmetic – Sum would just wrap around

• Signed integers – Can get nonsense values – Example with 16-bit integers

• Sum: 10000+20000+30000 • Result: -5536

•15

Bitwise Operators: AND and OR • Bitwise AND (&)

– Mod on the cheap! • E.g., 53 % 16 • … is same as 53 & 15;

• Bitwise OR (|)

•16

&

0

1

0 1

0 0

0 1

|

0

1

0 1

0 1

1 1

0 0 1 1 0 1 0 1

0 0 0 0 1 1 1 1

53

& 15

0 0 0 0 0 1 0 1 5

Bitwise Operators: Not and XOR

• Ones‟ complement (~) – Turns 0 to 1, and 1 to 0

– E.g., set last three bits to 0

• x = x & ~7;

• XOR (^) – 0 if both bits are the same

– 1 if the two bits are different

•17

^

0

1

0 1

0 1

1 0

Bitwise Operators: Shift Left/Right • Shift left (<<): Multiply by powers of 2

– Shift some # of bits to the left, filling the blanks with 0

• Shift right (>>): Divide by powers of 2 – Shift some # of bits to the right – For unsigned integer, fill in blanks with 0 – What about signed negative integers?

• Can vary from one machine to another!

•18

0 0 1 1 0 1 0 1 53

1 1 0 1 0 0 0 0 53<<2

0 0 1 1 0 1 0 1 53

0 0 0 0 1 1 0 1 53>>2

Example: Counting the 1’s

• How many 1 bits in a number? – E.g., how many 1 bits in the binary representation of 53?

– Four 1 bits • How to count them?

– Look at one bit at a time – Check if that bit is a 1 – Increment counter

• How to look at one bit at a time? – Look at the last bit: n & 1 – Check if it is a 1: (n & 1) == 1, or simply (n & 1)

•19

0 0 1 1 0 1 0 1

Counting the Number of ‘1’ Bits #include <stdio.h> #include <stdlib.h> int main(void) { unsigned int n; unsigned int count; printf("Number: "); if (scanf("%u", &n) != 1) { fprintf(stderr, "Error: Expect unsigned int.\n"); exit(EXIT_FAILURE); } for (count = 0; n > 0; n >>= 1) count += (n & 1); printf("Number of 1 bits: %u\n", count); return 0; }

•20

Summary

• Computer represents everything in binary – Integers, floating-point numbers, characters,

addresses, … – Pixels, sounds, colors, etc.

• Binary arithmetic through logic operations – Sum (XOR) and Carry (AND) – Two‟s complement for subtraction

• Bitwise operators – AND, OR, NOT, and XOR – Shift left and shift right – Useful for efficient and concise code, though

sometimes cryptic

•21

The Main Event

The Design of C

•22

Goals of C

Designers wanted C to support:

– Systems programming • Development of Unix OS • Development of Unix programming tools

But also:

– Applications programming • Development of financial, scientific, etc. applications

Systems programming was the primary intended use

•23

The Goals of C (cont.)

The designers wanted C to be:

– Low-level • Close to assembly/machine language • Close to hardware

But also: – Portable

• Yield systems software that is easy to port to differing hardware

•24

The Goals of C (cont.)

The designers wanted C to be:

– Easy for people to handle • Easy to understand • Expressive

– High (functionality/sourceCodeSize) ratio

But also: – Easy for computers to handle

• Easy/fast to compile • Yield efficient machine language code

Commonality: – Small/simple

•25

Design Decisions

In light of those goals… – What design decisions did the designers of C have?

– What design decisions did they make?

Consider programming language features, from simple to complex…

•26

Feature 1: Data Types

• Previously in this lecture: – Bits can be combined into bytes – Our interpretation of a collection of bytes gives it meaning

• A signed integer, an unsigned integer, a RGB color, etc.

• A data type is a well-defined interpretation of a collection of bytes

• A high-level programming language should provide primitive data types – Facilitates abstraction – Facilitates manipulation via associated well-defined

operators – Enables compiler to check for mixed types, inappropriate use

of types, etc.

•27

Primitive Data Types

• Issue: What primitive data types should C provide? • Thought process

– C should handle: • Integers • Characters • Character strings • Logical (alias Boolean) data • Floating-point numbers

– C should be small/simple • Decisions

– Provide integer, character, and floating-point data types – Do not provide a character string data type (More on that later) – Do not provide a logical data type (More on that later)

•28

Integer Data Types

• Issue: What integer data types should C provide? • Thought process

– For flexibility, should provide integer data types of various sizes

– For portability at application level, should specify size of each

data type

– For portability at systems level, should define integral data types

in terms of natural word size of computer

– Primary use will be systems programming

•29

Why? Why?

Integer Data Types (cont.)

• Decisions – Provide three integer data types: short, int, and long

– Do not specify sizes; instead:

• int is natural word size

• 2 <= bytes in short <= bytes in int <= bytes in long

• Incidentally, on lab machines using gcc209 – Natural word size: 4 bytes

– short: 2 bytes

– int: 4 bytes

– long: 4 bytes (32bit OS), 8 bytes (64bit OS)

•30

Integer Constants

• Issue: How should C represent integer constants? • Thought process

– People naturally use decimal – Systems programmers often use binary, octal, hexadecimal

• Decisions – Use decimal notation as default – Use "0" prefix to indicate octal notation – Use "0x" prefix to indicate hexadecimal notation – Do not allow binary notation; too verbose, error prone – Use "L" suffix to indicate long constant – Do not use a suffix to indicate short constant; instead must use cast

• Examples – int: 123, -123, 0173, 0x7B – long: 123L, -123L, 0173L, 0x7BL – short: (short)123, (short)-123, (short)0173, (short)0x7B

•31

Was that a good

decision?

Why?

Unsigned Integer Data Types

• Issue: Should C have both signed and unsigned integer data types?

• Thought process – Must represent positive and negative integers

• Signed types are essential

– Unsigned data can be twice as large as signed data • Unsigned data could be useful

– Unsigned data are good for bit-level operations • Bit-level operations are common in systems programming

– Implementing both signed and unsigned data types is complex • Must define behavior when an expression involves both

•32

Unsigned Integer Data Types (cont.)

• Decisions

– Provide unsigned integer types: unsigned short, unsigned int,

and unsigned long

– Conversion rules in mixed-type expressions are complex

• Generally, mixing signed and unsigned converts signed to

unsigned

• See King book Section 7.4 for details

•33

Do you see

any potential

problems? Was providing

unsigned types a

good decision?

Unsigned Integer Constants

• Issue: How should C represent unsigned integer constants? • Thought process

– “L” suffix distinguishes long from int; also could use a suffix to distinguish signed from unsigned

– Octal or hexadecimal probably are used with bit-level operators • Decisions

– Default is signed – Use "U" suffix to indicate unsigned – Integers expressed in octal or hexadecimal automatically are

unsigned • Examples

– unsigned int: 123U, 0173, 0x7B – unsigned long: 123UL, 0173L, 0x7BL – unsigned short: (short)123U, (short)0173, (short)0x7B

•34

There’s More!

To be continued next lecture!

•35

the design of c: a rational reconstruction

Documents