
Upload: dylan-watson

Post on 27-Dec-2015


SG 2: FIT1001 Computer Systems S1 2006 1

Important Notice for Lecturers

• This file is provided as an example only
• Lecturers are expected to modify / enhance slides to suit their teaching style
• Lecturers are expected to cover the topics presented in these slides
• Lecturers can export slides to another format if it suits their teaching style (but must cover the topics indicated in the slides)
• This file should not be used AS PROVIDED – you should modify it to suit your own needs!
• This slide should be deleted from this presentation
• Provided by the FIT1001 SIG

www.monash.edu.au


FIT1001- Computer Systems

Lecture 2

Data Representation and Computer Arithmetic

Lecture 2: Learning Objectives

• Explain the concept of byte ordering in terms of endian representation;
• Explain the various representations for integer, signed, fractional, floating-point and character data;
• Demonstrate conversion methods for integer, signed, fractional and floating-point numbers;
• Demonstrate basic mathematical operations on signed integers;
• Understand the origins and methods of storage used for binary data.

Introduction

• Before examining the structure, design and management of memory
  – Focus on the representation of different types of numerical data
  – The types of information that can be stored
• Units of data storage
  – Bit
  – Byte (8 adjacent bits)
    > 2 alternate states per bit, therefore 2^8 = 256 patterns
  – Computer / programmer can interpret a byte as:
    > A number between 0 and 255
    > A number between -128 and +127
    > A character such as ‘A’, ‘w’, ‘\’, ‘?’

Introduction

  – Nibble (4 adjacent bits)
    > Two nibbles together make a byte
  – Word
    > A contiguous group of bytes (commonly 16 bits)
    > A 16-bit word can hold 2^16 = 65,536 patterns
    > Term generally applied to larger groups
      – Longword (double-word): 2^32 patterns
      – Quadword: 2^64 patterns
      – Octaword: 2^128 patterns
• Memory therefore is a linear array of bits, nibbles, bytes, words, ...
  – Depends on the basic addressable unit selected for construction
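The point that one byte can be read in several ways can be sketched in a few lines of Python. This is only an illustration: the helper names `interpret_byte` and `nibbles` are made up for this example, the signed reading assumes 2's complement (covered later in this lecture), and Latin-1 is an assumed character set chosen because it maps every byte value to a character.

```python
import struct

def interpret_byte(pattern: int):
    """Return one 8-bit pattern read as unsigned, signed and character."""
    raw = bytes([pattern])                 # exactly one byte, value 0..255
    unsigned = raw[0]                      # read as 0 .. 255
    signed = struct.unpack("b", raw)[0]    # read as -128 .. +127 (2's complement)
    char = raw.decode("latin-1")           # read as a character (assumed Latin-1)
    return unsigned, signed, char

def nibbles(pattern: int):
    """Split a byte into its upper and lower 4-bit nibbles."""
    return pattern >> 4, pattern & 0x0F

print(interpret_byte(0x41))   # (65, 65, 'A')
print(interpret_byte(0xFF))   # (255, -1, 'ÿ')
print(nibbles(0x41))          # (4, 1)
```

The bit pattern never changes; only the interpretation chosen by the program does.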

Introduction

• This linear array is used to represent the data structures required
  – Lists, arrays, trees, records etc.
• How do we organise this linear memory?
  – Make memory access one byte wide
    > Lowest common denominator solution
    > Slow
  – A wider path to memory produces better performance
    > More information is transmitted in one access
    > 32 and 64 bits are common arrangements today

Byte Ordering – Endian Representation

Endian Representation

• Microprocessor architectures commonly use two different methods to store the individual bytes of multi-byte data in memory
  – This is referred to as byte ordering
  – Two main methods: big endian and little endian
    > E.g., PowerPC stores a two-byte integer with its most significant byte (MSB) first, followed by its least significant byte (LSB)
    > E.g., Intel x86 processors, used in PCs, store a two-byte integer with the least significant byte (LSB) first, followed by the most significant byte (MSB)
• Big-endian and little-endian
  – Terms derived from Jonathan Swift’s Gulliver’s Travels (1726)

Endian Representation

• Generally the endian format of your computer can be safely ignored
  – However in certain circumstances it becomes critically important
    > E.g., reading data from files that were created on a computer with a different endianness can produce errors
  – The same problem can occur when reading data from a network
• Illustrate endian representation graphically
  – Use the following notation
    > Bits (B0, B1, ...)
    > Characters, 8 bits per byte (C0, C1, ...)
    > Words of 16 bits = 2 bytes (W0, W1, ...)
    > Longwords of 2 words = 4 bytes = 32 bits (L0, L1, ...)

Big-Endian View

• This convention locates:
  – The MSB on the left (of a longword)
  – The LSB on the right (of a longword)
  – English-like, read from left to right
  – The significance decreases with increasing item number (memory address)

[Figure: big-endian layout against memory address values]

Little-Endian View

• This convention still locates:
  – The MSB on the left (of a longword)
  – The LSB on the right (of a longword)
  – But item numbers (memory addresses) run from right to left
  – The significance increases with increasing item number (memory address)

[Figure: little-endian layout against memory address values]

Endian Representation

• An example:
  – Suppose we have a 32-bit (longword) hexadecimal number 12345678₁₆
  – Endianness
    > Does not indicate what the value ends with when stored in memory, but rather which end it begins with
  – We will revisit Endian Representation in Study Guide 5
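As a rough illustration of the example value 12345678₁₆, the sketch below uses Python's built-in `int.to_bytes` to show the byte sequence each convention would place at increasing memory addresses. The variable names are chosen only for this example.

```python
value = 0x12345678

# Byte sequences as they would appear at increasing memory addresses.
big = value.to_bytes(4, byteorder="big")        # MSB stored first
little = value.to_bytes(4, byteorder="little")  # LSB stored first

print(big.hex(" "))      # 12 34 56 78
print(little.hex(" "))   # 78 56 34 12

# Reading big-endian bytes as if they were little-endian gives a wrong value,
# which is the file/network problem described above.
misread = int.from_bytes(big, byteorder="little")
print(hex(misread))      # 0x78563412
```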

Signed Integer Representation

Unsigned Integer Data

• Unsigned integers
  – Positive integers plus 0, i.e., 0, 1, 2, 3, 4, ...
  – Also referred to as the non-negative integers
• The unsigned integers are represented as simple binary numbers
  – E.g., (considering words of 8 bits):
    > 10₁₀ is represented as 00001010₂
    > 249₁₀ is represented as 11111001₂
    > 0₁₀ is represented as 00000000₂
  – 256₁₀ cannot be represented by a word of 8 bits
  – A word of n bits can represent any of the integers 0, 1, 2, ..., (2^n - 1)
    > E.g., word of 16 bits: largest value is 2^16 - 1 = 65,535
    > E.g., longword of 32 bits: largest value is 2^32 - 1 = 4,294,967,295
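The unsigned examples above can be checked with a small sketch. The function name `to_unsigned_bits` is hypothetical; it simply formats a value as an n-bit binary string and rejects values outside 0 to 2^n - 1.

```python
def to_unsigned_bits(value: int, n: int = 8) -> str:
    """Return value as an n-bit unsigned binary string."""
    if not 0 <= value <= 2 ** n - 1:
        raise OverflowError(f"{value} does not fit in {n} unsigned bits")
    return format(value, f"0{n}b")

print(to_unsigned_bits(10))    # 00001010
print(to_unsigned_bits(249))   # 11111001
print(2 ** 16 - 1)             # 65535
print(2 ** 32 - 1)             # 4294967295
```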

Signed Integer Data

• Signed-Magnitude Representation
  – To distinguish between positive and negative integers
  – Sign bit: typically the sign bit is the leftmost bit
  – The sign bit denotes the sign (positive or negative) of the integer
    > If the sign bit is 0, the integer is positive
    > If the sign bit is 1, the integer is negative
  – The magnitude of the integer is given by the remaining bits
  – E.g., (considering words of 8 bits):
    > +37₁₀ is represented as 00100101₂
    > -37₁₀ is represented as 10100101₂
  – A word of n bits can represent any of the integers
    > -(2^(n-1) - 1), ..., -1, 0, 1, ..., (2^(n-1) - 1)
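A minimal sketch of signed-magnitude encoding and decoding follows. The function names are invented for this example, and bit patterns are handled as strings purely for readability.

```python
def to_sign_magnitude(value: int, n: int = 8) -> str:
    """Encode value as an n-bit signed-magnitude string (sign bit first)."""
    limit = 2 ** (n - 1) - 1                 # largest magnitude that fits
    if abs(value) > limit:
        raise OverflowError(f"{value} does not fit in {n} signed-magnitude bits")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{n - 1}b")

def from_sign_magnitude(bits: str) -> int:
    """Decode a signed-magnitude bit string back to an integer."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude

print(to_sign_magnitude(37))            # 00100101
print(to_sign_magnitude(-37))           # 10100101
print(from_sign_magnitude("10100101"))  # -37
```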

Signed Integer Data - Addition

• Computers must be able to perform arithmetic calculations
  – Signed-magnitude arithmetic is done much the same way as humans carry out pencil and paper arithmetic
• Binary addition is as easy as it gets
  – 0 + 0 = 0    0 + 1 = 1    1 + 0 = 1    1 + 1 = 10
  – Rule 1
    > If the signs are the same, add the magnitudes and use the same sign for the result
  – Rule 2
    > If the signs differ, determine which integer has the larger magnitude
    > Sign of result is the same as that of the integer with the larger magnitude
    > Magnitude: subtract the smaller magnitude from the larger one

Signed Integer Data - Addition

• Binary addition is as easy as it gets
  – The simplicity of this system makes it possible for digital circuits to carry out arithmetic operations
  – E.g., 75₁₀ + 46₁₀ = 121₁₀
    >  75₁₀ = 01001011₂
    >  46₁₀ = 00101110₂
    > 121₁₀ = 01111001₂

Signed Integer Data - Addition

  – In the previous example the sum produced a value that fitted neatly into seven bits
    > In some cases this is not true
    > E.g., 107₁₀ + 46₁₀ = 153₁₀, which needs eight magnitude bits
  – The carry from the seventh bit overflows and is discarded
  – The result is thus erroneous: the stored sum is 25₁₀, not 153₁₀
  – What if both signs are negative?
    > E.g., -46₁₀ + -25₁₀ = -71₁₀ (Rule 1: add the magnitudes, keep the negative sign)
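The two addition rules and the discarded carry can be sketched as follows. This is an illustrative model only: `add_sign_magnitude` is a hypothetical name, operands are ordinary Python integers, and the 7-bit magnitude limit is simulated with a mask.

```python
def add_sign_magnitude(a: int, b: int, n: int = 8):
    """Add two integers the signed-magnitude way; return (result, overflow)."""
    mask = 2 ** (n - 1) - 1                  # n-1 magnitude bits
    if (a < 0) == (b < 0):
        # Rule 1: same signs, add magnitudes and keep the sign.
        total = abs(a) + abs(b)
        overflow = total > mask
        magnitude = total & mask             # carry out of the magnitude is discarded
        sign = -1 if a < 0 else 1
    else:
        # Rule 2: signs differ, subtract smaller magnitude from larger.
        larger, smaller = (a, b) if abs(a) >= abs(b) else (b, a)
        magnitude = abs(larger) - abs(smaller)
        overflow = False
        sign = -1 if larger < 0 else 1
    return sign * magnitude, overflow

print(add_sign_magnitude(75, 46))     # (121, False)
print(add_sign_magnitude(107, 46))    # (25, True)
print(add_sign_magnitude(-46, -25))   # (-71, False)
```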

Signed Integer Data - Subtraction

• Binary subtraction is also easy
  – 0 - 0 = 0    1 - 0 = 1
  – 1 - 1 = 0    0 - 1 = 1 (with a borrow of 1)
  – Terms
    > Minuend: in the problem a - b, a is the minuend
    > Subtrahend: in the problem a - b, b is the subtrahend
    > E.g., 99₁₀ - 79₁₀ = 20₁₀

           0112      (borrows)
      0   1100011₂   99₁₀
      0 - 1001111₂   79₁₀
      0   0010100₂   20₁₀

    > E.g., 79₁₀ - 99₁₀ = -20₁₀
      – We can see from the previous example that 1100011₂ (99₁₀) has the larger magnitude
      – The difference of the magnitudes is 0010100₂
      – All we need to do is change the sign so it is negative: 1 0010100₂

Signed Integer Data - Subtraction

  – Subtraction is the same as adding the opposite
    > Equates to negating the value we wish to subtract
    > Can then add the value
      – E.g., 99₁₀ - 79₁₀ = 20₁₀ is the same as -79₁₀ + 99₁₀ = 20₁₀
  – Easier than performing subtraction, especially with binary
    > E.g., add -19₁₀ to +13₁₀
      – = -19₁₀ + +13₁₀
      – = -(19₁₀ - 13₁₀) = -6₁₀
      – Recall Rule 2 for addition: the signs differ, so subtract the smaller magnitude and keep the sign of the larger

            012      (borrows)
      1   0010011₂   -19₁₀
      0 - 0001101₂   +13₁₀
      1   0000110₂    -6₁₀

Complement Systems

• Signed magnitude representation
  – Is easy for people to understand
  – Requires complicated computer hardware
  – Allows two different representations for zero
    > Positive zero and negative zero
      – +0 is represented as 00000000₂
      – -0 is represented as 10000000₂
• For these reasons (among others) computer systems employ complement systems for numeric value representation
  – Used to simplify subtraction, negative numbers and logical operations

Complement Systems

• In base-B arithmetic we can compute two complements
  – B’s complement and (B-1)’s complement
  – Decimal
    > 10’s complement / 9’s complement
  – Binary
    > 2’s complement / 1’s complement
• 10’s complement
  – The 10’s complement of a decimal number N with d digits is 10^d - N (except for 0)
  – E.g., 418930 has 6 digits so d = 6, so it is subtracted from 1000000₁₀
    > 1000000 - 418930 = 581070

Complement Systems

• 9’s complement
  – The 9’s complement of a decimal number N with d digits is (10^d - 1 - N)
  – E.g., for 418930: ((10^6 - 1) - 418930)
    > = 999999 - 418930
    > = 581069 (of course, 1 less than the 10’s complement)
  – A decimal number can be subtracted from another by adding its 9’s complement (its difference from all nines) and adding back the carry
    > E.g., find 222 - 49 using 9’s complement
      – 999 - 49 = 950, so the problem becomes 222 + 950 = 1172
      – Add the carry (the leading 1) back in: 172 + 1 = 173
  – Known as:
    > Diminished radix complement / casting out the 9’s
  – Extended to binary operations for simplification of arithmetic
    > No need to process sign bit separately as in signed magnitude
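The 9's complement subtraction above can be sketched as below. The function names are hypothetical, and the sketch assumes the minuend is larger than the subtrahend so that a carry is always produced.

```python
def nines_complement(n: int, digits: int) -> int:
    """Return the 9's complement of n, treated as a number with `digits` digits."""
    return (10 ** digits - 1) - n

def subtract_with_nines(minuend: int, subtrahend: int, digits: int) -> int:
    """Subtract by adding the 9's complement and adding the carry back in.

    Assumes minuend > subtrahend >= 0, both with at most `digits` digits.
    """
    total = minuend + nines_complement(subtrahend, digits)
    carry, rest = divmod(total, 10 ** digits)   # split off the carry digit
    return rest + carry                         # end-around carry

print(nines_complement(418930, 6))      # 581069
print(subtract_with_nines(222, 49, 3))  # 173
```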

Complement Systems - 1’s Complement

• 1’s complement
  – The 1’s complement of a binary number N with d digits is (2^d - 1 - N)
  – E.g., 010110₂ has 6 digits, so d = 6, so the number is subtracted from 111111₂
    > 1000000₂ - 1₂ - 010110₂ = 111111₂ - 010110₂ = 101001₂
  – Shortcut
    > Reverse all 0’s and 1’s
    > E.g., +37₁₀ = 00100101₂ / -37₁₀ = 11011010₂
  – Positive numbers are represented in signed magnitude form
  – A negative number is represented by taking the 1’s complement of the positive equivalent of the number
  – Therefore a word of n bits can represent any of the integers -(2^(n-1) - 1), ..., -1, 0, 1, ..., (2^(n-1) - 1)

Complement Systems - 1’s Subtraction

• 1’s complement subtraction
  – The carry bit is “carried around” and added to the sum
  – E.g., 48₁₀ - 19₁₀ = 29₁₀
    > Express -19 in 1’s complement: 19₁₀ = 00010011₂, so -19₁₀ = 11101100₂
    > The problem now becomes 48₁₀ + (-19₁₀)

        00110000₂    48₁₀
      + 11101100₂   -19₁₀
      1 00011100₂   (carry out of the high-order bit)
      +         1   (end-around carry)
        00011101₂    29₁₀

  – The “end-around carry” does add some complexity
    > 1’s complement is still simpler to implement than signed magnitude
  – 1’s complement has a disadvantage:
    > Two representations for zero - positive and negative
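A minimal sketch of 1's complement addition with the end-around carry is shown below. The function names are invented for this example, and the operands are raw n-bit patterns held in Python integers.

```python
def ones_complement(value: int, n: int = 8) -> int:
    """Return the n-bit 1's complement of value (all bits reversed)."""
    return (2 ** n - 1) - value

def add_ones_complement(x: int, y: int, n: int = 8) -> int:
    """Add two n-bit 1's complement patterns using the end-around carry."""
    total = x + y
    carry, rest = divmod(total, 2 ** n)   # carry out of the high-order bit
    return rest + carry                   # carry is "carried around" into the sum

minus_19 = ones_complement(19)
print(format(minus_19, "08b"))                           # 11101100
print(format(add_ones_complement(48, minus_19), "08b"))  # 00011101
```

Note that `ones_complement(0)` gives 11111111₂, the negative zero mentioned above.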

Complement Systems - 2’s Complement

• 2’s complement
  – The 2’s complement of a binary number N with d digits is 2^d - N
  – E.g., 010110₂ has 6 digits, so d = 6, so the number is subtracted from 1000000₂
    > 2^6 - 010110₂ = 1000000₂ - 010110₂ = 101010₂
  – Positive numbers are represented in signed magnitude form
  – A negative number is represented by taking the 1’s complement of the positive equivalent of the number and then adding 1 to it
    > E.g., +37₁₀ = 00100101₂
    > -37₁₀ = 11011010₂ + 1₂ = 11011011₂
    > +128 cannot be represented in 8 bits
    > -128 = 01111111₂ + 1₂ = 10000000₂
  – A word of n bits can represent any of the integers:
    > -2^(n-1), ..., -1, 0, 1, ..., (2^(n-1) - 1)

Complement Systems - 2’s Complement

  – Shortcut
    > Working from right to left
      – Leave all trailing zeroes unchanged, keep the first 1, reverse the remaining 0s and 1s
    > E.g., 00100101₂ becomes 11011011₂
• Complement systems are useful because they eliminate the need for special circuitry for subtraction
  – The difference of two values is found by adding the minuend to the complement of the subtrahend
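The definition, the right-to-left shortcut and subtraction by adding the complement can be sketched together. All three function names are hypothetical; bit patterns are held as Python integers, or as strings for the shortcut.

```python
def twos_complement(value: int, n: int = 8) -> int:
    """Return the n-bit 2's complement of value, i.e. 2^n - value."""
    return (2 ** n - value) % 2 ** n

def twos_complement_shortcut(bits: str) -> str:
    """Keep everything from the rightmost 1 onwards, reverse the bits to its left."""
    last_one = bits.rfind("1")
    if last_one == -1:                    # all zeroes: zero is its own complement
        return bits
    flipped = "".join("1" if b == "0" else "0" for b in bits[:last_one])
    return flipped + bits[last_one:]

def subtract(minuend: int, subtrahend: int, n: int = 8) -> int:
    """Subtract by adding the 2's complement and discarding the carry."""
    return (minuend + twos_complement(subtrahend, n)) % 2 ** n

print(format(twos_complement(37), "08b"))    # 11011011
print(twos_complement_shortcut("00100101"))  # 11011011
print(subtract(48, 19))                      # 29
```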

Complement Systems – 2’s Subtraction

• 2’s complement subtraction
  – Add the two binary numbers
  – Discard any carry out of the high-order bit
  – E.g., 48₁₀ - 19₁₀ = 29₁₀
    > Express -19 in 2’s complement
      – 1’s complement of 19₁₀ = 11101100₂
      – Add 1₂
      – 2’s complement = 11101101₂
    > 00110000₂ + 11101101₂ = 1 00011101₂; discarding the carry leaves 00011101₂ = 29₁₀
• When using any finite number of bits to represent a number
  – We run the risk of the result becoming too large to be stored
  – We can’t always prevent overflow, but we can always detect it

Complement Systems – 2’s Subtraction

  – An overflow condition is easy to detect
    > E.g., 107₁₀ + 46₁₀ = 153₁₀
      – 107₁₀ = 01101011₂
      –  46₁₀ = 00101110₂
      –   sum = 10011001₂
    > The nonzero carry from the seventh bit overflows into the sign bit
      – Thus giving the erroneous result: 107 + 46 = -103
    > An overflow occurs in two conditions:
      – Two positive numbers are added and the result is negative
      – Two negative numbers are added and the result is positive
      – In 2’s complement, overflow is not possible when a positive and a negative number are added
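The overflow rule (same-signed operands, differently-signed result) can be sketched as follows. The function name is hypothetical, and the operands are assumed to already lie in the n-bit 2's complement range.

```python
def add_twos_complement(a: int, b: int, n: int = 8):
    """Add two n-bit 2's complement integers; return (stored_result, overflow)."""
    raw = (a + b) & (2 ** n - 1)                 # keep n bits, discard the carry
    # Reinterpret the n-bit pattern as a signed value.
    result = raw - 2 ** n if raw & (1 << (n - 1)) else raw
    # Overflow: operands share a sign but the result has the other sign.
    overflow = ((a >= 0) == (b >= 0)) and ((result >= 0) != (a >= 0))
    return result, overflow

print(add_twos_complement(107, 46))     # (-103, True)
print(add_twos_complement(-100, -100))  # (56, True)
print(add_twos_complement(107, -46))    # (61, False)
```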

Floating Point Representation

• Signed magnitude, one’s complement, and two’s complement representations deal with integer values only
  – Without modification these formats are not useful in scientific or business applications that deal with real number values
  – Real numbers
    > Include rational and irrational numbers
      – Rational: defined as all numbers of the form:
        » (integer numerator) / (non-zero integer denominator)
        » Include integers and repeating decimal fractions
      – Irrational: numbers which cannot be represented as an exact ratio of two integers
        » E.g., √2 = 1.414213562..., π = 3.1415926535...
  – A computer is finite and cannot store an infinite number of anything

Floating Point Representation

• Can perform floating-point calculations using any integer format
  – Early days
    > Mainframes / microcomputers had no floating-point hardware
    > Programming methods emulated floating point
    > Programs were written that made it seem as if floating-point values were being used
  – Today’s computers are equipped with specialized hardware that performs floating-point arithmetic
• Floating-point numbers allow an arbitrary number of decimal places to the right of the decimal point
  – E.g., 0.5 * 0.25 = 0.125

Floating Point Representation

  – These values are often expressed in scientific notation:
    > E.g., 0.125 = 1.25 * 10^-1 / 5,000,000 = 5.0 * 10^6
  – A binary number can be represented by many floating-point forms simply by moving the binary point
    > E.g., 1110000000000₂
      – = 0.111₂ * 2^13
      – = 0.00111₂ * 2^15
      – = 11.1₂ * 2^11
• Numbers written in scientific notation have three components: a sign, a mantissa and an exponent

Floating Point Representation

• Computer representation of a floating-point number consists of three fixed-size fields: sign, exponent and significand
    » Significand is a fancy name for mantissa
  – The one-bit sign field is the sign of the stored value
  – The size of the exponent field determines the range of values that can be represented
  – The size of the significand determines the precision of the representation

Floating Point Representation

  – The significand of a floating-point number is always preceded by an implied binary point
    > Thus the significand always contains a fractional binary value
  – The exponent indicates the power of 2 by which the significand is multiplied
    > E.g., express 32₁₀ in a 14-bit model with a 1-bit sign, a 5-bit exponent and an 8-bit significand
      – 32₁₀ = 100000₂ = 0.1₂ * 2^6

Floating Point Representation

• Problem 1 - No unique representation for each number
  – 32 has several equivalent representations in our simplified model, e.g., 0.1₂ * 2^6 = 0.01₂ * 2^7 = 0.001₂ * 2^8
  – These synonymous representations waste space
  – Can also cause confusion
• Problem 2 - No allowance for negative exponents
  – No way to express 0.5 (2^-1)
  – There is no sign in the exponent field

Floating Point Representation

• To solve problem 1
  – Establish a rule that the mantissa has the form 0.1xxx...
  – Known as the Normalized Form
    > E.g.,
      – The normalized form of 1110000000000₂ = 0.111₂ * 2^13
      – The normalized form of 110.11₂ = 0.11011₂ * 2^3
      – The normalized form of 0.000000101₂ = 0.101₂ * 2^-6
• To solve problem 2
  – Use a biased exponent
  – The bias is a value approximately midway in the range of values expressible by the exponent
  – Subtract the bias from the value in the exponent field to determine its true value

Floating Point Representation

    > E.g., express 32₁₀ in the revised 14-bit model with a 5-bit exponent (bias 16) and an 8-bit significand
      – 32₁₀ = 100000₂ = 0.1₂ * 2^6
      – Biased exponent = 16 + 6 = 22₁₀ = 10110₂
    > E.g., express 0.0625₁₀ in the revised 14-bit model with a 5-bit exponent (bias 16) and an 8-bit significand
      – 0.0625₁₀ = 0.0001₂ (2^-4) = 0.1₂ * 2^-3
      – Biased exponent = 16 + (-3) = 13₁₀ = 01101₂
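The revised 14-bit model can be sketched in Python. This is only an illustration under stated assumptions: `encode_14bit` is a made-up name, the leading 1 of the 0.1xxx significand is stored explicitly (no implied bit), surplus low-order bits are truncated rather than rounded, and zero is not handled because it has no normalized form.

```python
import math

def encode_14bit(value: float) -> str:
    """Encode a nonzero value as 'sign exponent significand' in the 14-bit model."""
    sign = "1" if value < 0 else "0"
    # frexp gives value = mantissa * 2**exponent with 0.5 <= mantissa < 1,
    # which is exactly the normalized 0.1xxx form.
    mantissa, exponent = math.frexp(abs(value))
    biased = exponent + 16                    # bias of 16
    if not 0 <= biased <= 31:
        raise OverflowError("exponent does not fit in 5 bits")
    significand = int(mantissa * 2 ** 8)      # keep 8 bits, truncate the rest
    return f"{sign} {biased:05b} {significand:08b}"

print(encode_14bit(32))       # 0 10110 10000000
print(encode_14bit(0.0625))   # 0 01101 10000000
```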

Floating Point Representation

• As the leading bit in the normalized significand is always a 1
  – The 1 is implied and need not be stored
  – By using an implied 1 the precision of the representation is increased by one bit (a factor of two)
    > So 32₁₀ = 100000₂ = 0.1₂ * 2^6 can now be stored with exponent 10110₂ and significand 00000000₂
    > 0.0625₁₀ = 0.0001₂ (2^-4) = 0.1₂ * 2^-3 can now be stored with exponent 01101₂ and significand 00000000₂
    > Important: Must include the 1 when converting back!

Floating Point Representation

• E.g.,
  – Consider that a representation uses 32 bits with 8 exponent bits
    > So, the number of significand (mantissa) bits is 32 - 8 (exponent bits) - 1 (sign bit) = 23
    > The bias is 2^(8-1) - 1 = 127
  – -85.211 = -1010101.001101100000010000011...₂
            = -0.1010101001101100000010000011...₂ * 2^7
  – Exponent + bias = 7 + 127 = 134 = 10000110₂
  – So, -85.211 is represented as: 1 10000110 01010100110110000001000

Floating Point Representation

• Underflow / Overflow
  – No matter how many bits are used in a floating-point representation the model is finite
    > Will always have values that cannot be represented
  – Overflow
    > Negative: negative values larger in magnitude than what can be represented
    > Positive: positive values greater than what can be represented
  – Underflow
    > Negative: negative values between 0 and the first expressible negative number
    > Positive: positive values between 0 and the first expressible positive number

Floating Point Representation

  – Consider a 32-bit floating-point representation with 8 exponent bits and 23 mantissa bits
    > The maximum possible exponent = 2^7 - 1 = 127
    > The minimum possible exponent = -2^7 = -128
    > The maximum possible mantissa = 1 - 2^-(23+1) = 1 - 2^-24
    > The minimum possible mantissa = 0.5 (0.100...0₂)

Floating Point Representation

• IEEE-754
  – Institute of Electrical and Electronics Engineers (IEEE)
  – 1985: Standards for single and double precision floating point
  – Previously there was no set standard, which caused numerous incompatible representations across various manufacturers
  – IEEE-754 Single Precision Standard
    > 32 bits: 1-bit sign / 8-bit exponent (bias 127) / 23-bit significand
    > Normalisation: significand has the form 1.xxx (not 0.1xxx...)
    > Some bit patterns are used to represent special values
      – E.g., a biased exponent of 255 indicates special values
        » If the significand is zero, the value is ± infinity
        » If the significand is nonzero, the value is NaN, “not a number”
      – E.g., a biased exponent of 0 can indicate a number of special values including positive and negative 0

Floating Point Representation

    > Therefore the exponent range is -126 to +127
    > Range of values that can be represented (approx.):
      – Negative overflow: values below -3.4 * 10^38
      – Negative underflow: values between -1.17 * 10^-38 and 0
      – Positive underflow: values between 0 and 1.17 * 10^-38
      – Positive overflow: values above 3.4 * 10^38

Floating Point Representation

  – E.g.,
    > Consider the same example of -85.211 as an IEEE-754 single precision representation:
      – The bias = 127
      – -85.211 = -1010101.001101100000010000011...₂
                = -1.010101001101100000010000011...₂ * 2^6
      – Exponent + bias = 6 + 127 = 133 = 10000101₂
      – So, -85.211 is represented as:
        1 10000101 01010100110110000001000
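The worked example can be cross-checked with Python's standard `struct` module, which packs a value as an IEEE-754 single when given the `f` format. The helper name `float32_fields` is made up for this sketch.

```python
import struct

def float32_fields(value: float):
    """Return the (sign, exponent, significand) bit strings of value as a float32."""
    # Pack as a big-endian IEEE-754 single, then reread the 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    text = f"{bits:032b}"
    return text[0], text[1:9], text[9:]

sign, exponent, significand = float32_fields(-85.211)
print(sign, exponent, significand)
# 1 10000101 01010100110110000001000
print(int(exponent, 2) - 127)   # 6, the true exponent after removing the bias
```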

Floating Point Representation

  – IEEE-754 Double Precision Standard
    > 64 bits: 1-bit sign / 11-bit exponent (bias 1023) / 52-bit significand
      – The special exponent value for a double precision number is 2047
      – Also a biased exponent of 0 can indicate a number of special values including positive and negative 0
      – Therefore the exponent range is -1022 to +1023
    > Range of values that can be represented (approx.):
      – Negative overflow: values below -1.79 * 10^308
      – Negative underflow: values between -2.22 * 10^-308 and 0
      – Positive underflow: values between 0 and 2.22 * 10^-308
      – Positive overflow: values above 1.79 * 10^308
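These double precision limits can be inspected from Python, on the assumption (true for mainstream platforms) that a Python `float` is an IEEE-754 double. The sketch uses only `sys.float_info` from the standard library.

```python
import math
import sys

largest = sys.float_info.max           # about 1.79e308
smallest_normal = sys.float_info.min   # about 2.22e-308 (smallest normalized value)

print(largest, smallest_normal)
print(sys.float_info.mant_dig)         # 53: 52 stored bits plus the implied 1

print(largest * 2)                     # positive overflow gives inf
print(-largest * 2)                    # negative overflow gives -inf
print(smallest_normal / 2 ** 60)       # far below the representable range: 0.0
```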

Floating Point Representation

• Intel 80x86 Family
    > Supports three formats
      – Short (32 bits, single) / Long (64 bits, double) / Temporary (80 bits, extended)
      – Single and double are converted temporarily into extended
    > Temporary has 80 bits divided into 4 sections:
      – 1-bit sign / 15-bit exponent (bias 16383) / the leading 1₂, which is stored explicitly / 63-bit significand
      – Range of values that can be represented (approx.)
        » 1.7 * 10^-4932 to 1.1 * 10^4932

Floating Point Errors

• No matter how many bits are used in a floating-point representation the model is finite
• The real number system is infinite
  – Floating-point values and their calculations can in some instances give nothing more than an approximation of a real value
• Using a greater number of bits in a representation
  – Can reduce errors but can never totally eliminate them
  – Must be aware of the possible magnitude of error in calculations
• Floating point errors: blatant, subtle or unnoticed
  – Blatant: numeric overflow / underflow can cause programs to crash
  – Subtle: can lead to erroneous results / hard to detect

Floating Point Errors

• E.g., the 14-bit revised model cannot exactly represent the decimal value 128.75₁₀
  – In 0.1xxx format
    > 128.75₁₀ = 10000000.11₂
      – = 0.1000000011₂ * 2^8
      – Biased exponent = 16 + 8 = 24₁₀ = 11000₂
      – Therefore the floating point representation is: 0 11000 00000001
  – In 1.xxx format
    > 128.75₁₀ = 10000000.11₂
      – = 1.000000011₂ * 2^7
      – Biased exponent = 16 + 7 = 23₁₀ = 10111₂
      – Therefore the floating point representation is: 0 10111 00000001

Floating Point Errors

  – In both cases
    > Lost the low-order bit giving a relative error of:
        (128.75 - 128) / 128 ≈ 0.59%
    > If we had a procedure that repetitively added 0.75 to 128.75
      – Would have an error of over 2% after only four iterations
      – Better to iteratively add 0.75 to itself and then add 128.75 to the result

Character Data Representation

• Human understandable characters must be converted to computer understandable bit patterns using some sort of character encoding scheme
  – Computer calculations are not useful until their results can be displayed in a manner that is meaningful to people
  – Need to store the results of calculations
  – Provide a means for data input
• As computers have evolved, character codes have evolved
  – Larger computer memories and storage devices permit richer character codes
• The earliest computer coding systems used six bits

Character Data Representation

• Binary-coded decimal (BCD)
  – BCD encodes each digit of a decimal number as a 4-bit binary value
  – E.g.,
    > 98₁₀ is represented as 1001 1000
    > 3402₁₀ is represented as 0011 0100 0000 0010
  – When stored in an 8-bit byte (one digit per byte)
    > Upper nibble is the zone; the zone of the last digit carries the sign
    > Lower nibble is the digit itself in BCD
    > E.g.,
      – 912 (unsigned) is represented as 1111 1001 1111 0001 1111 0010
      – +912 is represented as 1111 1001 1111 0001 1100 0010
      – -912 is represented as 1111 1001 1111 0001 1101 0010

Character Data Representation

  – Save on space and computation by placing adjacent digits into adjacent nibbles, leaving one nibble for the sign (packed decimal)
  – E.g.,
    > +128 is represented as 0001 0010 1000 1100
    > -45 is represented as 0000 0100 0101 1101
  – IBM used a 6-bit variation of BCD
    > Used by IBM mainframes in the 1950s and 1960s
    > Could not represent lowercase letters
• Extended Binary-Coded Decimal Interchange Code (EBCDIC)
  – 1964: Extended BCD to 8 bits
    > Maintain backward compatibility
    > Provide better information processing capabilities for IBM System/360

Character Data Representation

  – First widely-used computer code that supported
    > Upper and lowercase alphabetic characters
    > Special characters, such as punctuation and control characters
  – EBCDIC and BCD are still in use by IBM mainframes today
  – E.g.,
    > The character a is 1000 0001
    > The character A is 1100 0001
      – Translation from upper to lower case is easy
      – Zone bits help to test validity of input
    > The digit 3 is 1111 0011
  – BCD and EBCDIC were based upon punched card codes
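The EBCDIC codes quoted above can be reproduced with Python's built-in `cp037` codec. Treating code page 037 (one common EBCDIC variant) as representative is an assumption of this sketch, and `ebcdic_bits` is a made-up helper name.

```python
def ebcdic_bits(ch: str) -> str:
    """Return the EBCDIC (code page 037) code of ch as 'zone digit' nibbles."""
    code = ch.encode("cp037")[0]
    return f"{code >> 4:04b} {code & 0x0F:04b}"

for ch in "aA3":
    print(ch, ebcdic_bits(ch))
# a 1000 0001
# A 1100 0001
# 3 1111 0011
```

The codes for a and A differ in a single zone bit, which is why case translation is easy.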

Character Data Representation

• American Standard Code for Information Interchange (ASCII)
  – 7-bit ASCII was chosen as a replacement for 6-bit codes by other companies
  – ASCII was based upon telecommunications (Telex) codes
  – Until recently ASCII was the dominant character code outside the IBM mainframe world
  – Uses 7 bits, giving 2^7 = 128 codes numbered 0 to 127 (2^7 - 1)
    > 32 controls
    > 10 digits
    > 52 letters (upper and lowercase)
    > 32 special characters ($ and @ for example)
  – The 8th bit was designated for parity checking


Character Data Representation

  – As computers became more reliable the need for the parity bit faded
    > Computer manufacturers extended ASCII to provide more characters
    > Some international characters
    > In the range 128 - 255 (2^8 - 1)
    > There are a few varying ‘extended sets’

[Figure: the most popular extended ASCII set]

Character Data Representation

  – ANSI (American National Standards Institute) character codes
    > The most popular character set of many Windows programs
    > Characters 32 through 126 are the displayable characters from ASCII
    > Characters 0 through 31 are not supported by Windows
    > Characters 128 through 255 are different to the extended set on the previous slide

Character Data Representation

• Unicode
  – EBCDIC and ASCII are built around the Latin alphabet
    > Thus, restricted in their ability to represent data in non-Latin alphabets
    > Countries developed their own codes for native languages
  – 16-bit system that can encode the characters of every language in the world (well almost!)
    > The Java programming language, and some operating systems, now use Unicode as their default character code
  – 16 bits = 2^16 = 65,536 characters
  – Downward compatible with ASCII and Latin-1 character sets
  – Unicode codespace is divided into six parts
    > The first part is for Western alphabet codes, including English, Greek, and Russian


Character Data Representation

  – English section of Unicode example
    > The ASCII code for A is 41₁₆
    > The Unicode equivalent of A is:
      – 00 41₁₆
  – Full chart list:
    > http://www.unicode.org/charts/
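The relationship between the ASCII and 16-bit Unicode codes for A can be checked with a short sketch. It uses Python's built-in `ord` and the `ascii` and `utf-16-be` codecs; choosing big-endian UTF-16 as the 16-bit form is an assumption made to match the byte order shown above.

```python
letter = "A"

code_point = ord(letter)
print(f"{code_point:04X}")                   # 0041

print(letter.encode("ascii").hex())          # 41
print(letter.encode("utf-16-be").hex(" "))   # 00 41
print(letter.encode("utf-16-le").hex(" "))   # 41 00 (same code, little-endian byte order)
```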

Next Week

• Study Guide 3
  – Overview of Boolean Algebra and Digital Logic